APPLICATIONS OF ESTIMATION THEORY TO NUMERICAL WEATHER PREDICTION * K. Bube, and E. Isaacson M. Ghil, S. Coho, J. Tavantzis ** Courant Institute of Mathematical Sciences New York University, New York, New York 10012
Abstract Numerical weather prediction (NWP) is an initial-value problem for a system of nonlinear partial differential equations (PDEs)
in which the initial values are known
only incompletely and inaccurately.
Data at initial time
can be supplemented, however, by observations of. the system distributed over a time interval preceding it.
Estimation
theory has been successful in approachinq such problems for models governed by systems of ordinary differential equations and of linear POEs. sequential estimation
We develop methods of
for NWP.
A model exhibiting many features of large-scale atmospheric flow important in NWP is the one governed by the shallow-fluid equations.
We study first the estimation
problem for a linearized version of these equations.
The
vector of observations corresponds to the different atmospheric quantities measured and space-time patterns associated with conventional and satellite-borne
meteorological
observing systems.
filter
A discrete Kalman-Bucy (K-B)
is applied to a finite-difference version of the equations, which simulates the numerical models used in NWP.
AlSO Laboratory for Atmospheric Sciences,
NASA Goddard Space Flight Center, Greenbelt, MD 20771
ll'l rtment of Mathematics, New Jersey Institute of Technology, N w k, N.J. 07102
1~9
The specific character of the equations' dynamics
CONTENTS
gives rise to the necessity of modifying the usual K-B filter. The modification consists in eliminating the high-frequency inertia-gravity waves which would otherwise be generated
l.
Introduction
2.
A
by the insertion of observational data. The modified filtering
Review
of the State-Space Approach to Estimation
2.1 Statistical considerations in estimation: a simple
procedure developed here combines in an optimal way dynamic illustration initialization (i.e., elimination of fast waves) and 2.2 Estimation for stochastic-dynamic systems four-dimensional
(space-time) assimilation of observational 2.3 General remarks on the Kalman-Bucy (K-B) filter
data, two procedures which traditionally have been carried 3. out separately in NWP. filter
Comparisons between the modified
Estimation
for the Shallow-Water Equations
3.1 The equations
and the standard K-B filter have been made.
3.2 Discretization
The matrix of weighting coefficients, or filter, 3.3 The modified K-B filter applied to the observational corrections of state variables
3.4 Observational pattern and choice of parameters
converges rapidly to an asymptotic, constant matrix. using
3.5 Previous work
realistic values of observational noise and system noise, 4.
Resul ts
this convergence has been shown to occur in numerical experi4.1 Estimation for a perfect, noise-free model ments
with the linear system studied; it has also been 4.2 Estimation in the presence of system noise
analyzed theoretically
in a simplified, scalar
case. 4.3 Theoretical analysis of the scalar case
The relatively rapid convergence of the filter in our simula5.
Concluding Remarks
tions leads us to expect that the filter will be efficiently References computable for operational
NWP models and real observation
patterns. Appendix A.
List of Major Symbols
Our program calls for the study of the asymptotic filter's dependence on observation patterns, noise levels, and the system's dynamics. Furthermore', the covariance ma trices of system noise
and observational
noise will be determined
from the data themselves in the process of sequential estimation, rather than be assigned predetermined, heuristic values. Finally, the estimation procedure will be extended to the full, nonlinear shallow-fluid equations.
)40
111
1.
INTRODUCTION One of the main reasons we cannot tell what the weather
will be tomorrow is that we do not know what the weather is oday.
In other words, numerical weather prediction (NWP)
s an initial-value problem
in sufficient quantity and with sufficient
~vailable I
for which initial data are not
curacy. Numerical forecasts are produced now routinely on a
llily basis by a number of weather services in different 'lIotries.
The models used in NWP are discretized versions
he partial differential equations (PDEs) governing large',lIe atmospheric flow. The discretization is performed by
nite differencing, finite element or spectral representa11 ns.
The number of degrees of freedom of the discretized
6 5 Is is typically of the order of 10 -10 .
The spatial
om in of the models is the entire globe or at least an entire ",j·phere. A large number of observations is made by the conven-
11.l' h 1
ground-based
meteorological network, coordinated
World Weather watch (I*/w).
They consist of point
of temperature, humidity, pressure and horizontal
I
,Ily.
These observations, of the order of 10
It duced at the so-called synoptic times,
.MT.
in number,
0000 GMT and
It is customary therefore to choose a synoptic initial time for a numerical forecast.
.••••• v ..
5
Conventional
l! ns are insufficient in number in order to determine
the initial state of the
model atmosphere.
Furthermore,
.... 0
they are very unevenly distributed in space, being
-" h
concentrated over the continents of the Northern Hemisphere,
0
00
c
200
~ e 0
· '" ·" ~
m c
and much sparser over the oceans and over the Southern
~
j
Hemisphere (Fig. l).
> h
Q
> Q
A number of additional observations are
made at the
0 0
00
.0
0
0
u
.'" "" ... ~
so-called subsynoptic times, 0600 GMT and 1800 GMT.
.c
.... .00,
A still
.. ·· u
'"
larger number of observations, exceeding by now that given 0
by the
l~
network, is gathered in an essentially time-
w
"00
~
'" "• ... "... "
· '"· · ·" "· '" " " · " '"'" ·•'" "" ". C
h
0
continuous manner
.0
from polar-orbiting satellites and other
non-conventional measuring platforms and references therein).
(Fleming et al., 1979 a,b,
All these observations
0
.0
...
.
~
~
:;: .... 0
00
0
0
together
00
...
00
0 C
E
form a rather bewildering array by their uneven distribution
C .0
in space and time, as well as by their different error
0
,
N
g
~
2c
m h
characteristics.
E
"
,,~
.c
In order to obtain the best possible estimate of the
0
w
0
Q
00
0
~
I
model state at the initial instant, NWP centers use the
0 0
data available over a time interval preceding that instant. The most common procedure to use such data, called updating,
u
0 0
... ....0'"
0
.00
.'j C
" "... ~ ;: • "" ....m E -" h 0
was suggested by Charney et al.
(1969).
·
The model is provided
the best available data at some preceding instant, e.g., 24 h g
or 48 h earlier,
,
and is integrated forward in time.
.0
0
~
available:
the model values as they become
the model is updated.
reaches the initial instant
When the model integration
for the next scheduled forecast,
its guess of the initial state is blended with the data available at that
instant to produce the desired estimate.
Thus the model itself is used to assimilate the data
144
., 0
· h
,
h
'" • '" ...." '" .C
Additional data replace
E
~
'"
0
available up to the initial instant.
'hls field's outstanding problems and eventually lead to
variations of this
dimensional (4-0), space-time assimilation of
ter practical algorithms for solving them.
IJ
procedure, as well as other procedures for the four-
The purpose of the present article is to apply the
meteorological
-a
observations are reviewed in Bengtsson (1975).
learn as much as possible from this application about
The blending of observations and model forecast values has been made most recently
filter to the linearized shallow-water equations and
h
using in an explicit manner
properties of the filter relevant to operational 4-0 assimilation.
the error structure of the data (Phillips, 1976; Rutherford, 1972).
This linear system already contains
I' rtant features of the equations used in operational
This error structure is determined from past data
models, and our application should be instructive.
and the resulting linear regression coefficients are computed once and for all, their constant values being used all along
We present a brief review of the K-B filter in Section 2.
the assimilation cycle of the model (McPherson et al., 1979;
dynamical model and observing pattern studied are presented
Schlatter et al., 1976).
, ction 3,
A modification of this approach,
which combines more intimately the dynamics of the model
with
111'
the time-continuously changing observation patterns, appears in Ghil et al.
(1979).
It
Clearly, a mathematical framework well suited for the 4-0
along with the modification of the K-B filter
ted by the system's dynamics.
Numerical results
me analytical ones
Section 4.
follow in
E the results, comparison with operational practice nclusions follow in Section 5.
assimilation problem of NWP is the state-space approach of estimation theory.
It deals with the estimation of
stochastic processes which are generated by randomly perturbed differential or difference equations.
This approach was
first formulated for processes governed by linear systems of ordinary differential equations (OOEs: Kalman, 1960; Kalman and Bucy, 1961); its
results are widely known as the Kalman
or the Kalman-Bucy (K-B)
filter.
We expect that applications
of the K-B filter, with suitable modifications, to 4-0 data assimilation will provide additional physical insight into
147 146
A discus-
2.
A REVIEW OF THE STATE-SPACE APPROACH Ta ESTIMATION
E(X
The intent of this section is to familiarize the
- x)
l
=
E(X
2
o ,
- x)
and we say the measurements are unbiased.
reader with the ide as and methods of sequential estimation.
It is natural to
require that the estimate x also be unbiased:
Systematic, more or less rigorous expositions of the theory
E(~ - x)
=
0
are in existence for the interested reader (Bucy and Joseph, 196~;
Curtain
this requirement is equivalent to
and Pritchard, 1978; Davis, 1977; Gelb, 1974;
Jazwinski, 1970).
(2.1b)
Here we shall stay on the purely formaI, so that (2.la) becomes
and hopefully intuitive, level. 2.1
Given a quanti ty
X,
suppose
=
x
Statistical Considerations in Estimation: A Simple Illustration
(2.lc)
tha t two independent measureNext it is assumed that the measurement errors xl-x
ments of this quantity, xl and x
2
' are available. For
instance x could be the temperature in a room and xl and x the readings of two thermometers placed in the room.
and x - x 2
are uncorrelated:
2
In
E (xl - x) (x 2 - x) = 0 ,
the absence of any additional information about x, it is natural to seek an estimate of x, x say, as a linear function of xl and x
2
and that their variances
al
2
2
and a 2 '
'
x = alx + a x . (2.1a) l 2 2 The function itself is called an estimator; the estimate is
are known from previous measurements, viz.
its value.
bration.
We wish to determine al and a7. so that the estimate x will be optimal in sorne sense.
The conditions to achieve
;2 = E(~
Then the variance of the estimation error, - x)2 , is given by
such
~2
errors in the measurements
xl and x
repeat our measurements many times equal the true value x.
2
' i.e.
that if we
then their average would
In terms of the expectation operator
E, this is written as
(2.2)
a
optimality depend on the nature of the measurements. Let us assume first of aIl that there are no systematic
from instrument cali-
Suppose for the moment that in addition to satisfying Eq.
(2.1b), al and a
2
are nonnegative but otherwise arbitrary;
the linear combination in Eq. convex. xl and x
(2.la) is then said to be
Convexity will imply that x always lies between 2
' and furthermore th~t ~2
a
< max
2
2
{a ,a 2 } l
148
14 P
(2.3a)
While property (2.3a) do better.
Notice also that formula (2.4c)
is reassuring, one should be able to
variance irnmediately implies property (2.3b). In particular,
We expect to be able to achieve (2.3b)
otherwise there would be no point in making more than one 2 2 The assumed knowledge of 01 and 02 will in measurement.
when 01
02
=
0, one obtains ;
=
0/12 , which generalizes
= o/IN
2.2
Estimation for Stochastic-Oynamic Systems
for N independent measurements of equal variance.
The purpose of sequential estimation theory for dynamic systems is to extend the simple ideas outlined above to the
requirement.
case in which the quantity x of interest evolves in time
This requirement can now be formulated precisely: x should be a minimum variance estimate, which in our case means that ~2 in Eq. 2
=
to °
fact yield (2.3b); it is accomplished by our optimality
to al and a
for the optimal error
(2.2) should be minimized with respect
' subject to (2.1b).
according to a given (system of ordinary or partial) differential or difference equation(s).
In this case, xl
will represent the state Df the system as determined from
The resultinq optimal
previous observations (measurements), while x
weights are 2 °2
al and a
A2 ° 2" °1 A2 °
2 2 0 + °2 l 2 °1 2 2 °1 + °2
2
2"
observations at the current time. To stress the analogy, let us consider here a system of
,
(2.4b)
randomly perturbed difference equations for the state vector x:
°2
~k+l
where the optimal error variance ° is given by
Notice that al and a
2
-2
° 1,
represents
(2.4a)
A
=
2
-2
+ °2
(2.4c)
.
~
=
'l'~k
+ ~k '
k = 0,1,2, ...
;
(2.5a)
and ç have dimension n, and 'l' is a constant nXn matrix.
(See Appendix A for a list of reçurring syrnbols.) In our
are nonnegative, so that the optimal application,
estimator,
~k
will stand for the meteorological variables
at time k at the grid points of a global atrnospheric predicx
tion model; 'l' stands for the finite-difference operator is, in fact, convex. Moreover, the optimal weights (2.4a,b)
which advances the variables by one time step. The random
satisfy the intuitive requifement that they should reflect our relative confidence in xl and x 2 : if 01 is smaller than ° 2 , for example,
then xl is weighted more
heavily.
151 150
vector sequence {:k:
k
=
uncorrelated with
O,1,2, ••. } is assumed to be a
~
, T
E~k~2
white noise sequence with mean zero and covariance matrix Q,
=
0 .
(2.6d)
The zero-mean assumptions (2.5b, 2.6b) are made for convenience (2.5b,c) onlYl they are not essential for the theory we describe here. The transpose of a vector or matrix by (
and ô
)T
k2
is indicated
is the Kronecker delta,
ô k2
=
a
if k ~ 2
s
The observation vectors and H is a pxn matrix.
~k
have dimension p
This formulation
~
n,
describes in
= l if k = 2. The white noise represents k2 dynamical and physical processes not described by the model
observations at any time k are incomplete; measurements
~, especially smaller-scale phenomena
are made over some areas (continents) and not over others
and ô
not resolved by
(oceans)
the grid.
~O
particular the meteorological situation
1
in which the
at a given location some variables are measured
suppose for the moment that an initial unbiased estimate
and not others, e.g., geostationary satellites de termine
E~O is available from observations at time zero, and
winds but not temperatures.
Finally, sorne functional of
that no further observations are available at later times.
the variables may be observed rather than a variable itself,
In this case the best estimate of ~k ' ~k say, would be the
e.g., polar-orbiting satellites measure radiances, which
mean of x , x -k -k
=
Ex , and is computed according to (2.5), -k
by the recursion ~O
~k+l
=
de pend on vertical temperature profiles.
Thus the entries
of H need not be only zero or one, i.e., H is not necessarily a permutation matrix.
E~O
this estimate can be improved if further observations become
In fact, H, as weIl as
~,
0, and R, need not be
constant in timel they are only taken constant here for available.
simplicity.
In particular, the rank of H, i.e., the effective
dimension of
~k
We have seen in the introduction that the description of ' can change from one time to the next:
the atmospheric state at a given synoptic time from observations the largest number of observations are available at synoptic at that time is entirely inadequate and that we are interested times, with fewer observations provided at subsynoptic times, in using observations at other times as weIl. This situation is modeled by assuming a co~tinuous stream of observations
~k
=
H~k + ~k'
k
=
1,2,3,...
(2.6a)
~k models observational errors and is also assumed to be a
white noise sequence, (2.6b,c)
and even fewer in between, at intermediate times. The linearity assumption that
~
and H are independent of x
is, however, important for the theory we shall use here. This assumptionis essential in guaranteeing the optimality of the filtering algorithm (2.11) below. Extensions to the nonlinear case will be discussed in Sec. 5. Having described our stochastic-dynamical model
(~,H),
Eqs.
(2.5, 2.6), we are now in a position to make
the
which is analogous to (2.1b). f the form
connection with (2.1-2.4). Given an estimate ~k(+) based on all the observations up to and including the time k,
~k+l (+)
the best prediction at time k+l, ~k+l(-)' is simply
=
~k+l (-) + Kk + l (~k+1 - H~k+l (-»);
the analogy with (2.1c) (2.7)
~k+l(-) will be the analogue of xl in Section 2.1. of x 2 is the actual observation
~k+l
The analogue
at time k+l. We
to be:
(a) linear,
we
gain matrix ~k+l
K
It remains for the
k + l to reflect the relative uncertainties in
~k+l(-)'
and
is obvious.
(2.9)
This will be achieved by imposing the
optimality requirement (c), and will yield formulae analogous
wish to combine ~k+l(-) and ~k+l in order to obtain an estimate ~k+l (+), which
Our estimator is therefore
to (2.4).
require
In order to formulate the optimality criterion, we
(b) unbiased, and (c) optimal in
define first the estimation error covariance matrices
some suitable :sense. In Sec. 2.1, we discussed the simple illustrative example of estimating the room temperature x readings xl and x
2
of two thermometers.
from the
In that situation,
r k + l (-) is the counterpart of lJ
the requirements (a) and (b) lead to formulae (2.1).
by one time step to Pk + l (-) P k + l (- ) -_
The optimality condition was expressed in (2.4), which yields the minimum
0
among all linear, unbiased estimators.
For our dynamic system
(~,H),
(2.8a)
according to
'P k (+), T + Q ;
A
E: k (~k - ~k)
(f.
~k+l (+)
A2
(2.10a )
derivation of (2.10a) depends on the fact that
h
requirement (a) leads
to the formula
2
and P + (+) is that of 0 • 1 k l ing Eqs. (2.5) and (2.7), one finds that P (+) is advanced k 0
(2.5c». Eqs.
T
=
(2.6) and (2.9)
f observations ~k+l '
P
0
imply that, in the
k + l (+) is found from
P
presence
k + l (-) by the formula
which is analogous to (2.1a), while requirement (b) leads to n Lhe total absence of observatiOlE at time k+l, we have instead
(2.8b) 11 (+)
= ~k+l(-)
and P k + l (+)
=
P k + l (-).
The estimation error covariance matrices Ilcitly I
lit
contain
all
relevant
he error structure of the current
155 1~4
information estimate.
time k are accumulated in P (+). k
T
trace CP + (+) C k l
J
The statistics of all errors committed up to and including FOrmula (2.10a) shows how
Using Pk + l (+)
this information is advanced to the next time step. For
from Eq.
(2.10b)
and setting the derivative
of J with respect to each element of K +
k l
example, this equation determines how the presumably small
one
find~
equal to zero,
that a unique, absolute minimum is attained at
errors committed over a continent, or other data-rich region, propagate over an ocean, or other data-sparse region. Equation (2.10b)
then determines precisely the extent to which
the estimate is improved by the new observations.
ame for all possible positive definite matrices S.
We have defined the estimation error covariance matrices and considered their changes in time.
This result is valid independently of C, and hence is the
other words, Kk + l above minimizes simultaneously all reason-
We are ready now to
'Jble measures, or norms, of the expected estimation error.
derive the optimal gain matrix by imposing the optimality requirement (c):
In
The formula above gives the so-called Kalman-Bucy
it is required that ~_k+ 1(+) be a minimum
.ptimal gain matrix, or filter.
variance estimate in the sense that
(K-B)
Substituting this into (2.10b)
yields the optimal error covariance matrix
J
be minimized with respect to each element of Kk + sYmmetric, for S
=
positive definite
matrices S.
l
' for all
Assuming the availability of an unbiased initial estimate
In particular,
~O
I, we see from the definition of P + (+) that the k l
trace, or sum of the diagonal elements, of P k + l (+) is to be minimized.
= ~O(+) = E~O
ncl an initial estimation error covariance matrix
The trace of P + (+) is the expected mean-square k l
(m-s) estimation error. We recall that a symmetric matrix S, ST
=
S, is positive
definite if, for any vector x t 0, the scalar product Since every such matrix has a factorization S
=
~
T
S~
>
It
o.
description of the Kalman filtering algorithm is now
'm 1 te:
8k +l
that in general
for k (-)
Pk+l (-)
Kk +1
15
=
0,1,2, ... , one computes in order
CTc, one finds
'!'~k
(+)
(2.lla)
,
'!'Pk(+)'!'T + Q ,
T
T-l P + l (-) H (HP + (-) H + R) , k k 1
157
(2.llb) (2.llc)
Pk+l (+)
(I - Kk+1H)
~k+l (+)
~k+l (-)
Pk+l (-)
H~k+l (-))
+ Kk + l (::k+l -
(2.11d)
are weighted more (less) heavily than the predictions ~k+l (-) .
(2.11e)
2.3
General Remarks on the K-B Filter. Before proceeding with the description of our dynamical and
In the absence of observations at time k+l, EqS.
(2.11c,d,e)
are replaced by
Recall first that Kk + l
0
(2 .llc')
P k + l (+)
Pk+l (-)
(2.11d' )
~k+l(+) = ~k+l (-)
(2 .lle')
(~,H),
a number of theoretical remarks are in order.
~k+l(+) was chosen to be the optimal,
minimum variance, unbiased estimate among all estimators of the form (2.8a).
It is not clear that in computing 8k+l(+)
all past observations ::j , j
=
Actually, the gain matrix sequence {K : k k
1,2,3, ... } may be
precomputed once and for all, i.e., for all realizations of the state and noise
observational model
processes, ~k '
~k'
~k'
Indeed,
(2.11b,c,d) do not
utilized.
= 1,2, ... ,k,
have been fully
It can be shown, however (e.g. Jazwinski, 1970,
Sec. 7.3), that our estimate is in fact the optimum unbiased t'stimate among all estimators which are linear combinations f all the available data
depend on the estimates (2.11a,e). This is a result of the assumed
A. z . .
(2.13)
J-J
linearity of the model
(~,H).
This wider optimality is due to assumptions (2.5c, 2.6c,d)
To complete the analogy with our earlier notice that Eqs.
(2.11c,d)
-1 P k + l (+)
illustra~ive
example,
can be rewritten as
lime. It can sometimes still be achieved
P~~l (-) + HTR-IH
(2.12a)
(+)HTR- l
(2.12b)
P
k+l
hat system errors and observational errors are uncorrelated in without these assumptions
(Jazwinski, 1970, Sec. 7.3, Examples 7.5-7.7). Carrying the timation error covariance matrices along in the computation m.kes i t possible for the filtering algorithm to be sequential,
In our analogy,
(P + (-) ,R,P k + l (+) ,K k + l ) correspond to k l
(Oi,o~,&2,a2); Eqs. (2.12a,b) are analogous to (2.4c) and
r recursive: each observation is discarded
lr
as soon as it is
essed. This sequential nature of the estimation makes the
(2.4b ), respectively. Some intuitively appealing results follow, as in the
Iq rithm conceptually simple,
as well as having great practical
lv, ntages. It is one of the major reasons for the broad simple example.
For instance, Eq.
(2.12a)
implies that plicability of Kalman filtering.
Pk+l (+) ::. P k + l (-)
;
the matrix inequality A ::. B means that C nonnegative
Another important feature of the filtering algorithm (2.11)
=
B - A is
semidefinite, xTCx > 0 for all x. Eq.
implies that if R is small
Ih
(2.12b)
fact that only first-order statistics, i.e" cond-order statistics,
i.e., covariance matrices, of the
(large) then the observations zk+l
159
158
means,
vector processes of interest are involved.
In other words,
it suffices to know these statistics for the system noise ~ , observational noise 1;, and initial error ~O - ~O. will provide the estimate P at all future times.
x
These
The sequential nature of the filter implies in particular that, in the absence of further observations at times k > N, the best prediction is simply given by (2.11a, 2.11e'):
as well as its error covariance k
~k+l
This property is important since in
N, N+l, N+2,
... ,
(2.l4b)
practice it is usually difficult to obtain even this much information about the random errors one wishes to filter: it is well nigh
impossible to obtain more, i.e., to prescribe
higher-order statistics. Moreover, the first-order and secondorder statistics of the error processes can be determined adaptively, i.e., by the filtering algorithm itself (Chin, 1979, and references therein). Gaussian processes
The covariance of this predictor is then given by (2.11b, 2.11d'). This corresponds roughly to what is done in operational practice:
Central Limit Theorem (e.g.
assimilation
Furthermore, the
Parzen, 1960, Sec. 8.5 and 10.4)
large number of random effects is approximately Gaussian, regardless of the distribution of the individual effects. It is reasonable, therefore, to expect our errors, which come from a large number of sources, to be approximately Gaussian. It follows that retention of only first and second-order statistics in the filtering algorithm (2.11) should be rather ~,
EN
is determined by 4-D data
from all observations up to and including the
synoptic time of interest.
Then the forecast model is
without further use of the dcta.
and
~O
sati~factory.
are Gaussian, it is known
.Issimilation
interval
k
= 0,1,2, •.. ,N .
The framework of estimation theory can provide also insight Into the nature of the system noise (~,Q) and ways to its d termination.
It could lead to improvements in modeling
"'d hence forecasting, by helping to pinpoint deterministic ,ornponents of~. h
This, however, is not our purpose here. With
e remarks, we turn to the description
of the dynamic
y tern to which the theory outlined in this section will be
1'1 lied.
(e.g. Jazwinski, 1970, Sec. 5.2) that the best possible nonlinear estimate of x, i.e., one whiyh might depend nonlinearly on all the observations, is still our linear estimate x.
161 16u
We intend therefore to study
nly the assimilation and initialization problem, over an
states, in its various forms, that the superposition of a
~,
an initial state
integrated forward in time from the initial state obtained, are in fact completely determined by
their first and second-order statistics.
Actually, when
(2.l4a)
3.
ESTIMATION FOR THE SHALLOW-WATER EQUATIONS
3.1
The Equations
Ut + UU x + vu y + ~x
-
0
,
(3.2a)
v
+ fu = 0
,
(3.2b)
~t + u~x + V~y + ~(ux+Vy)= 0
,
(3.2c)
t + uv x + VVy +
~y
fv
The shallow-water equations are a simple system whose solutions exhibit some of the properties of large-scale atmospheric flow.
They have certain important characteristics
by linearization around a solution (u = U, v = 0, ~ = ~)
in common with the more complicated, three-dimensional systems
ing
currently used in NWP models.
the perturbation quant~ties, i.e., the deviations from
We shall study here a linear, spatially one-dimensional version of the equations, written in cartesian coordinates for a plane tangent to the Earth at latitude 6 : 0
v t + uv ~t
+
+ fu
x
U~x
o o o
+
~ux-fUv
fU = -~y= const. We assume in this
satisfy-
derivation that
equilibrium values of (u,v,~), do not depend on y. It is advantageous
to work with a constant-coefficient
system, as long as the basic phenomena of interest are (3.1a) (3.1b) (3.1c)
not obscured by this simplification.
Hence f, U and ~
in (3.1) are taken to be constants. The variation with latitude of the CoriOlis parameter f, however, has an important effect on planetary flows. This
The coordinate x points eastward, in the zonal direction, along the circle of latitude
6 = 6
0
' while y points north-
ward, in the meridional direction; u and v are velocity components in the x and y directions, f = 2\1 sin 6 is the 0 Coriolis parameter, with \1 the angular velocity of the Earth. The geopotential h+H ~
~
= gh
measures the deviation of the height
of the free surface from its equilibrium value H , with
= gH, U is a constant zonal mean flow velocity. All quantities
so-called a-effect,
a = f y (6 ) I 0
ffect of bottom topography,
shallow-water
equatio~s
~y I
0, in the tangent-plane
approximation (Pedlosky, 1979, Sec. 3.17 and Ch. 6). The erm (- fUv)
in (3.1c) introduces this effect into the
.olutions of the system considered, without sacrificing the implicity of constant coefficients. All the solutions ~ = (u,v,~) of (3.1) can be expressed a superposition of plane waves:
are independent of y. These equations are
0, is equivalent to the
H(x-c,Q.t) d~rived
e
from the full, nonlinear
on a tangent plane,
"'h re ,Q. is the wave number. For each ,Q., the speed of the
n ividual waves, c,Q.
162
(3.3a)
,is given by the
163
dispersion relation
o .
(3.3b)
modified by the presence of the Coriolis force. Their dispersion and dissipation plays a role in the mechanism
This is a cubic equation f or cJl,(m), m
=
1,2, 3 .
c~
, Which has three real roots,
It turns out that (3.1) has two types of
geostrophic adjustment,
which maintains the atmosphere in
a quasi-geostrophic state, in which the Coriolis force nearly balances the
solutions: slow waves, corresponding to (3.4a)
(1)
of
c~
pressure-gradient force.
little energy at any given time, and
But they carry very appear mostly as
higher frequency oscillations superimposed
on the meteoro-
logically significant ones, i.e., as meteorological noise.
and fast waves, with +
2
(~
2
+f
. t erm (fU) In the absence of the 8-1~ke - v
2
u .
An important problem in NWP is the filtering of the
(3.4b)
)
fast waves in large-scale numerical forecast& in order to
in (3.1c), the last
prevent their spurious growth to amplitudes larger than
term in (3.4a), as well as in (3.4b), would not be present.
hose
. (3.4) are actually exact to first order in The expressions ~n
his filtering is different from that discussed in the previ-
the small
nondimensional parameter U/~.
;ng which correspond to the s 1 ow t rav el ~
(I
U
=
compara ble to that of the mean zonal current,
0(10 m/sec), and they retorogress:
their
relative to the mean flow is westward.
propagation
the
atmosphere.
need
for
it stems from the two time scales of the
extraneous, random noise perturbing the deterministic
J
ion.
In particular, the fast waves can be eliminated
I
reduced at the initial time of the forecast by an
nltialization procedure (Bengtsson, 1975;
Leith, 1980; and
rences therein). Once such an initialization has been
=
0(10
-4
sec
-1
, rCormed, the fast waves will not grow excessively over riods of time comparable to the evolution time of the slow
).
The speed of the fast waves is dominated by the second
v
(Browning et al., 1979; Bube and Ghil, 1980).
term in (3.4b), being O(lO~m/sec). They are called the familiar inertia-gravity waves, since they are gravity waves of shallow-water theory, for which c~
=U
~ ~ ,
165 1 4
The
These slow, retro-
gressing waves are named after Rossby, and they are an important feature of mid-latitude atmospheric dynamics. Their frequency is always smaller than f
in
.( terministic motion itself, rather than from the presence
of planetary waves
on which synoptic weat h er systems are superimposed. Their speed is
section:
0US
The slow waves are the meteorologically important ones,
found
Clearly, the two problems of:
a) determining a solution
The grid
to the forecast equations from a continuous stream of noisy data (4-D data assimilation) and
b)
Xj
rendering this solution
=
,A JUX,
J'
~~
1,2, ... ,M,
as free of fast waves as possible (initialization) are
is introduced, with
related.
state vector ~k of Sec. 2
The connection was stressed and a first step in
the direction of their joint solution taken in Ghil
approximating
t
k
=
kt::.t,
k
~(jt::.x,kt::.t).
0,1,2, ••.
(3.5a)
Then the
corresponds simply to ~~ ' with
(1980) (3.5b)
and Leith (1980), among others. We shall present in the sequel a systematic way of
so that n
=
3M.
combining the two aspects of filtering within our framework.
We used the Richtmyer two-step formulation of the
This will involve a modification of the standard K-B filter
Lax-Wendroff scheme (Richtmyer and Morton, 1967, Sec. 12.7
outlined
and 13.4).
in the previous section.
Before proceeding with
Let A and B denote the matrices of system (3.1),
this modification, we shall discretize the equations. This ~t
corresponds to what is done in operational practice and will bring the problem to the form (2.5a).
A~x + B~
The scheme can then be written as j+l/2
Wj + l / 2 3 •. 2
=
+ t::.t W , 2 -t
-k+1/2
Discretization
k
The discretization chosen for (3.1)
is in terms of finite (3.Ga)
differences.
Finite-difference models are still the most
widely used in NWP.
j
They also facilitate somewhat the assimila-
~k+l
tion of observations made in irregular patterns. Spectral
~~ j
models, on the other hand, have certain advantages with regard
j + (t::.tlwtl k+l/2 t::.t
j+l/2
~k + t::.x A(~k+l/2
to the initialization aspect of our problem. An analysis
j-l/2 t::.t j -1/2 j +1/2 ~k+l/2) + 2' B(~k+1/2 + ~k+l/2) (3.6b)
similar to the present one should be easy to reproduce for spectral, finite-element or hybrid
Eqs.
models. h
166
(3.6)
define the dynamics f of our system for
state vector ~k which is given by (3.5)
167
in terms of ~k '
solutions with fast components which are vanishing or small. The discrete system (3.6) has the same type of slow and fast
Obtaining a best estimate in the presence of such a
plane wave solutions as the continuous system (3.1). Their
constraint corresponds to a constrained optimization problem.
dispersive properties are similar. The Lax-Wendroff (L-W)
We shall study in this subsection an appropriate modification
scheme was used because its numerical dissipation is very
of the standard K-B filter described in Sec. 2. All solutions of (3.6)
useful in our simple model in order to simulate the physical dissipation
can be represented as a super-
position of plane waves, similar to (3.3a). For the purposes
mechanisms active in the atmosphere. Such
mechanisms are also present in more complex NWP models, and
of this discussion it is actually more convenient to think
they are essential for geostrophic adjustment to occur.
of ~k in its physical interpretation ~k ' so that we write
The plane waves of the continuous system (3.1) are better approximated
by those
the L-W scneme (3.6) version.
~(jtox,
ktot)
of the Richtmyer two-step version of than by those of the standard, one-step
[n particular, in the absence of the B-term, the
slow, quasi-geostrophic wave of (3.6) that of (3.1) satisfies u(x,t)
= O.
satisfies
u~
=~
useful check on the departure of solutions
of opposite signs.
as
This turned out to be
For each wave number R-, there is a slow wave w(l) eiR-jtox with -Rspeed c!l), and two fast waves ~R-(2,3) e iR-jtox w~th . ~ speeds cR-(2 ' 3)
a
The decay factors A~m),
IA~m) I
< 1, are
present due to the dissipation in the difference
Wj from geostrophy.
cheme (3.6).
-k
Denote by
R
(for Rossby waves), the span of the real
p rts of all compound n-vectors, n = 3M, of the form
3.3
The Modified K-B Filter In the absence of any constraints on the dynamics, it is
(3.7b)
clear from Sec. 2 that the K-B filter corresponds to the (m)
~R-
solution of an optimization problem: obtaining a minimum variance estimator
for the system (o/,H).
We have seen in
I
r m
= 1,
e
iR-Mtox
as R- ranges over all possible wave numbers.
is the slow wave space. The fast wave space is denoted
Sec. 3.1 that, in the application of interest here, it is
rh1
desirable to select among the solutions of the discrete
Iy G (for gravity waves), and is defined as the span of the
evolution operator 0/ defined by (3.5, 3.6) a special subset
1 parts of all n-vectors of the form (3.7b) for m ~
168
gain ranges over all possible wave numbers.
169
= 2,3,
The spaces Rand G are subspaces of Euclidean n-space En, and they span En,
R ~ G = En.
invariant under the matrix
~,
Furthermore, Rand G are
Having defined the operator IT, we are now ready to describe the modified filter. Let
defined by (3.6), i.e., they
z - H ~_k (-) _k = _k
T)
are invariant subspaces of the system's dynamics.
denote the innovation vector at time k. This vector
Indeed, the eigenvectors of 'I' are precisely the set of Hence any vector x in R
all vectors of the form (3.7b).
be advanced by the unperturbed system (3.6) to a vector in R
~t.
after a time step
evolve to
~y
in G.
(3.9a)
will
~x
Similarly, a vector y in G
will
Notice, however, that Rand G are not
carries the new information contained in the observations at time k; ~k(-) carries all the previously accumulated information, as propagated by the dynamic system. The filtering step (2.11e) is now written
orthogonal to each other.
(3.9b)
Given any n-vector x, there is a unique vector y in R which is closest to x in the sense that II is minimized. projection of
y
T
= IT,
=
~
onto R, and is denoted by y
and satisfies IT
IT~
(y-x) T (y-x)
What is desired is that, for all k, ~k+l(+) lies in R. rt is clear by inspection of Eqs.
(2.11 a,e,e') that, under
This vector y is called the orthogonal
The orthogonal projection operator
IT
¥. - ~1I2 =
2
assumption (3.8), this will be the case provided that,
= ITx.
IT is a symmetric matrix,
or all k, Kk+l~k+l lies in R.
perturbation of the noisy true state ~k+l '
zk+l is a noisy
= IT.
As the observation vector
The orthogonal projection
cf. Eqs. (2.5a, 2.6a), the correction vector Kk+l~k+l does
is found in O(n log n) arithmetic operations by
not, in general, lie in
R.
performing three Fast Fourier Transforms (FFTs) on the What we seek, then, is a modified filter K;+l ' possibly vector
~
(one for each component
~,
~,
~),
then multiplying
by a block diagonal matrix comprised of M 3x3 blocks, then
,!I-pending on ~k+l ' which has the property that K~+l~k+l
--! lie in R.
This filter is found by minimizing trace P + (+), k l before, but subject now to the constraint that the correction
performing three inverse FFTs. We assume in the sequel that
tor lie ~O
(3.8)
IT ~O
in
R.
A Lagrange multiplier method was used
solve this constrained optimization problem. The result is mply that
i.e., that initialization in R already.
has been performed and
~O
lies (3.10a)
These concepts are discussed and similar
notation is used in Leith (1980).
1\1 pendently of ~k+l '
where Kk + l is the
170
171
standard Kalman-Bucy
filter. Eq.
For the modified filter, therefore, we replace
Our physical domain is an interval of the x-axis of
(2.11e) by
length 2L.
~k+1 (+)
(3.l0b)
near 45° N.
It is meant to correspond to a circle of latitude Hence f
= 10-4 sec -l
and L
=
14000 km.
The
actual distribution of land and ocean at this latitude has Since this filter is no longer optimal, we must also been simplified to be 2-periodic, so that in each interval replace (2.11 d) by (2.10 b), in which
K + is replaced by k l
IT Kk + l •
of length L, half
is covered by ocean (Pacific or Atlantic) ,
and half by land (North America or Eurasia).
It is reasonable
In the sequel we shall call the modified algorithm to consider, therefore, only 2-periodic solutions of (3.1), the IT-filter, while the standard algorithm will be called the K-filter.
and consequently our computational domain is of length L. We shall see in Sec. 4 that the IT-filter
produces optimal estimates of the slow waves at the expense
We
consider
the left
half
of
the
computational L-domain
to be covered by land, and the right half to be covered by of estimation errors only slightly larger than the fast-wave ocean, so that our observation matrix is contaminated estimates produced by the K-filter. H =
3.4
Observational Pattern
The mean
and Choice of Parameters.
the
•
"ocean ll
are obs,srved over "land", and none over
This is only meant to serve as an illustra-
handle observations which are arbitrarily distributed in space
o
inertia-gravity waves. The slow waves in the solution
lnd mental L-domain of one continent and one ocean in a • 110
of approximately L/U; cf. hoice of L, U, ~ and t,
(3.4a), the actual time, for is roughly 12 days.
17:1
172
3xl04m2/s2. The value of U
(3.1) which we wish to estimate will travel across the
'r
and time.
=
lmosphere of H ~ 3 km, which gives realistic phase speeds
. 11 y familiar situation. tion of K-B filtering in a meteoro 1 oglca Clearly, the power of this approach lies in its ability to
20 m/s and ~
orresponds to an equivalent depth for a homogeneous
ponding to t h e convent ional meteorological upper-air network: (u,v,~) 'I'
=
typical for mid-tropospheric flow at this latitude;
the study of a "classical" observational pattern, corres-
.. all quantltles
•
flow about which (3.2) is linearized was
ken to have U
In the present article we shall restrict ourselves to
(I 0)
~
For a single wave number!, continuous system (3.1)
initial data for the
which lead only to slow waves
are, to first order in the small parameter
:
(1)
......
PI
rt 0 rt
rt C llJ
a) Erms over land:
::r (1)
rt
,......
(1)
0 P.
:3
(1)
III
1i
~
t
.02 00
.10
..,. ..,.
EXPECTED RMS ERRCJl OVER TDTR!. REGICf'
I
,
I '
2
•
I
5 T1'1:1lll.T51
•
7
e
t
b) Erms over the ocean;'
c) Erms over the entire L-domain'
g-
(1)
1i
:3 0
'
..........
:>:
::r (1)
rt
Ul
llJ
::r
Cl
C
:3
CI1
(1)
rt
:::
(1)
llJ 1i
"Cl "Cl
llJ
ro
5 t1
(1)
:3
0
Cl
(1)
0-
......
....'€ ......
III
8
Cl
::r (1)
Ul
I-'
::l llJ
0
.... rt.... ::l PI
~
P
....0
rt
Cl
(1)
III
0-
c
III
rt
>:
:::
(1)
rt :T
:>
r'
::r
....
~
ro :::
(1)
rt
(1)
0-
ro
:::()
1i (1)
(1)
HI
HI
....P.
ro
..,::r
I-' HI
ro
III
rt
(1)
:3
.... rt
::l
....rt 0
PI
III
PI
(1)
0-
0
::: \Cl
:::(1)
:>
Cl llJ
III
"
0
b)
9
......
III
llJ
c.o
\Cl
(1)
:3
0
0
g
7
I~ ....
T1'1: (OATSl
~~~~ '-""';-----,-u ~~;.
EXPECTED RI'1S ERROR OvER LRND
Components of the actual rms error in the estimation of
RKS ERROR OVER TOTJl. REGION
RMS £RIlOJl OVER TQTAl REGIOtl
an individual realization of our system, with given initial data ~O and observation error Fig. 3a. lI~k -
A
~kll
{5 k :
k
=
a)
1,2, ..• }, appear in
The corresponding components of relative m-s error,
2
Itrace P k ' where lIyll is the length of the vector y
are given in Fig. 3b. and sequence
~k
For a different random choice of
~O
' the same plots appear in Figs. 3c, 3d.
,
It is easy to see that the rms error in an individual realization decreases to the observational noise level in
TlII(I~1
.. ~ '" ,.,
about the same time as its expected value (compare Fig. 3a and 3c with Fig. 2c).
After that time, however, it continues to
fluctuate near zero, rather than decaying monotonically to zero.
Our experimental results (Figs. 3b,d)
spread of individual value is not too large.
show that the
estimation errors about their expected It would be interesting, in general,
to know a priori how large this spread is expected to be, and
I..
b)
::I~'
::~:] \l;;; "
we intend to study this question further. We show in Fig. 4 the actual time histories at a number of grid points for the estimated solution corresponding to the realization in Figs. 3a,b; ~ is shown in Fig. 4a, v in Fig. 4b and ~ in Fig. 4c.
We chose to show a point on the West Coast
of the continent, labeled SF (for San Francisco), one on the East Coast, labeled NY (for New York), and one in the middle of the ocean, labeled HA (for Hawaii)
=
"New York"
by periodicity.
Note that "Tokyo"
I"
I
'K. 3
I
I
J
t
11-';
~QIIT"
•
J
•
•
10
Components of the actual Toot-mean-square (rms) estimation error and the relative mean-square (m-s) estimation error, for two realizations of the experiment whose Erms errors appear io Fig. 2. The two realizations correspond to two different random choices of inl tial eondi ticn ;.0 and observational noise ~k' sampled from the same respective probability distributions, with covariance matrices Po and R, respectively. a) The rms errors for a realization of the run whose Erms errors are given in Fig. 2. The curves have the same labels, and are plotted on the same scale, as in Fig. 2. b) Relative m-s errors for the same realization. The tb.r-ee panels give the ratio of total m-s error over land, ocean, and the entire region, respectively, to the corresponding expected m-s error. .) Same as Fig. 3a, for a different realization. d) Same as Fig. 3b, for the run in Fig. 3c. Notice the difference between Fig. 3a and Fig. 3c: the actual rms estimation error in two realizations of the filtering process can be qui te different. Figs. 3b and 3d Dhow the variation of the relative m-s error: equality of absolute m-s orror and expected m-a error corresponds to a value of 1 in these figures.
18·1
c)
1.0
·8
r-1-,.-----,-----..---,
.-r----r-
-..-----.--
1.0.-- r·-·....,..--"
~-____r__.~~-~____.--~
~
a)
,
·6 ~
i
.8 "
~
.6 -
~
•
•4 ~
.2 -
'---1---'--
r--- ... ·-- r---"'-r-- ,--,--,.'---
lo---r---·l---'-··
f-'r--l
.4 1
.2 ..
O~A~~A
SF
HA :::::sr+ll(
l~\'
n9
"
-.2 -
1
~ •
-.4 -
-.6 -.8 .. -1.0
L--l....--.-I--L-__.
.1-----..
o
~:~ ~'
1 -,.-
2 -,
__ • .J
....... __ ...J..
3
'1
-r
--L_.L _ _ ........
5
--I--.~ _ _ ..1 6 7
--'--',,- -r--
,-
~---.--
~ _ ____L.._~_.L_--------J 8 9 10 r
-,
-~----'-~l
l:!~ ~ ,o~/~~~\ I~Rj
>
.~: ~'\
~:n
-1.0 /~
,. .l----" __ ~ ~
8
9
10
Fig. 4
location). a) u-component of velocitYi
1
1.2
L_1. .
I~.
__
__
.L --1_-.
~
2
3
(lr
UIJ
810w wov
I"lmpQ
h ,It Ii y,
d ,.mld t
I_.J..
3
L
'1
__ 1..--_
•
...1- _ _ 1.
5 TIME WAYS)
7
8
9
10
/ ~r~' ~\\
'9~
/
.L_...l_ .. - . J - __
6
7
1._~.
1__
8
~ I
9
_.
~
,
__
J
10
Same as Fig. 4, only for a run using our modified K-B filter, the IT-filter. The fast waves have been completely eliminated and the estimated solution is slowly varyinr,.
b) v-component of v 1o 1 Y.
wHh' p rlO(l "r ,pproxlm I Iv ,\ay., UH n whi 11 I~. r l . l wQV • 'NI lh , p jiltt or •• I' U l ~ ly (In
,
__
6
~ ~
-..J.-F_.
2
5
'1
cl t, the geopotentlal No lee th
l
~f~
~ ~_~-J~~_~~ ._._I__ ~__~~_.-L-_~ ~_~
l~~ -'
o Time history of the estimated solution without system noise, at three locations, labeled HA (for Hawaii, a mid-ocean location), SF (for San Francisco, a West Coast location), and NY (for New York, on East Coast
_-4 _ _
:
I'
~~
'9,~'
M
-.6: ;::
"'"\
; '~~V Il\" OD $, h) :!. on text. the 1n discussed are .~~ the v
is due to its statio n, SL, being locate d
110.0 ORYS
hieh ). SF a: interior continen tal location K-B f th . statIons an g'~ven These functions are simply c~~:s~~e ~~:o::i; ht and NY (see Fig. 4). ction to the fOrecast field at algorith~'s gain ma~rix K at day l~~ m~:~~ c:r~::~enine panels, 7a-71, give the The _ ma~n observat lon at stat~on S~, sa~. wh n u c) 4> on u d) v on f . i
OfSL·el(ef:;e~a~~:~L:::S8. Intl~encesefu1ncttiodns ec e are
It is positi ve over land,
station s in the middle of a data-d ense region : neighb oring but a small role. receiv e almost equal weight s and advect ion plays
.7S
.25
-.50
7
I I I
.50
-.2S
Fig.
PHI - U
center ed at SL is the
negati ve out becomi ng nearly zero at SF and NY and slight ly The relativ e small size and symmet ry of into the ocean.
1.00
PHI - PHI
They are the only ones
to have the latter proper ties.
I
0.00
-1.00
its y-comp onents were
1.00
.SO
51-
system noise covari ance matrix Q:
I
I Sf I
0.00
1.00
V- u
.75
U - PHI
In fact, the peak for the SF influen ce func-
er, the former tion is slight ly higher than the NY peak. Moreov at SF itself , locate d one grid point West of SF, rather than Both data densit y and advect ion while the NY peak is at NY. hus play a role. give even It makes sense for the point upstrea m of SF to is closer re weight to SF inform ation than SF itself : SF
197
J
1
Our (v-
(cf. Eg.
subspace
-.2
-1.0
is simply 0
2
3
'1
5
6
7
8
la
9
ITK oo
'
form of the IT-filter, which
gave practical¥ the same results as the
n-filter itself.
a~%
l\ ·
~!
/A~~;1! #, ~ 11
/
~
0
-.2 -.4 -.6 -.8 -1.0 -1.2
-1.'1 1.6 1.4 1.2 1.0 .8 .6
....
:x: Q...
0
2
3
'1
5
6
7
8
9
10
~~ ~10/~ 1
.'1 .2 0 -.2
19,
-.4 -.6 -.8 -1.0
IT-filter are
~ 0
2
3
'1
r:-/
Y
A
5
6
7
8
9
10
TIME WAYS) Fig.IQ
Same as Fig. 9, but for a run using the TI-filter, rather than the K-filter. The fast oscillations have been eliminated.
205
, 011
4.3 Theoretical Analysis of the Scalar Case
observation times.
In order to help explain some of the qualitative
Defining S.
J
features of the numerical results in Sec. 4.1 and
,
4.2, j = 0,1,2, ... ,
we perform an asymptotic analysis of the filtering equations for a scalar state x.
The matrices of interest
will now actually be scalars·, and we assume 'I' f H = 1, Q
~
R > 0 , and Po > O.
0,
assumptions on Q, R, and Po they are variances. observation
'I'2r ,
A
are due to the fact that
r-l B
I
'I'2p,
p=o
We assume furthermore that the
is performed only every
~t
tion, and the quantities
0,
The positivity
numerical experiments reported herein r = 24, as
to be the estimation error variance after the jth observa-
= 30 min.
E
time steps; in the Eqs.
we would have
(4.5.a,b) can be converted into a nonlinear difference
equation for Sj:
and observations are taken at
the standard 12 hour synoptic intervals.
S. = g (S. 1) JJ
The filtering algorithm (2.11) yields in this case
(4.6 a)
where (As + BQ) R
2
(4.5a)
'I' P k - l (+) + Q ,
J Pk(-)R/(P k (-) +
1
P
k
R)
From Eq.
To determine the asymptotic behavior of Eq.
ases
o
Q
and Q > O.
In the case of a perfect model,
= 0, the solution of (4.6) may be written explicitly as
(4.5b), one immediately finds that
Aj(A-l)SOR S.
J
1.
and tending to zero as s + +00.
It follows that the root
S+ of (4.8) is approached monotonically by the solutions Eq. 1'1'1
2
of the recursion (4.6a)
(4.7C) states that for stable dynamics, i.e.,
(Isaacson and Keller, 1966, Ch. 3.1,
Fig. 2a),
1, the estimation error variance, and hence the
= S./R, System (3.1) always tend to zero. Jr J itself is conservative, while our difference scheme (3.6)
filter
as
K.
If So is greater than (less than) S+ ' then Sj will is dissipative.
Hence all eigenvalues of the difference decrease (increase) monotonically to S+ . Since K
scheme matrix
'I' have modulus less than or equal to unity.
jr
=
Sj/R,
we have also We are thus in the stable case (4.7c) and it is only to be expected
K,
that our estimation error covariance matrices
Jr
S+/R
+
as j
+
+ 00,
and filter K approach zero in the absence of system noise. k
the convergence being monotone as well.
This is in accordance with our discussion
accordance with the monotone
of Fig. 2
in Sec. 4.1.
in Fig.
We wish to determine now the asymptotic nature of Sj in the case Q > 0, i.e. , in case system noise is present.
6;
in fact P
jr
decrease of trace P
=
jr
(+)
(-) decreases also monotonically.
Furthermore, 5+ is independent of So asymptotic filter Koo
This is in
=
Po ' and hence the
S+/R is independent of Po as well.
The quadratic equation To find an approximate value for S+ ' we now assume that s
=
(4.8)
g (s)
has a positive discriminant, hence it has real roots.
1'1'1
< 1 and r » 1 ,
term in Eq.
(4.8)
so that A
dg/ds
+ It follows
AR2/(AS+BQ+R)
2
> 0 .
=
S
Let S+ denote the unique positive root of (4.8).
Notice that
with g(O)
monoto~e
1. Then the quadratic
QR Q+(1_'I'2)R·
(4.9a)
in general, by analogy with (2.3b) and (4.5d), that S+
2
min {R, Q/(1_'I'2)}
t least approximately. Therefore g(s) is a
'l'2r «
is negligible and we have, approximately,
Its free term is negative, hence the roots are of opposite sign.
=
,
In particular, when the observational
increasing function of s,
> 0, while its derivative is monotone decreasing
208 209
error variance R is small enough, so that
The asymptotic properties of P and K (4.9bl
discussed above
have counterparts in the full, vector-matrix case, under the assumptions of complete observability and complete
then S+ is roughly equal to R, and the size of Q has little ·ontrollability. influence.
These assumptions concern
properties
If, on the other hand, the observational error f the matrices
~
and H; they are satisfied for our model.
variance is large enough so that The interested reader is referred to Bucy and Joseph (4. 9 cl
(1968, Ch. 5) and to Jazwinski (1970, Sec. 7.6) discussion of the general results.
then S+ is roughly proportional to Q, and the size of R has little influence. The two extreme cases (4.9b,o)
explain much of the
qualitative nature of the results in Sec. 4.2. a matrix version of Eq.
(4.9b)
and we see that the expected
Indeed,
is satisfied over land, m-s errors
at observation
times are approximately equal to the observational error variances.
In fact, they are slightly smaller than R ;
this is in accordance with (4.5d), as well as It also agrees with operational experience,
(4.9~,b).
as stated
in Sec. 3.4. Over the ocean, we can write R
=
00
,
so that a matrix
version of (4.9c) is satisfied. Experiments magnitudes of Q (not shown), of Q
confirmed that the size
is indeed the determining factor in the size of P
over the ocean. ence:
hav~
with different
This also agrees with operational experi-
analysis error
in, data-poor regions
is essentially
equal to the error in forecasts from one analysis time (synoptic or subsynoptic)
to the next.
210
211
for a full
5.
CONCLUDING REMARKS We have shown
observational noise covariance R; it does not depend on the
the ways in which the concepts and
formalism of sequential estimation theory
are relevant to
the 4-D assimilation of meteorological data, by applying them to a simple model.
The stochastic-dynamic model
used for the illustration of the theory
initial errors, PO.
The rapid convergence and good performance
of the W-filter for the linear problem hold filtering of nonlinear problems
Nonlinear
estimation theory is not completely understood
the linear shallow-water equations, including the B-effect
m thematically, at least
of latitudinal changes
form.
dynamics NWP models
The
of this model are similar to those of operational in that they admit as solutions slow,
quasi-geostrophic
Rossby waves, as well as fast
with similar dynamic and
tochastic properties.
was governed by
in planetary vorticity.
hope for the
not in a practically applicable
The filter which is most widely used in engineering
pplications is the extended K-B filter (EKF: Bucy and Joseph, 1968, Ch. 8; Gelb, 1974, Ch. 6; Jazwinski, 1970, Ch. 6
nd Ch. 9).
The principle of the EKF is simple:
linearize
!he problem around an estimated state, and apply the
inertia-gravity waves. We have modified the standard Kalman-Bucy (K-B) filter
rresponding linear filter over a time interval, T
say,
in order to obtain optimal estimates of the slow, meteorologically
v r which the solution of the nonlinear problem is not
significant waves, while eliminating
Kpected to change much.
undesirable waves.
entirely the fast,
In this way, our modified K-B filter
'Iual 6 h to 24 h.
In large-scale NWP this time could
After time T, relinearize around the new
achieves simultaneously the optimal 4-D assimilation of data
te and proceed.
Clearly, T is limited by the dynamics
and the initialization of model states for the purpose of
the system in the problem.
When choosing T, a trade-off
Iween accuracy and expediency has to be made.
noise-free forecasts. It was shown that the optimal filter for the linear
The EKF gives good results when the true characteristic
problem converges rapidly to an asymptotic matrix, the
r lation time T
Wiener filter.
white noise, i.e., T
Furthermore, the asymptotic filter (W-filter)
of the perturbations, which we model
= 0,
is actually
short compared to T,
performs nearly as well as the exact, time-varying filter
T.
(K-B filter).
rs from our experienc,e with linear problems that:
~,
The wiener
filter depends on system dynamics
observational pattern H, system noise covariance Q and
This is certainly the case in NWR Moreover, it
the asymptotic filter for each one of the successive rizations will work sufficiently well, and there is no o compute time-varying filters over a time interval T; he dependence of the W-filter on (linear) system mic
~
is rather weak.
212
213
We conclude that the W-filter for the succeeding T-interval
We saw already that taking into account the dynamics
will be easily computable from the W-filter valid over the
llowed us to unify the assimilation and intialization
preceding T-interval by a perturbation procedure.
spects of preparing initial data
It is more realistic in NWP to let the observation H be a function of time, H
matrb~
H(t), rather than a constant.
Different observations are made at the synoptic times, 0000 GMT and 1200 GMT, the subsynoptic times, 0600 GMT and 1800 GMT, and in between.
orecasts.
be a periodic matrix function of time, with period 24 h, and assume that the actual observation matrix, H(t), differs
o data-poor areas, and of "negative information", i.e., I. rge errors,
by the presence, absence or location of relatively
efficients for observations should be skewed in the irection of the prevailing lpstream;
the amount of skewness should depend on average
nd intensity, i.e., h
winds, with larger weights
on season.
Also
The asymptotic filter corresponding to ('I' (t) ,H* (t)), K*(tl say, should also be periodic, with a period of 24 h, rather than constant.
We expect some easily computed modi-
fication of K*(t) to
be a good approximation to the optimal
filter for ('I'(t) ,H(t)).
We plan to study, therefore,
time-periodic observation patterns and their modifications. Sequential estimation accounts explicitly for the fact that the system whose state we wish to estimate is governed It is this aspect of the theory which
distinguishes it from the so-called "optimal interpolation" currently used in operational
NwP.
The latter essentially
assumes that the system obeys trivial dynamics, 'I'
=
I.
weights used on
Western edge of the continents should be different
lom those used on the Eastern edge.
few observations.
by certain dynamics.
from data-poor to data-rich regions.
In particular, the theory shows that weighting
from H*(t) at every t only by a matrix of small rank, i.e., only
Furthermore, it allows us to account in a
ystematic way for the advection of information from data-rich
The distribution of observations,
how,ever, is not too far from being time-per iodic. Let H* (t)
for numerical weather
the anisotropv
and inhomogeneity
'lucture in the zonal direction he results
The present results of estimation error
should also be supplemented
on inhomogeneity in the latitudinal direc-
n of Ghil et al.
(1980).
It is this aspect of
the
• ry which we expect to have the largest impact in terms Improving operational procedures. One further aspect of the theory merits attention: the riance matrices Q and R do not need to be prescribed r ori.
They can be determined in the estimation process
(e.g., Chin, 1979; Ohap and Stubberud, 1976) by using an The determination of system noise would mportant consequences
for predictability theory, as
for stochastic modifications of numerical schemes nd Schemm, 1977).
The determination of observational
l"minating the well known problem of "ground truth",
214
215
would also help greatly in improving operational objective
References
analysis and data assimilation. 'ngtsson, L., 1975:
4-Dimensional Assimilation of
Meteorological Observations, GARP Publications Series,
Acknowledgement.
No. 15, World Meteorological Organization _ The potential usefulness of estimation theory in meteorological applications
was first pointed out to one
of us (M.G.) by Edward S. Sarachik.
International Council of Scientific Unions, CH-1211 Geneva 20, Switzerland, 76 pp.
It is a pleasure to 'rgman, K. H., 1979:
Multivariate analysis of temperatures
acknowledge many useful discussions on linear and nonlinear and winds using optimum interpolation, Mon. Wea. Rev. 107, filtering with George Papanicolaou. the filters'
eventual
Practical aspects of
1423-1444.
implementation in an operational owning, G., A. Kasahara and H.-O. Kreiss, 1979: Initialization
environment
were discussed with Wayman Baker, Mark Cane,
Milton Halem and Eugenia Kalnay-Rivas. The final version of the manuscript benefited A. Hollingsworth,
from comments by G. Cats, R. Daley,
A. Lorenc
and after the Seminar.
and O. Talagrand, made during
The manuscript could not have been
prepared without the whole-hearted support of Con stance Engle. Our work on this problem has been supported by NASA Grants NSG-5034 and NSG-5130, administered by the Laboratory for Atmospheric Sciences, Goddard Space Flight Center. Computations were carried out on the CDC 6600 of the Courant Mathematics
and Computing Laboratory, New York
University, under Contract DE-AC02-76ER03077 U. S. Department of Energy.
with the
of the primitive equations by the bounded derivative method, NCAR Ms. 0501-79-4, National Center for Atmospheric Research, Boulder, Colo. 80307, 44 pp. ,he, K. P., and M. Ghil, 1980: data and the
initiali~dtion
Assimilation of asynoptic problem, this volume.
I'Y, R. S., and P. D. Joseph, 1968:
Filtering for Stochastic
Processes with Applications to Guidance,
Wiley-Interscience,
New York, 195 pp. rney, J., M. Halem and R. Jastrow, 1969:
Use of incomplete
historical data to infer the present state of the abnosphere, J. Atmos. Sci. 26, 1160-1163. n, L., 1979:
Advances in adaptive filtering, in
Control and Dynamic Systems, Vol. 15, C. T. Leondes (ed.), Academic Press, 278-356.
in, R. F., and A. J. Pritchard, 1978:
Infinite Dimensional
Linear Systems Theory, Lecture Notes in Control and Information Sciences, Vol. 8, Springer-Verlag, New York, 97 pp. , M.H.A., 1977: 216
11 1
Linear Estimation and Stochastic Control,
ed Press, John Wiley and Sons, New York, 224 pp.
17
Faller, A. J., and C. E. Schemm, 1977:
Statistical corrections
Halem, M., M. Ghil, R. Atlas, J. Susskind and W. J. Quirk,
to numerical prediction equations 11, Mon. Wea. Rev. 105,
1978:
37-56.
NASA Tech. Memo. 78063, NASA Goddard Space Flight
Fleming, R. J., T. M. Kaneshige and W. E. McGovern, 1979a: The global weather experiment I. The observational phase through the first special observing period, Bull. Amer. Met. Soc. 60, 649-659.
The global weather experiment 11.
The second special
observing period, Bull. Amer. Met. Soc. 60,
Isaacson, E., and H. B. Keller, 1966:
(ed.), 1974:
Applied Optimal Estimation, The M.I.T.
Press, Cambridge, Mass., 374 pp.
Theory,
initialization, and four-dimensional data assimilation, Tellus 32, 198-206.
Stochastic Processes and Filtering
Jones, R. H., 1965a:
Optimal estimation of initial conditions
for numerical prediction, J. Atmos. Sci. 22, 658-663.
J. Appl. Meteor.
An experiment in nonlinear prediction, 4, 701-705. A new approach to linear fi1tering
and prediction problems, Trans. ASME, Ser. D,
Asynoptic variational
method for satellite data assimilation, in Halem et al. (1978), pp. 3.32-3.49.
A stochastic-dynamic model for global atmospheric mass field statistics, NASA Tech. Memo. 82009, NASA Goddard Space Flight Center, Greenbelt, Md. 20771, 50 pp. ___________ , M. Halem and R. Atlas, 1979:
Time-continuous
assimilation of remote-sounding data and its effect on weather forecasting, Mon. Wea. Rev.
107, 140-171.
New results in linear
filtering and prediction theory, Trans. ASME, Ser. D.: J. Basic Eng., 83, 95-108. r,eith,
___________ , R. C. Balgovind and E. Kalnay-Rivas, 1980:
J. Basic Eng.,
g,35-45. ___________ , and R. S. Bucy, 1961:
and R. Mosebach, 1978:
Analysis of Numerical
Academic Press, New York, 376 pp.
Kalman, R. E., 1960:
The compatible balancing approach to
[NTIS N783l6671.
Methods, Wiley, New York, 541 pp.
_ _ _ _ _ , 1965b:
1316-1322.
Ghil, M., 1980:
Center, Greenbelt, Md. 20771, 421 pp.
Jazwinski, A. H., 1970:
___________ , and T. E. Bryan, 1979b:
Gelb, A.
The GISS sounding temperature impact test.
c.
E., 1978:
Objective methods for weather prediction,
Ann. Rev. Fluid Mech., 10, 107-128. _ _ _ _ _ , 1980:
Nonlinear normal mode initialization and
quasi-geostrophic theory, J. Atmos. Sci., 37, 958-968. renz, E. N., 1969: possesses
The predictability of a flow which
many scales of motion, Tellus, 21, 289-307.
Pherson, R. D., K. H. Bergman, R. E. Kistler, G. E. Rasch and D. S. Gordon, 1979: The NMC operational global data assimilation system, Mon. Wea. Rev., 107, 1445-1461.
218 219
Miyakoda, K., and O. Talagrand, 1971:
The
assim~lation
of
_ _ _ _ _ , 1976:
past data in dynamical analysis. I, Tellus, 23, 310-317. Ohap, R. F., and A. R. Stubberud, 1976:
Adaptive minimum
space-time fields, Info. Sci., 10, 217-241. Phillips, N. A., 1971:
variance estimation in discrete-time linear systems, in
system, J. Atmos. Sci., 28, 1325-1328. _ _ _ _ _ , 1976:
Academic Press, 583-624. palmen, E., and C. W. Newton, 1969:
Atmospheric Circulation
pedlosky, J., 1979:
AIDer. Met. Soc., 57, 1225-1240.
Modern Probability Theory and Its ApplicaYork~
Richtrnyer, R. D., and K. W. Morton, 1967:
464 pp.
Verlag, New York, 624 pp.
New York, 405 pp. Rutherford, I. D., 1972: Data
On the concept and implementation of
673-686.
observations, _______ , 1973a:
29, 809-815. Schlatter, T. W., 1975:
Algorithms
for
sequential
and
random
103, 246-257. _______ , G. W. Branstator and L. G. Thiel, 1976:
sequential analysis, J. Appl. Meteor., 12, 437-440. _______ , 1973b:
Testing a global multivariate statistical objective
A comparison of the performance of
analysis scheme with observed data, Mon. Wea. Rev., 104,
quasi-optimal and conventional objective analysis schemes,
J. Appl. Meteor, 12, 1093-1101. _______ , 1973c:
765-783. , 1977:
~
Static and dynamic constraints on the
estimation of space-time covariance and
Some experiments with a multivariate
stiatistical objective analysis scheme, Mon. Wea. Rev.,
Meteorol. Mono., 11, 100-109. Transient suppression in optimal
wavenumber-
frequency spectral fields, J. Atmos. Sci., 30, 1252-1266.
assimilation by statistical
interpolation of forecast error fields, J. Atmos. Sci.,
sequential analysis for linear random fields, Tellus, 20,
_ _ _ _ _ , 1970:
Difference Methods
for Initial-Value Problems, 2nd. ed., Wiley-Interscience,
Geophysical Fluid Dynamics, Springer-
Petersen, D. P., 1968:
The impact of sYnoptic observing and
analysis systems on flow pattern forecasts, Bull.
Systems, Academic Press, New York, 603 pp.
tions, Wiley, New
Ability of the Tadjbakhsh method
to assimilate temperature data in a meteorological
Control and Dynamic Systems, Vol. 12, C. T. Leondes (ed.),
Parzen, E., 1960:
Linear sequential coding of random
Reply
(to Comments by H. J. Thiebaux on "Testing
a global •.•h
),
Mon. Wea. Rev., 105, 1465-1468.
jbakhsh, I. G., 1969: Utilization of time-dependent data in running solution of initial value problems, J. Appl. Meteor., 389-391.
~,
n r, N., 1949: E'xt r apo1ation, Interpolation and Smoothing of
Stationary Time Series with Engineering Applications, Wiley, New York,
220
163 pp.
221
Appendix A. (m)
c~
List of Major' Symbols
p
the dimension of z
for solutions of the continuous system (3.1), the three phase speeds, m
=
1,2,3, corresponding to wave number
as defined by Eq.
(3.3b).
Q
covariance matrix of the system noise
R
slow wave subspace
R
covariance matrix of the observational noise
t
time
~,
The same notation is used for
the phase speeds of solutions of the discrete system (3.6), which agree with the phase speeds of solutions of
temporal increment in the discrete system (3.6)
the continuous system to order (6t)2. E
expectation, or ensemble averaging, operator
f
Coriolis
G
fast wave subspace
g
gravitational acceleration constant of the Earth
H
observation matrix; defines the observed linear
parameter, f
~ 2~ sin 45°
= 10-4 s -1
superscript indicating the spatial grid point, x
K
Kalman gain matrix, defined by (2.11c)
k
subscr ipt indicating the time, t
L
length of the computational
k
=
wave number;
~L/2TI
M
number of grid points in the L-domain
n
number of state variables,
n
=
3M; the
u
perturbation zonal wind component
v
perturbation meridional wind component
w
in the linearization of (3.2)
a solution (u,v,~)T of the continuous system (3.1) ' (uk,vk'~k) j j j a so l ut10n of the discrete system (3.6)
distance along the spatial L-domain
j
domain, about half the
is an integer ,
mean zonal wind component
wkj
k 6t
circumference of the Earth at 45°N
u
initial amplitude of v
combinations of state variables j
number of observations at a synoptic time;
x
spatial increment in the discrete system (3.6) "true" atmospheric state given by the stochastic model (2.5)
k(-) estimated atmospheric state just prior to observations at time k k(+) estimated atmospheric state just after observations at time k vector of observations, given by (2.6) observational noise, or random part of the observation
dimension of x P (-) estimation error covariance matrix just prior to k
model (2.6) system noise, or random part of the atmospheric model (2.5)
observations at time k P (+) estimation error covariance matrix just after k observations at time k
222
orthogonal projection operator onto the slow wave subs pace
R
223
mean geopotential at 45°N
in the linearization of (3.2),
i.e., g times the mean equivalent atmospheric height at 45°N ~
p~rturbation
~O
initial amplitude of
~
matrix defining the atmospheric dynamics, given for our model by Eqs.
geopotential ~
(3.6)
angular rate of rotation of the Earth
224