4. 0a. CONTRACT OR GRANT NO ga. ORIGINATOR'S REPORT NUMBER(S) ..... severe (i.e. they don't have arbitrarily large derivatives), and ... tation is taken principally from the dissertation of Hecht [14]. Chapter ..... 12. pZ ( AJ=i) corresponds to the distribution P(4 412io-1) which corresponds to ...... radian, tests were me.
FRANK J. SElLE* IN11SE1ARC14 --LA**WAT0kY"
-SRL-TR-72-0004
DMY1V
-AIC
ANENGI NEER'S GUIDE TO BUILDING NONLINEAR FILTERS
VOLUME I Richard S ESucy Calvin Hecht Capt Kenneth DSenne
APPROVtO
ROJET 794ODSTRISIM~ON
FOR
PUBLIC
UNLIMITEO.
AIR FORCE SYSTEMS COMMAND UNITED~ STATES AIR rFORCE
RELEA SE;
UNCLASSIFIED Securit
Classification
DOCUMENT CONTROL DATA-
R &D
(Security classification of title, body of abstract and indexing annotationmust be entered when the overall report is claelled• I ORIGINATINC AC I VITY (Corporata author)
I2r.
Frank J. Seiler Research Laboratory (AFSC) USAF Academy, Colorado 80840
REPORT SECURITY CLASSIFICATION
UNCLASSTFIED 2"
GROUP
3. REPORT TITLE
An Engineer's Guide to Building Nonlinear Filters
4 DESCRIPTIVE NOTES (Type of report and Inclusive dates)
Final Project Report (1969-1972) 5. AUTHORIS) (Firet name, middle initial, last name)
Richard S. Bucy, Professor, U.S.C. Calvin Hecht, T.R.W. Systems -Kenneth D. Senne, Captain, USAF 6. REPORT DATE
'a.
TOTAL NO.
May 1972 0a.
CONTRACT OR GRANT NO
OF PAGES
7b. NO. OF REFS
5
4
ga. ORIGINATOR'S REPORT NUMBER(S)
SRL-TR-72-0004 b. PROJECT NO.
7904-00-17 C.
DRS 61 102F d. 10
9b. OTHER REPOR'
NO(•) (Any other numbers that may be assi•ned
this report)
BPAC 681307
AD-
OISTRIBUTION STATEMENT
Approved for public release; distribution unlimited. I1
SUPPLEMENTARY NOTES
12. SPONSORING MILITARY ACTIVITY
Frank J. Seiler Resear.ch.Laboratory (AFSC USAF Academy, Colorado 9O840 " 13,
AD!TRACT
A comprehensive treatment of numerical approaches to the solution of Bayes Law has been included, describing numerical methods, computationa] algorithms, two example problems, and extensive numerical results. Bayes Law is an lintegral equation describing the evolution of the conditional probability distribution, describing the state of a Markov process, conditioned on the pp;t noisy observations. The Bayes Law is, in fact, the general solution to the discretL. nonlinea: estimation problem. This research represents one of the first successful attempts to approximat the conditional probability densities numerically and evaluate the Bayes integral by quadratures. Tht methods of density representation studied most thoroughly include Orthogonal Polynomials, Point-Masses, Gaussian Sums, and Fourier Series. For example problems two secpnd-order systems have been studied. The first problem involves a passive (bearings-only) receiver with geometry similar to the AWACS. The second example involves the reconsLruction of a second order phaseprocess which is the message process for a phase-modulated communication system. The various forms of the nonlinear estimates are compared with the phase-locked loop de.iodulator and extensive Monte Carlo simulations are described to provide high confidence numerical comparisons. A chapter is devoted to elaborating on the Monte Carlo methods employed for the computer simulations and a general-purpose, high-quality random number generator is introduced which is exactly realizable on any binary computer for comparisons of the digital and hybrid computer architectures to the Bayes-Law algorithms.
DD I
NOV
.1473
.1-,
UNCLASSIFIED Securitv ClassiIIcation
UNCLASSIFIED Security Classification LINK A
14.
LINK 8
KEY WORDS
LINK C
-
WT
ROLE
WT
ROLE
ROLE
WT
Markov Processes Nonlinear estimation Filtering Prediction Time series Simulation Bayes Law Monte Carlo nethods Orthogonal Series Quadratures Numerical Curve Fitting Random Number Generators Parallel Processing Hermite Polynomials Gaussian Sums
Point Masses Passive Tracking Bearings-only Receivers AWACS Phase Modulation Demodulation Phase-Locked Loops
I
I
I
_
i
_ _i
I
UNCLASSIFIED Security Classification
.
An Engineer's Guide co Building Nonlinear Filters
by
Richard S.
Calvin Hecht,
Bucy,
U., 0.
T.R.W. Syotems
Kenneth D. Senne, Capt.,
U.S.A.F.*
Final Report Frank J. Seiler Research Laboratory Project 7904-00-37 May 1972
* After June 1972:
Staff Member, M.I.T. Lincoln Laboratory
ii
Publication Notice
The authors are submitting portions of the enclosed material to the IEE9 Transactions on Automatic Control, while other portions will be published in Stochastics.
ii
Foreword Ever since the publication of the papers by ',alman and Bucy in 1960 and 1961, the engineering community has made giant strides in applying their ideas concerning linear estimation to an essentially uncountable number of space and control problems.
The widespread and
almost instantaneous acceptance of the Kalman-Bucy filter was due in part to a number of factors:
the then common frequency domain
synthesis procedures were limited to time-invariant (steady-state) problems, the advent of the digital computer led to easy simulation of time functions, whereas frequency calculations were indirect and A
less convenient than "state-space",
and most important, the newly
accelerated man-in-space program led to an immediate real-time application, the most significant caLalyst for engineering developments. The fact that most applications were not linear systems did not diminish the enthusiasm for the new linear theory.
Almost over night,
under the pressures of ever-present engineering deadlines,
a host of
approximate approaches to the nonlinear problems were generated with many variations and extensions, but most of which were patterned to look much like the highly successful linear estimator of Kalman and Bucy. In some applications these approximations produced highly saLisfactory results, but in most situations a second design phase, less publicized but equally important, ano.molies, estimators.
was carried out in order to eliminate the
instabilities and -incxplained peculiarities of t*-.= A seemingly endless stream of technical papers dealing
with adaptive techniques and examples of clever but non-generalizable
iii
"engineering fixes" for the multitude of undesirable characteristics of the approximations have appeared during the lest 8-10 years,
resulting
in an apparently bottomless bag of engineering tricks for the estimation engineer.
In addition, the technical access to the original articles
was evidently unsatisfactory since, for a variety of reasons, many authors have devoted tutorial articles explaining how the same equations for estimation can be derived from a ,ariety of points of view, including Bayes rule (the original concept), likelihood and a host of others,
least squares, maximum
The results would fill
many vo.umes
of "handbooks" for the engineer while neither answering nor posing the questions - where do we go from here?
or - do we reall
understand
nonlinear estimation? On the other hand, it
is fair to say that nonlinear estimation
theory, as advanced and generalized as it has become, pub.-
has not been
'zed and advertised as a general panacea for the estimation
problems of the future.
A small but growing contingent of university
researchers has been working steadily on problems involving mathematically subtile concepts such as stochastic integrals, diffusion processes, etc.,
Ito calculus,
and a fe, elegant and mathematically
sophisticated theorems have been proved concerning general representations for nonliner estimators.
But the very difficulties which
frustrated the application of these "solutions" is their nature: estimates are determined to be related to the numerical solution of infinite-dimensional partial differential equo'tionis - a formidable task in the simplest of problems,
and thus, understandably,
the paths
of estimation application and nonlinear estimation theory have continued to diverge.
On the rare occasion that representatives of the two
iv
factions have met much needless antagonism has resulted.
While it
is
true that the questions most frequently asked by the applications engineers rarely deal with the optimality of estimates but merely with the computational tractability, and vice versa for the theorists, it appears that a partial reconciliation on both sides seems likely to provide mutual enrichment, Specifically, what is proposed here is that all interested parties should stand back for a moment and reflect on the question - what, exactly, are the characteristics of optimal, ;onlinear estimators?
The
answer to this and related questions provides the motivation for the work described in this paper.
We ask thac the applir,^.tions engineer
temporarily suspend his seemingly uLattainable computational constraints and that the estimation theorist pause briefly from his esoteric pursuits involving the subtleties of measure theory and look for awhile at some examples of optimal, discrete-time, nonlinear estimators. it
turns out, the discrete-time problem,
although far from trivial, is
at leasc tractable for solution on modern digital computers.) way, it
(As
In this
is expected that the benefits to both factions will be
considerable,
and perhaps their steady divergence may be curtailed.
In effect, it
is hoped that the engineer engaged in the daily
process of fitting old tricks into new computers will begin to ask whether or not the basic principle ot the trick is applicable to the specific application at hand - perhaps a simple modification or another approach would be much more satisfactory.
Such realiz.tions are
generally obtained at the expense of serious time delays and costly experimental failures - perhaps added experience with optimal nonlinear estimators could provide more effective and less expensive guidelines.
V
=.;.
In addition, it
is hoped that the theoretician can appreciate the
reality of concrete exampJes and thereby refuel his fire regarding theorems concerning the characteristics and asymptotic descriptions of optimal nonlinear estimates.
Finally, it
is intended that certain
applications of nonlinear estimators be considered for which the optimal solution can itself be considered practical, if not via present day realizations,
then perhaps by special purpose hardware designed
with optimal estimation in mind.
This paper has been written to
instill some enthusiasm in the reader for all of these expectations.
vi
Acknowledgements
The authors are deeply indebted to the generous and enthusiastic support of the U.S. Administration. the Frank J.
Air Force and to the National Aeronautics and Space
The research described herein has been performed at
Seiler Research Laboratory, Air Force Systems Command,
at
the University of Southern California Department of Aerospace Engineering (Air Force Office of Scientific Research Grant AFOSR-71-2141), conjunction with Electrac Corporation (NASA Grant NAS5-10789,
and in 1970).
Although many persons must be acknowledged regarding this research and the associated e):periments, Colonel Bernard S. Morgan,
perhaps the two most important figures are Jr., USAF,
and Major Allen D. Dayton, USAF,
who had the insight and intuition necessary to introduce the authors of this paper to each other in the first place. fortunate,
It would be very
indeed, if such technical interchanges and mutual cooperation
were to -ontinue to be encouraged among government sponsored and univer.iity researchers.
vii
Table of Contents
Publication Notice
ii
Foreward
iii
Acknowledgements
vii
Table of Contents
ix
Table of Figures
xiii
Table of Tables
I.
I
III.
xvi
l.nttroduction
1
References
6
Bayesian Estimation:
The Problem
9
References
21
Finite Dimensional Approximations
23
A.
Introduction
23
B.
Orthogonal Series
24
1. Least Square Polynomial Approximation, Scalar Case
25
2.
Least Square Polynomial Approximation, Multi-dimensional Case
31
3.
Gauss-HermitL Integration
34
4.
Application of Polynomial Expansions to a Two-
5.
Dimensional Filtering Problem
39
Applying the Hermii:e Expansion
50
Preceding page blank ix
Page
IV.
C.
The Point-Mass Approximation
64
D.
Non-Orthogonal Series - Gaussian Sums
68
E.
Other Computational Methods
74
1. Fourier Sezies Expansions
74
2.
75
Spline Functions
References
77
Monte Carlo Analysis Techniques
79
A.
Introduction
79
B.
Statistical Analysis
80
C.
An Example
85
D.
Conclusions
94
References Appendix.
V.
96 A Machine Independent Random Number Generator
A.
Introduction
B.
An Example
97 100
Parallel Computational Techniques
115
A.
Introduction
115
B.
Parallelism and Bayes Law
115
C.
Look-Ahead Processors
122
D.
Array Processors
124
E.
Associative Processo"s
125
F.
Pipe-Line Processor
125
G.
Hybrid Computer Methods
127
Refe.ences VI.
97
A Passive Receiver:
A.
129 Bearings-Only Tracking (AWACS)
131
131
Introduction
x
136
3B. The Linearized Estimator
C.
137
Application of Nonlinear Filtering
140
References Appendix A.
First Monte Carlo Experiments - Point Masses 141
Versus Linearized Appendix B.
More Recent Experimental Results:
Point
Masses Versus Gaussian Sums
Appendix C. A Movie of Conditional Densities VII.
Example:
Optimal
onlinear Phase Demodulation
155 183
A.
Introduction
183
B.
The Linearized Filzer
184
C. Application of Nonlinear Filtering
198
Referenced
205
Appendix A.
Numerical Experiments with the Phase207
Locked Loop Appendix B.
Numerical Experiments with The Hermite
213
Polynomial Expansion Appendix C.
Cyclic Point-Mass Experiments
Appendik D. A Fourier Series Experiment Appendix E. VIII.
147
A Movie of Conditional Densities
227 243 249 267
Conclusions
Bibliography
271
Resumes of the Authors
275
Richard S. Bucy
276
Calvin He~ht
21
Kenneth D. Senne
282
xi
//
Additional Appendix A.
A Two-Dimensional Point-Mass Program
for the Passive Receiver Problem Additional Appendix B.
A Gaussian-Sum Program for the Passive 355
Receiver Problem
Additional Appendix C.
A Gauss-Hermite Program for Implementing
the Two-Dimensional Phase Demodulator Additional Appendix D.
419
A Fourier Series Implementation of the
Cyclic Phase Demodulator Additional Appendix F.
379
A Point-Mass Program for Implementing
the Interpolating Version of the Cyclic Phase Demodulator Additional.Appendix E.
285
445
Some Unpublished Conference Papers
Referenced by this Report
465
R.S. Bucy, "Realization of Non-Linear Filters"
467
R.S.
M.J. Merri't, and D.S. Miller, "Hybrid
Bucy,
Computer Synthesis of Optimal Dis:rete Nonlinear 475
Filters" C. Hecht,
"Digital Realization of Non-Linear Filters"
K.D. Senne,
•
2i
"Computer Experiments with Nonlinear Filters"
505 513
Table of Figures
Page Chapter III Fig. 1.
Hermite Polynomial Bayes-Law Recursion
41
Fig. 2.
Coordinate Systems for Hermite Expansion
58
Chapter IV Fig. 1.
Example of a Questionable Monte Carlo Cumulative Average Sample Path
Fig. 2.
84
Probability Density and Distribution of the Asymptotic (N-.o)
Kolmogorov Statistic
91
Algorithm for Partitioned Uniform Generator
101
Fig. 1.
Serl -i Ev&" ýtion of a+b+c+d+e+f
117
Fig. 2.
Maximally Parallel Evaluation uf a+b+c4d+e+f
118
Fig. 3.
Combining Ser.'al and Parallel Computations
121
Fig. 1.
Typical Passive Receiver Geometry
133
Fig. B-1.
Typical Geometry of the "Old Problem" -
Fig. A-1. Chapter V
Chapter VI
Illustrates Periodicity of Errors
149
Fig. B-2.
"New Problem" Geometry without Periodic Errors
151
Fig. C-i.
Detection Geometry in the Presence of Multipath
156
Reception Fig. C-2.
A Priori Density Resulting from Multipath Detection Aimbiguicy
xiii
158
Fig.
C-'3.
A Typical Sample Path Resulting from the Multipath Detection Ambiguity
Fig.
C-4.
159
Absolute Error Performance of Optimal and Linearized Predictors for Multimodal Problem
181
Block Diagram of Linearized Phase Estimation
189
Fig. 2. SFig. 3.
Discrete P Error Variance Discrete P122 Error Variance
195 19
Fig.
4.
Discrete P12 Error Variance
197
Fig.
5.
Torus Interpretation of Doubly Cyclic State Space
201
Fig.
A-1.
MSE Performance Summary
209
Chapter VII Fig.
1.
Fig. A-2.
Fourth Moment Divided by Three Times the Squared Variance for the Phase-Locked Loop Error
211
Fig. B-1.
Hermite Expansion Error Summary
-
P(o) = 4P(-)
218
Fig. B-2.
Cumulative Statistical Variance
-
P(o) = 4P(-)
219
Fig. B-3.
Portion of Sample Function No. P
Fig. B-4.
1
(o)
Fig.
B-5.
B-6.
220 6 -
= 0.3025
221
Error Variance for P at P 1(o) = 4P
Fig.
-
= 0.3025
Error for Sample Function No.
P 1 (o)
6
(o)
-5.2 dB Starting
(o)
223
Cumulative Statistical Variance for P 1(o) = -5.2 dB
224
Fig. C-1.
Nonlinear Filter Summary (Enlarged)
231
Fig. C-2.
MSE Improvement of Nonlinear Filters over P.ase-Locked Loop
xiv
235
Page Fig. C-3.
MSE Difference from Ideal Linear Analysis
Fig. E-1.
A Typical Sample Path of Densities Evolving
237
250
in Time
xv
Table of Tables
Page Chapter II Table 1.
Model for Bayes Rule Conditional Density 18
Recursion Formula Table 2.
Conditional Density Recursion formulae for Bayesian Estimation
19
Outline of a Gaussian-Sum Recursion Procedure
70
Chapter III Table 1. Chapter IV Table 1.
1
The Normalized Standard Deviation 2
as a
function of P and N
82
Table 2.
Monte Carlo Moments of Gaussian Generator
88
Table 3.
Testing the Hypothesis of Gaussian Moments
88
Table 4.
Pr(KN < A) from Massey
90
Table 5.
Results of Kolmogorov Test
92
Table 6.
Sampled Correlation Function
93
Table 7.
Uncorrelatedness Test
94
Table A-i.
Partition Requirements for m=36 bit random numbers
102
Table A-2.
Partition Examples
103
Table A-3.
initial Sample Path - Sequence One
105
Table A-4.
Initial Sample Path - Sequence Two
106
xvi
Page Table A-5.
Repeat Characteristics of the Generator for 107
each Sequence Table A-6.
Table A-7.
FORTRAN-II Coding Examples Two-Piece Generator
108
Three-Piece Generator
109
Four-Piece Geiterator
110
Six-Piece Generator
ll
FORTRAN-II Coding of Gaussian Generator
113
Chapter VI Table A-i.
Monte-Carlo Performance of the Optimal and Linearized Predictors
Table A-2.
142
Monte-Carlo Performance of Optimal and Linearized Filters
143
Table A-3.
Monte-Carlo Confidence Intervals for Predictors
144
Tible B-i.
Monte-Carlo
Average Sum Squared Error
Performance for Predictors - Old Problem Table B-2.
149
Monte-Carlo Averaged Sum-Squared Error Performance for Predictors - New Problem
132
Chapter VII Table 1.
Summary of Continuous Linearized Kalman-Bucy 185
Filter Table 2.
a.'*Ie A-i. "ta*le B-i.
Summary of Discrete Linearized Kalman-Bucy Filter
191
Confidence Intervals for the Linearized Filter
212
Numerical Valuet3 for the Computer Simulation
216
xvii
Page Table C-i.
Monte Carlo Mod 2n Error Performance Data for the Cyclic Point-Mass Estimates
Table C-2.
Monte Carlo Imprcvements - Cyclic Point-Mass over Phase-Locked Loop
Table C-3.
233
234
Monte Carlo Difference Between Cyclic PointMass and Ideal Linear
234
Table C-4.
Timing Estimates
238
Table C-5.
n/m Constant
239
Table C-6.
n Constant
240
Table C-7.
m Constant
241
xviii
I.
Introduction
intended to serve as a record of specific experiments
This report is
with feasible rea]izations of optimal nonlinear estimators. it
In addition,
is expected that future research along the lines described herein will
continue to result in increased understanding of the behavior of optimal estimates and, possibly, in some guidelines for actual realizations in particular applications. The underlying thread of continuity connecting all segments of this research is Bayes-Law (See Chapter II),
the general solution to the
discrete-time nonlinear estimation problem.
Bayes-Law is in effect the
discrete "representation theorem" (see Bucy and Joseph [7]).
Some of the
earliest attempts to realize Bayes Law on the digital computer involved orthogonal series representations of the conditional densities employing Gram-Chalier series (23], however,
or Edgeworth expansion (22].
An early problem,
concerned the tendency for truncated expansions to become
predominantly negative resulting un unavoidable numerical instabilities. In 1969 Bucy [4] proposed a pcint-mass approximation to density functions which involved a selection of important points on a "floating grid" and the centering of mass on the selected points.
The point-mass
approximation was of course always positive, easy to implement, numerically stable.
The computational burden was unfortunately pro-
hibitive for high dimensional systems,
however, and many short-cuts and
simplificadions were -lade by Bucy and Senn. the computations tractabie. by Bucy,
Geesey,
and
[10],
[20] in order tc make
A ;assive receiver -,roblex was introduced
nA- Senne [6: to illustrate the concerts of point-rass
approximation and the associated problems.
:MT
2 In an independent effort simultaneous to the above work, Alspach and Sorenson [2],
[21] pioneered an approach based on a nonorthogonal
series of Gaussians densities,
originally chosen to be set down in such
a way so as to minimize an L criterion, but subsequently to be deterp mined via a 3imple approximation, again in order to provide a short-cut to the otherwise prohibitive computations. In 1970 at the Nonlinear Estimation Symposium in San Diego Bucy and Senne [9] and Alspach and Sorenson [2] each described their respective approaches to the Bayes-Law compucations,
thereby proviing the impetus
for a sizable new interest in the computations associated with nonlinear est:imation. During the last two years, a multitude of topics related to BayesLaw computation have emerged.
Edison Tse [24] has noted a link between
the previous two methods involving a Fourier transform translation theorem due to Wiener [26].
Julian Center [11] has observed the relation-
ship between generalized least-squares projection and seties expansions of density functions.
Hecht [13] [14] has taken a much closer look at
orthogonal polynomials - notably Gauss-Hermite expansions. and Miller [8],
Bucy,
Merritt,
[18] have discussed hybrid solutions to reduce the serial
computational burdens of Bayes Law by substituting the natural of hybrid computers.
parallelisms
Another promising approach to the approximation
problem involving generalized splines has been studied by deFigueiredo and Jan [12],
[15],
while Weinert and Kailath [25] have been relating
splines to least-squares approximation, projection.
Thus the subjects of
numerical methods and optimal nonlinear estimation are now firmly entrenched.
Meanwhile, still another practical application of Bayesian estimation has recently been studied by Mallinckrodt, Bucy, and Cheng [17], by Hecht
[14], by Bucy [5], and by Senne [19), and is
reviewed in the current report.
The new application involves demodulation
of phase-modulated signals observed in additive white noise.
Since the
nominal engineering solution to such problems involves th. well-known and reliable phase-locked loop, it appears that the demodulation problem will continue to provide an important comparison between moment series approximations and numerical density approximations. Moment series approximations are, of course, the most commonly encountered nonlinear estimates in engineering practice today.
The
simplest moment approximation has been referred to by the names "extended" or "linearized" Kalman-Bucy filter [7].
The appeal of such methods is
highly warranted in many problems, since the non'inearities are not severe (i.e. they don't have arbitrarily large derivatives), frequently the assumption of Gaussian noises is cdequate.
and
Whenever
either or both of the "well-behaved" assumptions It false, however, considerable controversy has resulted. order moment expansions,
Some have advocated higher-
[3], others have proposed 3daptive noise
tracking techniques [16], or finite-memory filters j16], but generally nobody seems to ask the most fundamental question:
What characteristics
of the filtering problem have led to the demise of tbe Aimple firstorder method?
Or, equivalently, how would an optimal estimate behave
in such a situation?
The answers to these and othe. iuz:.cions are
directly addressed in the present report.
4 The existing organization of the report was necessitated due to time constraints, and although many aifferent topics are addressed, there is occasionally some duplication.
The attempt was to assemble
a chronolog of the more significant resulrs of the authors during the past three years into one source,
thereby providing a focal point
for subsequent research in the field.
We apologize beforehand for any
unavoidable difficulties for the reader caused by the presentalion of the material.
The global organization of the chapters is as follows:
Chapter II contains a simplified summary of a derivation of the principal Bayes-Law formulas usAd throughout the report. tation is
The presen-
taken principally from the dissertation of Hecht [14].
Chapter III provides a background for the various proposed numerical representations of the conditional densities.
Covered in greatest detail
are the orthogonal series, exemplified by Gauss-Hermite and Fourier, and the point-mass representatitn of Bucy and Senne [10].
Discussed
in lesser detail are the nonorttnogonal series (such as Gaussian sums) and generalized spline functions. In Chapter IV the important topic of Monte Carlo simulation is treated in considerable depth. perimental confidence is
In particular,
the subject of ex-
treated in great detail and an example of the
analysis is given involving the testing of the gaussian random number generator (which is realizable on any binary computer) to determine its statistical properties.
Thus the reader -. s left with a complete
understanding of the experimental methods used for this report. Chapter V describes another important side-light of the current investigpcion - computer architecture.
The concepts of parallelism
and asynchronous computation are introduced and the Bayes estimation
5 problem is interpreted ii light of parallel digital architecture. At first the "ideal" machine is postulated for computation of Bayes law. Then, as allowance is made for technical feasibility and current computer architecture, obseivations are made concerning efficient use of such structures as array processors (like Illiac IV), machines (like CDC Star),
pipe-line
look-ahead machines (like CDC 6600 or i.;O),
associative processors (like Goodyear's), and multi-processorn the Burroughs D-Machines).
Finally, some consideration is
(like
given to
the currently available hybrid computer systems - examining their intrinsic parallelism. Chapters VI and VII provide the details concerning the two examples studied in this work.
The first example deals with a receive-only
tracking system or a passive tracking receiver, which attempts to locate a target on the basis of bearing information imbedded in additive noise.
The problem description is ver- similar to the Airborne Warning
and Command System (AWACS), for the Air Force.
which is currently under contract development
The other example deals with phase demodulation,
whereby a phase-modulated signei is observed in additive noise and it
is desired to retrieve thb original nessage process - at least
modulo-2ir. Chapter VIII contains a brief conclusion concerning the Bayes estimation research and indicates some paths for future developments. The appendices provide documentation on some of the computer programs used and some unpublished technical papers relevant to the current research.
6 References
1 1] D. L. Alspach, "A Bayesian Approximation Technique fnr Estimation and Control of Time Discrete Stochastic Systems," University of California, San Diego, 1970.
1 2]
Dissertation, Dh.L
D. L. Alspach and H. W. Sorenson, "Approximation of bDenIsty Functions. by a Sum of Gaussians for Nonlinear Bayesian .'Simation," Proc. Symp. on Nonlinear Estimation Theory and Its Appli-a'Ions., San Diego, Eept. 1970, 19-31.
3]
R. W. Bass, V. D. Norum, and L. Schwartz, "Optimal Multichanne.. Non-Linear Filtering," J. Math. Anal. Appl. 16 (1966), 152-164.
4]
R. S. Bucy, "Bayes Theorem and Digital Realizations for NonLinear Filters," J. Astro. Sci. 17 (1969), 80-94.
51
R. S. Bucy, "Realization of Non-Linear Filters," Proc. Second Symp. on Nonlinear Estimation Thpory and Its Applications, San Diego, Sept. 1971, 51-58.
6]
R. S. Bucy, R. A. Geesey, and K. D. Senne, "Passive Receiver Design via Nonlinear Fi-tering Theory," Proc. Third Hawaii International Conf. on System Sciences, Vol i, 1970, 477-480.
[ 7]
R. S. Bucy and P. D. Joseph, Filtering for Stochastic Processes with Applications to Guidance, Wiley Interscience, New York, 1968.
[ 8]
R. S. Bucy, M. J. Merritt, and D. S. Miller, "Hybrid Computer Synthesis of Optimal Discrete Nonlinear Filters," Proc. Second Symp. on Nonlinear Estimation Theory and Its Applications, "San Diego, Sept. 1971, 59-87.
.
9]
R. S. Bucy and K. D. Senne, "Realization of Optimum DiscreteTime Nonlinear Estimators," Proc. Symp. on Nonlinear Estimation Theory and Its Applications, San Liiego, Sept. 1970, 6-17.
[10)
R. S. Bucy and K. D. Senne, "Digital Synthesis of Nonlinear Filters, "Automatica 7 (1971), 287-298.
[Il]
J. L. Center, "Practical Nonlinear Filtering of Discrete Observatio.is by Generalized Least Sqiiares Approximation of the Conditional probability Distribution, "Proc. Second Symp. on Nonlinear Estimation Theory and Its Applications, San Diego, Sept. 1971, 88-99.
[12]
R. J. P. deFigueireio and Y. G. Jan, "Spline Filters, "Proc. Second Symp. on Norlinear Estimation Theory and Its Applications, San Diego, Sept. 197j, 127-138.
7 References (Cont) [13]
C. Hecht, "Digital Realization of Non-linear Filters, " Prc. Second Symp. on Nonlinear Estimation Theory and Its Applicadions, San Diego, Sept. 1971, 152-158.
[14]
C. Hecht, "Synthesis and Realization of Nonlinear Filters," Ph.D. Dissertation, University of Southern California, 1972.
[15]
Y. G. Jan, Ph.D. Dissertation, Rice University,
[16]
A. H. Jazwinski, Stochastic Processes and Filtering Theory, Academic Press, New York, 1970.
[17]
A. J. Mallinckrodt, R. 3. Bucy, and S. Y. Cheng, "Final Project Report for a Design Study for an Optimal Non-Linear Receiver/ Demodulator,'NASA Contract NAS5-10789, Goddard Space Flight Center, Maryland, 1970.
[18]
D. S. Miller, "Hybrid Synthesis of Optimal Discrete Nonlinear Filters," Ph.D. Dissertation, University of Southern California, 1971.
[19]
K. D. Senne, "Bayes Law Implementation: Phase Estimation,, Proc. SWIEEECO Conf.,
[20]
K. D. Senne and R. S. Bucy, "Digital Realization of Optimal Discrete-Time Nonlinear Estimators", Proc. Fourth Annual Princeton Conf. on System Sciences, Princeton, March 1970, 280-284.
[211
H. W. Sorenson and D. L. Alspach, "Recursive Bayesian Estimation using Gaussian Sums," Automatica, 7 (1971), 465-479.
[22]
H. W. Sorenson and A. R. Stubberud, "Non-Linear Filtering by Approximation of the A Posteriori Density," International J.
*
Control, 8 (1968), [23]
1971.
Optimal Discrete-Time Dallas, April 1972.
33-51.
K. Sri-,;vasan, "State Estimation by Orthogonal Expansion of Probability Distributions," IEEE Trans. Auto. Control, AC-15
(1970), 3-10. L24]
E. Tse,
"Parallel Computation of the Conditional Mean State
Estimate for Nonlinear Systems," Proc. Second Symp. on Nonlinear Estimation Theory and Its Applications,
San Diego,
Sept. 1971, 385-394. [25]
H. L. Weinert and T. Kailath, "Recursive Spline Interpolation and Least-Squares Estibation," submitted to Amer. Math. Soc., 1971.
[26]
N. Wiener, The Fourier Integral and Certain of Its Applications, Cambridge, Cambridge University Press, 1933 (Also: New York,
Dover, 1958).
9
II.
Bayesian Estimation:
The Problem
Although the equations for Bayesian estimation are relatively well known, having been derived for example by Bucy[i], derivatior. is
[2],
a modified
included in this chapcer for the sake of introducing
relevant notation and to make the present exposition as self-contained as possible.
This presentation is
taken from Hecht [3].
The discrete-time process and measurement equations may bh written as -n
n-1
Z
h(x) + v
-n
-
n-l (X-n-
)n-i
(i)
---
n
(2)
-n%
Equation (1)
represents a discrete time signal process with
x
sequence of
d-dimensional random vectors; the subscript
refers to
time.
(xn)
R
d
to
is x
r
independent
a function from
-n-l-n-
matrices.
Rd
to Rd
The process
o(xn)
{u n
a function from is
r-dimensional random vectors with density
ran-om vector
c
and has density
is
d-dimensional,
n
a
independent of the
a set of p
Un
u
(w).
The
process,
p (x).
c
Equation (2) represents the observation process, with z a sequence of s-dimensional random vectors, h(x ) a function from to Rs
and
{v n
a set of independent
-n n=1,... vectors with density pv (0) : independent of
s-dimensional random
c
and the
u
process.
n The filtering problem consists of determining the conditional density, given as
Preceding page blank
Rd
10 Jnit(y)dy
=
P(x•edylj
F
z
(3)
Results are stated in the same notation as Bucy [2).
The following
notation will be used in discussing the derivation of the conditional densities. Underlined lower-case Latin letters denote the name of random variables or random vectors, and related Greek letters represent the dummy argument associated with the density or distribution functions. PH)= probability density function P(.) = probability distribution function (The above functions are referred to briefly as "density" and "distribution" respectively.)
Thus, for example, x•
xn, with
the density function of the random vector
=
n
as dummy argument.
F Px (Jz• n1zn n
=
the conditional density of the random vector given the random vector
z
xn
has taken on the value
n-PX (E
zZn
n
) is a function of
•
The above is frequently abbreviated to with the argument
C
px (in'zn) n n'-n--n1) xn p aX a1 n•,
implied.
9
of the random vectors
x -n
and
and
ý
.
= Px (4nlFn' n the joint density
x_..
-n-Vl
Using the ;oove notation (3)
is
Jnlt(-)dJ = px (ýnIKt=Lt
.. z =_1o)
It is noted in Bucy's original work that the signal process is process with transition density
a Markov
11
Px ri-F
(lir+PjIýJ,&j
5)
*
The required recursion relations for the conditional density Jn~n
are
given by the following equations.
n Flfdfp P.(n-h)(i
-V(_L
n
(n4l -) d•_l,
(6)
n
with
Pv 0 [_C-.h(o)]p(x) W
=
K(o) = pz0(L_)
(7) (8)
,
K(n) = p z (Q-n 1-7n-l''•
(9)
n
J
n-n+l' =
n)
(10)
r(fdjf-r!'-in-n)Jnjn(-n)d4n
x
n+r n L+r
J
ou
n+l-(n) ]Pv
-
]nI n-ih(Yn
) dL&n (11)
where previously undefined symbols have the following meaning: ff
)LX=fdf.-)dx
and pOUnl(o)
,..dxd
=
d integrations
=
thq density of the d-dimensional random vector
G(x)n-
_1.un-
The derivation o0 (6: threugh (11) follows:
J0 1 0 ()d
by Bayes rule.
0...
0
(12)
-m -lm
-m'-
A~
X.
12
pZ
(
AJ=i) corresponds to the distribution P(4
412io-1)
which corresponds to Pv (4-hýI(j
3
)],
or
pz0(LjXO=')=Pv [:0 -h(i.%)
-
(13)
Substituting (13) into (12) gives
which is (7).
J njnnýýdd-n
Pv [.4-x(.%)Ipx (JO)
1
"Px0
Next, using relations of conditional and joint densities,
n(ýnI.4n"**1.E
=Px
PZ, ...
-Z
5-
z0 (ý'
,
Z-l
K(n) JP
-'4) I.-.
n ) zn x
Now operating on the integrand,
l0 n-l, ...'. 0
1
n-l(ý-n'Lnllzn-l'
.....
'4)d~n-1
13
P•,~
zn_(-1''--)
~
n-i
xi
u Next, by indepen'dence of the -n
and v"-n processes and the Markov
property, nx-n Pz (;-
Pz (; 1% 'x-11-V-•-l'"-vO)
(16)
n
n
which may be manipulated using the corresponding distribution function. .n Xn4 P(zn
ii/
t4
U
19
'Ve
x.x
w 0 oq w
I
I
-
P6
E-4
H
4.)o
r.-II I
0 VI
C:)-
04
c-ap
I
-
20
4$4
44
4
0 0
*
00
0
0
0-I
>1
••1 0•*r-l
o
•o
44
>
0U4 •
0•
Co
4-)
€..4
4,.)
0
H"'
H,-
I
4J
4.)
0 04
14
o
•
4.
0c 2 00~
4J
0
rU
0
co
o
•
04J •-
I.)
*,
0
(
o
.l041 04
•
o4-
-
4J*
I1
I
00 00C
7-4
0
. 0]
C
Co.0
0 .
,,0 0
n
4,
'-ý
21 References [1]
R.S. Bucy, "Bayes Theorem and Digital Realizations for Nonlinear Filters," J. Astro. Sci. 17 (1969), 80-94.
[2]
R.S. Btcy and P.D. Joseph, Filtering for Stochastic Processes with Applications to Guidance, Wiley Interscience, New York, 1968.
[3]
C. Hecht, "Synthesis and Realization of Nonlinear Filters," Ph.D. Dissertation, University of Southern California, 1972.
23 I
Finite Dimensional Approximations A.
Introduction reviewed in detail in the
The formal Bayes-Law calculations, previous chapter,
are functionally simple but their implications to
computation are formidable.
To begin with, the representation of
densities is in general an infinite-dimensional problem.
Although
there are many examples of density families characterized by a finite number of parameters,
the Bayes-law computation rarely reproduces
another member of the same family.
The most well-known exception is
the Gaussian family, which is reproduced under linear transformations, resulting in the widely used Kalman-Bucy filter, which optimally describes the Bayes-law computation in terms of linear differenLial covariance of the Gaussian
(difference) equations fcr the mean and
conditional densities, provided that the underlying physical system is described by linear differential (difference) equations with additive Gaussian inhomogeneities.
If
the physical plant is not
linear or the disturbances are not Gaussian, however, the Bayes-law rarely leads to a reproducing family of densities.
Thus an improvised
or unnatural parameterization of the densities is required in order to an infinite number of parameters is
implement Bayes-jaw and, in general, required for exact representation. then is simply staced:
The numerical approximation problem,
how do we choose an appropriate (finite)
subset of the para.c'ers to represent a given collection of conditional densities?
Since the answer to this fundamental question depends
heavily on the densities in question, it
turns out that little
can
be said about the applicability of a given approximation without dis'ussing specific example problems.
Preceding page blank
Even for a given problem there
24 may be several different appropriate numerical approximation schemes, depending upon the problem parameters. will be discussed in In
Exampl•.s
of these dependences
Chapters VI and VII.
the present paper we will discuss some representatives from
several categories of density approximations as well as their Bayes-law implementations and associated difficulties. discuss orthogonal series,
First, we
covering a candidate with unbounded support
(Gauss-Hermite polynomials - Section B) and also a candidate with compact support (Trigonometric series - Section E). Next, we discuss an approach involving nonorthogonal polynomials which is
intended to provide positive density approximations for any
finite number of terms (Section D).
Thirdly, we discuss an intuitively
simple approach to density approximation involving point masses (Section C),
suitably distributed so that most of the probability is
covered by a small number of discrete points.
adequately
Finally, we describe a
relatively recent additional approach to the problem using numerical spline functions (Section E).
The presentation of this paper is
only to be representative and not exhaustive,
intended
since there are many approaches
to numerical approximation which have yet to be considered. B.
Orthogonal Series
The general theory of orthogonal series may be found in many places in the literature (see,
for example,
[10]).
intRend only to provide two contrasting examples, thr. significance tion at hand.
If
In
this section we
whereby we illustrate
of the type of state-space required for the applicathe state vector can take on all values in
with positive probability,
then orthogonal polynomials must be used
to provide the necessary approximation.
On the other hand,
if
25 the state-space actually is or can be approximated as compact then a periodic function (such as trigonometric) might profitably be employed as a basis for an orthogonal series.
It may happen that a
given problem may be interpreted in either way (see, for example, Chapter VII),
so that more than one orthogonal series may be
appropriate, depending on the performance desired.
1.
Least Square Polynomial Approximation, Scalar Case
The theory given here follows Hildebrand [10], outline giving key results.
and is only an
Detailed proofs may be found in the
reference. We wish to approximate a function f(x) with a series of polynomials y(x),
as follows: n
(1)
y(x) =-0 ak~k(X)
£(x)
where 0o(X),...,n (x)are the required polynomial functions.
The
approximation is to be thebest in the sense that f bw(x) (f(x)-y(x)] 2 dx a
bf a
w(x)[f(x)-_
n
•k()]2d 0
K=O
(x W2k
= Minimum
(2)
with w(x) a specified weighting function which is assumed non-negative on the interval (a,b).
26 Equation (2)
imposes the condition on the coefficients ak,
b aar
J'w(x)[f(x)-
=
k=0 ad
(3)
(r2,l,...,n)
0
from which
b
n -0 ak
w(x)or(X)ok(X)dx =Jr
(X)f(x)dx
Wx)
(4)
(r=0,1,...,n)
The coordinate functions are chosen to be orthogonal to each other over the interval (a,b) with respect to the weighting function w(x).
(5)
r#k
0
iaW(X)0r(x)Ok (x)dx =
The "uncoupled" equations reduce to (omitting the argument x)
b
b 2
a
w0o dx
wO fdx
or
(6) b
a
a
=
r
aWOrfdx JWrfd
fbw2rdx
To construct the polynomial functions 0o(x),0j (x),
... ,0 r(X),
it
is required that the polynomial 0r(x) be orthogonal to all polynomials of degree inferior to r,
over the interval (a,b) with respect to
the weighting function w(x).
27
(x) qr-1l(x)dx
a
where q r
1 is
notation is
=0
an arbitrary polynomial of degree r-I or lese..
(7)
The
introduced
i
dr U (x)
r •
~dxr
r (r w
r
so that (7) becomes •
b0
_•U
a r
(r)q
xqr-1
(x)dx
=0
After r integrations by parts
[U (r-l) q -(r-2) r rl-
(,r-l , q(r-l)]b r-l) Ur_ a
The requirement that 0 (x) be a polynomial of degree r implies
r
that U (x),
r
I L2
from (8),
satisfy the differential equation
dr+l
1
drU (x)1
--
dxr1w(x)
J -
dx r
0(10)
(9)
S
28 in
(a,b) whereas the requirement (9) qr- (b),
qr-1(a),
qr_l(a),
q'r-1 )(b),
be satisfied for any values of etc. leads to the boundary
conditions r
UrI (a)
U(a)
Ur (b)
=
U (b)
For each integer r, ary conditions (11), is tions,
0
r (x),
Uri (a.
=
..
..
Ur(r-1) (a)
....
U(r-1)(b)=
r
0
a solution of (10) uhich satisfies the boundthe r thmember of the set of polynomial func-
given by
i drUr (x) r
w(x)
dxr
(12)
The numerator to compute the coefficient ar, given by (6)pis a function of f(x).
The denominator,
designated yr' is independent of
f and need be computed only once.
fr= br W
where Ar is
2(x) = (-l)r!rArbaUr(x)dx
the leading coefficient of 0 r(x).
Or(x) = Arxr + A
+rxr-1 ... + Ao
(13)
29 It is
shown, for use in the integration formulas, that if w(x)
does not change sign in (a,b), the polynomial
0 r(X)
possesses r
distinct real zeros, all of which lie in the interval (a,b). For application to the problem of Chapter VII of this paper we want to approximate n
f(x)
=
~ v(x) E br~r(x) r=O
y(x)
(14)
such that
L0-' 1_ý(X) !W-0f(x)
dx
b0rr(x)
minimum
(15)
which leads to the result
b br
which is
fOrdx Wf
-
(16)
2 equivalent to minimizing the squared error (f-y) with re-
In the application we let
spect to the weighting function w__2. v
22 (17)
w = v Me and the interval (a,b) is
,
which leads to the Hermite formulas.
For the above choice of w(x)
x drUr a 22 dxV
30 where Ur satisfies (equation (10))
dr+l
a•2 x2 drýUr
dx
22 dUr 1= 00
(18)
dxdxr and from the boundary conditions,
(11) requires U and its first r-l
derivatives to tend to zero as x++The function
(19)
x
Ur (x) = Cre
has the property that its rth derivative is the product of itself and Ai
a polynomial of degree r.
It therefore satisfies (18) and the boundary
conditions.
r (e-a x 2 dx
0 W)Creadx
(20)
The Hermite polynomial of degree r is defined by taking
Cr = (_l)r and
2 = 1.
2dr
H (x) Hr
e_2
(-l) rex 2d"(edxr
)
(21)
For C rr r 0r (X) = Hr (ax)
(22)
31 ttne Hermite polynomials are determined from the recurrence formula H Wx) = 1 0
HI(X) = 2x and Hr+1 (x) = 2xHr (x)
-
(23)
2rHr1 (x)
Equation (14) takes the form
y(x)
with a and b
22n -a ' x b H (ax) r=O r r
(24)
(from (13) and (16))given by 2r
SYr
2a
1Frr
(25) br
2 rr,
f f_
f(x)H (ax)dx r
by using (23) for Ar and (19) for Ur, with Cr as given above. 2.
Least Square Polynomial Approximation, Multi-dimensional Case We extend the above theory to multi-dimension functions.
To be
specific, and in accordance with the requirements of this paper, the theory is shown for a function of two variables.
Generalization to
higher dimension is straightforward (although messy, requiring double and triple indices to avoid awkward expressions). The approximation Is
32 ~
f(x 1 x 2 )
ml•=
mkm 22=ak•lm~
2
Y(xlx2 ) = P)
)
1
2
(26)
=OLl
with the requirement a2Wl(Xl)W2 ( x2 )[f(xlx )-Y(Xlx 2 2 )]2dx 1 dx 2 = minimum
(27)
Differentiating with respect to akik2 and setting the result to zero leads to in
m
1222
2
b1 b 2 =f Ia w (X1 )W2(X2) r (x1 ) 0 (X2 )f(xMx)dxldx2 2 1 r2 12 122
1
2
(28)
Using orthogonality properties identical to thcte for the scalar case
J a1 Wl0k 0r dx 1 1kr W2 A a2
gives for the coefficients
r dx2 2
2
0
=
k10 r1
0
k2Or
0
2
33
a
af2wlw 20rIr 2fdxldx2
r1r2lwl•2
dx a1
I1
2w2
(29)
" a2
2 2
(r 1 r 2 =O0,01,10,11,02,... , mm ) 2
The polynomial functions are the same ones used for the scalar case. The denominator of (29) is given by yr Yr2 where y my r as given by (13). The two-dimensional approximation that is needed is
f(
1xl 2)=
Y(XlIX 2)
V1(xI)V 2 (x 2)E =
~
br r
(x1)Or (X2)
r 01i2r
0 rl=U
2 =0
12
2
where, analogous to the scalar case,
b
I 1 2 r1
S~Letting
-aI =1
w2
v2
2 2 e -a2x2
a,
a2
-•
0
=b
b 1
2
bl
b2
wa/ fO
a2
f
0 d
d
(30)
34 and using the results of the scalar case,
22
Y(XX
2)
22_
2-a2 x2Th -a2x e -12 r1 r2 br r Hr (ai e l 2=0 12 1
Hr (a 2 x 2 ) 2
(31)
(-CO
0
in
pp 2.
f
x2kx 2 ke' 12X2e' le 2 Xdxjdx2 exist for all nonnegative integral
values of k.
L
- - --
- - -
48 3.
The integral in Equation (64) exists. =
-1- and a'
For the choice of a
----
2a 2
2a2
with a, and a2
the characteristic values of the covariance matrix, the first two conditions are obviously always true, and if decaying density function,
condition 3 is
of this section we associate f(x function in the form of (51)
1
f(xIx 2 ) is
true.
In
x 2 ) with an approximating density
in
(51)
Since the
is bounded for all values of
the arguments (and from reviewing the form of (62)), condition 3 is
the developments
with Jnjn-1 given by (62).
coefficient function of JnjnI
an exponentially
it is
clear that
true even for the approximating density function for any
value of m. Let f(xIx 2 ) be the density function we wish to approximate, let m=O.
Then
00
i
" f(xlx 2 ) dx 1 dX 2
H
*..2x
ala2 y (xlx2 )
=
-
Let al and a2
2 -A
I
e
e
be the variances of f(x
22
(66)
-
1O2 _x220 2 x2 -a•?X 2X2
~21 a
1%
and
(67)
1 x 2 ),
and let
49 E4uation (67) is
y(xlxx2 )
e
X.i
2 X2
e
(68)
We see that by taking only the zero'th term of the expansion we can get the best gaussian fit to the true density function, and also a verificatiun on the form of
the coefficient equation (64).
In addition to the above assumptions (except let m>O), let nj and the means of f(xIx 2 ), and let
be the i-jth central moment.
j
Then 4,j
(X1-nd)i (X2-n2)j f(x
=ff
x )
dx dx
(69)
The product of the Hermite Polynomials when expanding about the expected value can be written
HI ra 1 (4 1 -n\ HjaIQ(X2,-nl~ 2 ) = -r,'y•rz 1ai11a(X l-nl
j(xflnj
(70)
with rkaij a function of r, £, i and J.
Equation (64) is then rewritten as
br-
N.
2r r'21a2
-jj -C f fx2
a1a2
t
2rr!2 L£1w
i=O j=O
r
2 =O0 re a- ai J•(xl-ni)
a ij rk i ajal2Ii j
(x 2 -n 2 ) dx 1 dx 2
(71)
50 Comparing (63) with (60), we see the approximating polynomial expansion was accomplished by letting
2
2
'
2
1 A22
0
,
and using (71)
to compute the coefficients,
r~aii•
and (71)
in (70)
Hermite Polynomials,
V1
n1
brR.
,
12
v2
The cuefficients,
are functions of the coefficients of the
and can be computed one time in advance and stored.
The formula for generating the Hermite coefficients is given in the previous section. raj
Designating cmn as the coefficient of x
in Equation (70)
m
is determined for each r and for each k from rraiij
5.
in H (x),
=
ri Cj
Applying the Hermite Expansion
The general formulas of the previous section are specialized here for the case when the system equations are those of the phase demodulacor (see Chapter VII).
This section describes the techniques that were used
to mechanize the given equations.
In particular,
vi, the characteristic values A., and vectors,
t
the conditional means of the covariance
matrix, and the coefficients of the Hermite approximation, br2 are described in detail for this specific example. To perform the integrations to obtain the means and central moments (to compute the quantities in the above paragraph) Equations (51) (62)
are combined into one equation, omitting temporarily the normalizing
constant.
L
and
51
-"
JnIn (Y2
S (yly 2 ) Jnn_ 1-
(73)
,
2
where
S(yJy 2)
and Jnjn-,
expj
=
-
Ijjz1 -COS
Y1 )2+ (z.-sin
yl)2]1
(74)
given by (62), consists of two parts, an exponential
function and a polynomial function of the arguments y1 and y2 . The exponent of the ex[,onential part was a quadratic form; i.e., the form (-a jjx-x (73)
II
). To perform integrations of (73)
of
and Equat! n
times ylyl, the combined exponential function is put into the
following form in order to use the Gauss-Hermite formula:
1 r(Yl-ml)2 exp
[0--
-
2-
2P(y1-md)(y 2 -m2 ) -
with the exponent of S(y 1 y 2 ) of (73)
(y -1 2 )2 +
22F(yy
2)
(75)
(as given by (51)), linearized,
and the linear terms combined with the linear exponential terms of Jnjn(y1 y 2 ). 1
The error between the linear and nonlinear parts of the
exponent were combined with the polynomial part of Jnln-i to form F(yly 2 ).
This technique was suggested in Bucy, Geesey, and Senne [ 5].
52 The terms of (75)
mi
A =
E(yl)
m2
=
E (y 2 )
are
E [(y M1 )2 1 -.
01
or2
=
E [Y2-m2)
P
=
c'rrelation coefficient
E[ (Yl-md)(Y2-m2) ] =
~01
We operate on the exponential part of (74) [(zl- cos Yl)2+ (z2-
-
LL2r
-sin
(zlz2r COS Yj-
+
y,
as follows.
sin yl)2
Ccos- j + Co s y
sin y
sij
=
cos Y
+
Y
siny 1
sin y-
y1 sin y
31 Cos y +y 1 cos y
I
I,
iiEf n
z(
1
sin y l
()2 (76)
with
cos Y
(z
,
z(
)
53
exp
A LYIc 1-O
exp
(z2 -sin yI) 2]
~2+
L
(z
2.
b = 0
The multiplier
and r
99 need only be chosen so that the number of significant bits in is
in+1
approximately
m.
will force
into the next cycle, leaving little or no correlation between
adjacent numbers. any
In this way each application of (A-i)
r2S+l
For this article we shall choose
a = 8z - 1, or
m-bit number with the last three bits equal one.
left with the choice of
m.
The common choice for the word-length of the
word length.
Thus we are
1
is
the machine
n Such a choice automatically makes the generatoL machine
dependent, since the modular arithmetic contained in (A-I) will usually result in arithmetic overflows, the result of which is not predictable in general for all machines.
Consequently we will consider segmenting
the numbers into the form (m is evenly dividable by q) m(q-l) 1
= 11
m(q-2) 12 2 q + ...
2q
If we then segment
+ iq 2°
a
(A-2)
n
n
n
n
into \'he analogous pieces a_(q-1) + a q 20(A-3)
+ ...
a = al 2q
we may perform the operation (A-1) as follows:
First we compute
2in(q-i) a 1
= al.1 n
1
2
q
n
2in(q-2)
"+ (a
1
12 + a n
2
1 1) 2
q
+
n
2m
"+ (aq-1 1q n
+ aq
1q-1) 2 q + aq 1q 20 n
* See [1] for an equivalent discussion for decimal proposed techn'que applies only to binary machines.
(A-4)
n
machines.
The
%twi Z1
S
'74
!-1!b4
!100 Next we observe that a.l but the last half of the center term and the rest of the lower order terms are eliminated when we take the modulus leaving
2 m,
relative to
i (al lq + a2
1 q1
+ ...
+ aq 11) mod 2q
2
q2
n
n
n
n+l
Lm( f1--2) 2
+ aq 12) 2 q n
3 + (a5 iq + a iq-l + n n
(A-5)
+ .-. + aq Iq 20
the usual bracket (integer part) function.
where [.] is
see that the last half of the last term in (A-5) is
Finally, we
V iq 1
.he
remainder of the last term plus the last half of the next to last term is
iq-l
etc.
pieces of
i+l
We summarize the algorithm involved to identify the is
in Fig. A-I, where it
assumed that
ak
1in
and
are defined froin the previous operation or initially by a suitable seed. We see from studying (A-5) accomplish the update is only q
bits to add up
that the maximum precision required to plus the necessary carry-over
2q-bits
nuimbers of length
2m q
and q m of length M m one number
p = (log 2{q( 2 q - 1)2+ 2
This exact number of bits is
- _ 1}]
+
1,
where the bracket function has again been used again to denote the largest integer.
B.
An Example
In this section we introduce a specific example of the generator for
m = 36,
which is chosen since it
evenly dividable by
q = 2,
3, 4, 6,
is sufficiently long, and yet 9, etc.
Thereby providing
101
+ E_ 0-c:
0
c\J4
0
CS-
+)
z
~
4-4
X
Ai
14 00
C~J
ar+ 'Aj
102 The choice of
significant flexibility.
q*
is
determined primarily so
that the operation (A-5) does not lead to integer overflows on the machine concerned.
Table A-l summarized the hardware requirements
for several combinations.
Table A-i. for m
q
=
No.
Partition Requirements 36 bit random numbers
m= Bits per Piece q
Bits/(positive) integer
1
36
72
2
18
37
3
12
26
4
9
20
6
6
15
9
4
11
12
3
10
18
2
8
36
1
6
of Pieces
The number of carry bits increases at such a rate that it is not very efficient to divide the generator into more than about six or nine parts, but the principle remains the same regardless. In order to initialize the generator it
is nece:sary to provide
q
pieces of a suitable seed (i.e., one that leads to a rel.iable sequence). Table A-2 lists the multiplier
a = 7357764655278
a - vhe initial
* Of course the individual pieces need not necessarily be the same
length but this choice makes the algorithm simplest to co,;i.
103 numbers
1°
a
(Sequence One and
10
3110375524218
ZSequence Two)
for several dJfferent dichotomizations,
Table A-2. No. of Pieces 2
3
4
Partition Examples
a(l0 for Sequence One) 7357768
24473410
10 for Sequence Two 3110378 = 102943 8 10 5524218 = 18561710
4655278
=
15855110
73578
=
382310
31108 = 160810
76468 = 400610
37558 = 202910
55278 = 290310
24218 = 129710
7358 = 47710
3118 = 20110
7768 = 51010
0378 = 03110
4658 = 30910
5528 = 36210
5278 = 34310
4218 = 27310
738 = 5910
318 = 2510
578 W 4710
108 = 08]0
768 = 6210
378 = 3110
468 = 3810
558 = 4510
55q = 45
24
9 27
10
2310
= 20 8
10
21= 1710
Table A-3 contains the first 250 octal numbers resulting from Sequence One (equivalent to the seed 1o = 1 vith one number discarded)
104 and Table A-4 contains the initial 250 octal numbers from Sequence Two. The statistics for these two sequences is given in the main part of this article.
The cycle length of the generator may be calculated theoreti-
cally as follows:
The next to last octal digit repeats every 8 steps,
the third from last every 64 steps, etc., so that the first digit repeats every 811= 2
8.59x10 9 .
In practice, however,
only be obtained for certain seeds,
thus a check must be made rt
guarantee that the cycle is sufficiently long. run on the two given starting numbers, greater than 107 for both.
the maximum cycle can
Such a test 'as been
confirming a cycle length of
The repeat characteristics for the initial
segment Qf each sequence is given in Table A-5.
105 Table A-3.
0/ 735776465527 546011324661 10/ 6050071C5407 124027314201 20/ 176553134667 767051140121 30/ 003422023547 717370070441 40/ 004604602027 051575575361 50/ 715232077707 120347726701 60/ 372342545167 337314554621 70/ 122324612047 001636347141 80/ 336670306327 605537556061 90/ 221123262207 316227251401 100/ 226525145467 317604301321 110/ 142603170347 135622135641 120/ 524644402627 467414246561 130/ 700055634507 132422504101 140/ 373255535767 106715136021 150/ 376370336647 776340234341 160/ 364024067127 651040447261 170/ 404344377007 750206646601 180/ 201227116267 403544102521 190/ 166317275147 470725643041 200/ 60662214342? 630171357761 210/ 333362331307 370540721301 220/ 312174466567 255106157221 230/ 272143023T!7 304377561541 240/ 254151607727 677064000461
Initial Sample
310473353621 742044723047 133551226141 037341537327 434502115061 101236233207 655044670401 646162236467 613250400321 671013001347 455020314641 152627333627 227000105561 116564305507 374157423101 775630326767 015736535021 510277647647 706711713341 633530520127 633535606261 741056550007 032153065601 375327407267 0422:3001521 365736306147 401342622041 641260274427 333470016761 252510202307 353204440301 613632457567 50fL332356221 406:301534447 726750040541 177751440727 303653737461 210374024607 711550723001 260414520067 470332042721 665204552747 213646567241 750417175227 445426370161 23P505037107 C5uC05115501 014030E50367 3(,6426637421 515002361247
ath-- Sequence One
167041756507 640355342101 350232117767 4743411340'ýl 430116160647 560164372341 716424151127 445053745261 515637721007 575260304601 045756700267 431002700521 355564317147 775360601041 707405425427 612307455761 570205053307 005511157301 266117450567 550337555221 465147245447 467421317541 327540271727 520464676461 110414375607 666544742001 445347211067 607664541721 277101763747 545643346241 525057526227 673720627161 223461110107 553760434501 371240741367 614376636421 252737272247 401663704741 492476352527 334470267661 711256012407 207031037201 510047461667 353573043121 176732370547 737217553441 440027567027 467110440361 414376104707 427513351701
473777560041 062221556427 743450114761 3*, 0724307 2•6C67630• 04Z2133441567 2761e5754221 15753I56447 265113'76541 45431612ý2727 1443166354E1 6663037466%'k 27110176100. 720430702067 307500240721 367006174747 777441125241 151007057227 775734066161 3306u4161107 205774753501 351100132367 113327635421 721203203247 527374763741 272007403527 763355026661 631244563407 442014656201 207274352667 721731342121 631546001547 023534132441 052132320027 674336477361 517640355707 777736470701 057370563167 653502156621 652411370047 714713611141 721112624327 254315660061 136662340207 450437213401 146437763467 231117103321 064610546347 606030577641 515343520527
712641373067 526574737721 305721405747 350037704241 757225410A27 652470325161 563076232107 353052272501 124366323367 042241634421 150756114247 046407042741 550107434527 241462565661 434708334407 326541475201 146450243667 34055064A121 17017041•5$A7 264051511441 543324051027 115505536361 015051626707 230422607701 ?12242154167 265316755621 642713501047 554424470141 627606055327 265216217061 463257311207 53435263240! 611617054467 364021202321 657002357347 125624756641 610550451627 443151407561 702116163507 314326565101 207731744767 710254537021 62!370025647 707667555341 336506436127 716164310261 156701226007 470403427601 030455625267 336136203521
Table A-4. Initial Sample Path - Sequence Two
0/ 311037552421 473364616247 10/ 542451520741 547717776527 20/ 675431003661 053565536407 30/ 502132453201 106377305667 40/ 473763357121 0236743111547 50/ 663364767441 416435613027 60/ 365060554361 505142230707 70/ 517054365701 547002416167 80/ 260526273621 356334603047 90/ 764170546141 317441017327 100/ 676556035061 760375113207 110/ 376025210401 621410516467 120/ 460605320321 253160661347 130/ 300601634641 050344613627 140/ 733056025561 302101165507 150/ 117001743101 042774606767 160/ 43577545F'21 731123527647 170/ 207035233341 573464000:27 180/ 567015526261 111351430007 190/ 550037405601 551211667267 200/ 732171721521 316040166147 210/ 257230142041 650231554427 220/ 323351736761 050561062307 230/ 275332760301 473032737567 240/ 15737b276221 676461414/147
222760501707 332563504701 731701007167 273546072621 234703714047 770664425141 610621250327 335621374061 632477064207 676663627401 333714607467 514612417321 124517472347 273261013641 431336544627 045242664561 473076636507 667257662101 07651L377767 114060054021 642462040647 103367712341 330477431127 577013665261 370652601007 615224624601 5r4761160267 45C423620521 361406177147 174326121041 712476705427 436651375761 652775733307 027717477301 472437730567 32106-475221 3130471254117 463330637541 510447551727 264630616461 223563255607 144415262001 554005471067 040711461721 75"657643747 47' 71466621! 2)405006227 4(`-C6654716: 1L,'(5770107 736L7!754501
553141172641 566217475627 410550523561 006643307507 341176601101 113267170767 320523453021 512727351647 377623371341 551702062127 154633024261 061222752007 542553043601 466057451267 411734517521 04616j210147 534025100041 341433036427 344472034761 544561604307 731145216301 053673721567 062330674221 223143636447 2201631165.1 633345402727 723142555461 130172626607 565032301001 056207162067 312205160721 747304054747 566572445241 602454337227 577562006161 605651041107 604567273501 567574412367 7115365554P1 007157063247 534070303741 106672663527 275404746661 102267443407 634651176201 0213506632667 511042262121 575777661547 246771452441 365033600027
6115504742267 113326416521 125147221147 135125057041 446056167427 143633473761 336714455307 160233735301 747756712567 062360073221 077747347447 2511163755zli 024232233727 107475514461 103451177607 632010320001 357537653067 222761657721 445737265747 314251224241 603012670227 444776245161 230663112107 505724612501 274202603367 670130554421 700451774247 526162362741 007112714527 402172505661 4b0445214407 253456015201 536002523667 315341561121 660142272547 660367031441 722345331027 731617456361 557372506707 460601127701 616112434167 2062116756e1 623743361047 506704010141 113445335327 502132137061 065156171207 221173152401 b0660533'.467 706216122321
421051453461 436576550607 002327337001 311217344067 732017356721 024401476747 717531003241 522640221227 703733504161 672044163107 421123131501 205237774367 655503553421 104253705247 1505554417111 677121715527 4552012zj4661 263271765407 123023634201 621225414667 051322060121 077113703547 004765410441 o0774606RO27 07236751536! 674272757707 220606246701 353333025167 323667474621 '7710741172047 523175667141 4706475663?7 465133476061 021542142207 251127571401 422633425467 443461221321 275231050347 754323455641 352241662627 553012166561 622652514507 671165024101 237302015767 44,727405602). 505474216647 5734035543411 131637347127 325640367261 732117257007
Table A-5. Repeat Characteristics of the Generator for Each Sequence Cycle

Step    Sequence One    Sequence Two
0       735776465527    311037552421
8^1     062221556427    113326416521
8^2     650107434527    655503553421
8^3     617512155527    514633562421
8^4     -------65527    -------52421
8^5     ------465527    ------552421
8^6     -----6465527    -----7552421

(At step 8^k the last k+1 octal digits have returned to their step-0 values; for 8^4 through 8^6 only these repeating digits were printed.)
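This trailing-digit pattern can also be checked mechanically. A short Python sketch (again an editorial illustration using the hypothetical multiplier introduced above, not the report's constant):

    MODULUS = 2 ** 36

    def trailing_repeats(seed, multiplier, max_power=4):
        """Print whether the last k+1 octal digits of x[8**k] match x[0]."""
        x0 = (multiplier * seed) % MODULUS
        x, step = x0, 0
        for k in range(1, max_power + 1):
            while step < 8 ** k:
                x = (multiplier * x) % MODULUS
                step += 1
            match = format(x, "012o")[-(k + 1):] == format(x0, "012o")[-(k + 1):]
            print(f"after 8**{k} steps the last {k + 1} octal digits repeat: {match}")

    trailing_repeats(seed=1, multiplier=2 ** 18 + 9)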
The final concern in describing the generator is to give examples of coding it. These examples, written in FORTRAN II-compatible code, assume no special hardware characteristics except that only the applicable number of pieces has been selected. It is further assumed that the subroutine remains core-resident so that all locally defined variables remain unchanged between calls. If the subroutine must be dynamically reloaded on call (either because of load-on-call restrictions or virtual core) then the locally defined variables must be stored globally in COMMON. We leave this modification to the user. Table A-6 contains coding examples for two-, three-, four-, and six-piece generators. No optimization has been done.
Table A-6. FORTRAN-II Coding Examples

Two-Piece Generator

      FUNCTION XRAND(NS,MODE)
C     TWO-PIECE GENERATOR, RECONSTRUCTED FROM A PARTLY ILLEGIBLE SCAN.
C     M1 AND M2 ARE THE LOW AND HIGH HALVES OF THE MULTIPLIER.
   10 M1=244734
      M2=158551
      N1=NS(1)
      N2=NS(2)
      T2=2.**(-36)
  100 DO 200 I=1,2
      GO TO (110,120),I
  110 K=M2*N2
      GO TO 190
  120 K=M1*N2+M2*N1+KD
  190 ...
  200 CONTINUE

[The remainder of this listing, and the three-, four-, and six-piece examples, are illegible in the present copy.]
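The idea behind the multi-piece coding can be seen in the following Python sketch (an editorial illustration; only the multiplier halves M1 and M2 are taken from the legible part of the listing, everything else is assumed). The 36-bit product is assembled from 18-bit halves so that no partial product exceeds the 36-bit word of the host machine:

    HALF = 2 ** 18           # 18-bit radix
    MOD = 2 ** 36            # generator modulus

    M1, M2 = 244734, 158551  # low and high multiplier halves (as legible)

    def two_piece_step(x):
        """One step x -> (a * x) mod 2**36 with a = M2*HALF + M1."""
        n1, n2 = x % HALF, x // HALF    # split the state into 18-bit halves
        low = M1 * n1                   # low partial product, < 2**36
        carry = low // HALF             # bits of `low` above the low half
        # Only the low 18 bits of the cross products survive mod 2**36:
        high = (M1 * n2 % HALF + M2 * n1 % HALF + carry) % HALF
        return high * HALF + low % HALF

    # A few steps from seed 1, printed as 12 octal digits as in Tables A-3/A-4.
    x = 1
    for _ in range(3):
        x = two_piece_step(x)
        print(format(x, "012o"))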
expressions is given by Baer and Bovet [2]. For a definition of maximal parallelism and a development of the associated theory, see Keller [9]. Now that a note of caution has been inserted we proceed to a discussion of highly structured parallel computations, where we use Bayes Law as an example.
Bayes Law may be expressed operationally in the form

    J_{n+1}(y) = C_{n+1}^{-1} ∫ T_n(y,x) J_n(x) dx,    (1)

where T_n(y,x) is a spatially varying kernel which is a function of the product of the probability densities of the noises and the new measurement z_n, and C_{n+1} is the normalizing constant (i.e., the integral of the right-hand side of (1) over all y).
It should be clear from inspecting (1) that there is implicitly a form of parallelism called for, in that for every y in J_{n+1} a d-dimensional convolution integral must be computed. The explicit form of parallelism will be a function of the nature of the (finite-dimensional) algorithm chosen to implement (1). The point-mass approach of Bucy and Senne [5] maps (1) into an equivalent matrix multiplication. The least-squares series expansions of Hecht [8], Alspach [1], and Center [6], for example, replace (1) by an equivalent update for expansion coefficients. The exact implications on the structure of the computer will of course be dependent upon the form of the algorithm.
To be specific, then, since many of the associated problems are similar, we will discuss the matrix multiply analogy to (1), where symbolically we write

    J'_{n+1}(y_i) = Σ_{j=1}^{M} T_n(y_i,x_j) J_n(x_j),    i = 1,...,M,    (2)

    C_{n+1} = Σ_{i=1}^{M} J'_{n+1}(y_i),    (3)

    J_{n+1}(y_i) = J'_{n+1}(y_i)/C_{n+1},    i = 1,...,M,    (4)

    x̂_{n+1|n} = Σ_{i=1}^{M} y_i J_{n+1}(y_i).    (5)
The overall structure of (2)-(5) (where we choose to discuss the conditional mean estimate as a specific example) illustrates both purely parallel ((2) and (4)) and essentially serial ((3) and (5)) computations. While (by analogy with the example in Figs. 1-2) it is obvious that some parallelism could be built into (3) and (5), it is also clear that a processor which is optimized to perform the totally parallel operations (2) and (4) would be mostly wasted on the scaling and estimation integrals of (3) and (5). Thus it is clear that even in this highly structured problem a combination of computer architectures would be necessary to optimize computational speed with minimum overhead (i.e., idle functional units or processors).
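To make the structure of (2)-(5) concrete, the following Python sketch (an editorial illustration; the kernel values, grid, and prior are placeholders, not the report's problem) carries out one point-mass update as a matrix-vector product followed by the serial scaling and estimation steps:

    import numpy as np

    # One point-mass Bayes-law update, eqs. (2)-(5).  T[i, j] plays the role
    # of T_n(y_i, x_j); J holds J_n at the M grid points; Y holds the y_i.

    def point_mass_update(T, J, Y):
        J_prime = T @ J        # eq. (2): fully parallel matrix multiply
        C = J_prime.sum()      # eq. (3): serial accumulation (scaling)
        J_new = J_prime / C    # eq. (4): parallel normalization
        x_hat = Y.T @ J_new    # eq. (5): serial conditional-mean estimate
        return J_new, x_hat

    # Tiny worked example on an M-point grid (placeholder data).
    M = 4
    rng = np.random.default_rng(0)
    T = rng.random((M, M))                         # placeholder kernel values
    J = np.full(M, 1.0 / M)                        # uniform prior on the grid
    Y = np.linspace(-1.0, 1.0, M).reshape(M, 1)    # 1-d grid points
    J_new, x_hat = point_mass_update(T, J, Y)
    print(J_new, x_hat)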
Ideally, then, we might envision a parallel computer which simultaneously evaluates the products T_n(y_i,x_j) J_n(x_j) and accumulates the sum (2) in J'_{n+1}(y_i) for i = 1,...,M, successively for j = 1,...,M. (If we had enough processors and temporary storage we could even imagine tree-structuring the latter computation in the form of Fig. 2.) Imagine, further, that we only had J'_n instead of J_n to use in (2), so that our evaluation of (2) is off by a factor of C_n, which we are evaluating simultaneously in an auxiliary serial machine, together with x̂(n|n-1), which we are also evaluating in a serial machine (modulo C_n) using J'_n. The resulting computation timing could take the form in Fig. 3.
Fig. 3. [Computation timing diagram; illegible in this copy.]
since the dynamics are linear. The results are shown in Table A-2, also taken from Bucy and Senne [4].
Table A-1. Monte Carlo Performance of the Optimal and Linearized Predictors

(For each sample n: the average error vector and average covariance matrix of the optimal nonlinear predictor and of the linearized predictor, and the theoretical covariance of the zero-state predictor.)

n = 1
  Optimal:     error (-0.094, 0.078);   covariance [0.359 0.230; 0.230 0.857]
  Linearized:  error (-0.371, 0.477);   covariance [0.645 -0.765; -0.765 2.728]
  Zero-state:  theoretical covariance [0.350 0.050; 0.050 1.100]
n = 2
  Optimal:     error (-0.008, 0.042);   covariance [0.146 0.068; 0.068 0.599]
  Linearized:  error (-0.277, 0.081);   covariance [0.401 -0.183; -0.183 1.255]
  Zero-state:  theoretical covariance [0.188 0.075; 0.075 1.200]
n = 3
  Optimal:     error (-0.004, 0.026);   covariance [0.131 0.051; 0.051 0.354]
  Linearized:  error (-0.158, -0.041);  covariance [0.222 0.012; 0.012 0.623]
  Zero-state:  theoretical covariance [0.147 0.088; 0.088 1.300]
n = 4
  Optimal:     error (0.007, 0.003);    covariance [0.127 0.088; 0.088 0.351]
  Linearized:  error (-0.081, -0.084);  covariance [0.158 0.074; 0.074 0.449]
  Zero-state:  theoretical covariance [0.137 0.094; 0.094 1.400]
n = 5
  Optimal:     error (0.015, -0.022);   covariance [0.129 0.069; 0.069 0.342]
  Linearized:  error (-0.069, -0.106);  covariance [0.155 0.113; 0.113 0.723]
  Zero-state:  theoretical covariance [0.134 0.097; 0.097 1.500]
n = 6
  Optimal:     error (0.038, 0.025);    covariance [0.177 0.099; 0.099 0.369]
  Linearized:  error (-0.128, -0.315);  covariance [0.255 0.589; 0.589 3.813]
  Zero-state:  theoretical covariance [0.134 0.098; 0.098 1.600]
n = 7
  Optimal:     error (0.075, 0.065);    covariance [0.217 0.140; 0.140 0.416]
  Linearized:  error (-0.188, -0.362);  covariance [0.291 0.824; 0.824 6.439]
  Zero-state:  theoretical covariance [0.133 0.099; 0.099 1.700]
n = 8
  Optimal:     error (0.063, 0.093);    covariance [0.159 0.112; 0.112 0.455]
  Linearized:  error (-0.186, -0.316);  covariance [0.223 0.549; 0.549 5.890]
  Zero-state:  theoretical covariance [0.133 0.100; 0.100 1.800]
n = 9
  Optimal:     error (0.007, 0.043);    covariance [0.130 0.076; 0.076 0.364]
  Linearized:  error (-0.141, -0.321);  covariance [0.164 0.262; 0.262 4.950]
  Zero-state:  theoretical covariance [0.133 0.100; 0.100 1.900]
n = 10
  Optimal:     error (0.026, 0.036);    covariance [0.136 0.081; 0.081 0.326]
  Linearized:  error (-0.146, -0.428);  covariance [0.164 0.148; 0.148 3.978]
  Zero-state:  theoretical covariance [0.133 0.100; 0.100 2.000]
Table A-2. Monte Carlo Performance of the Optimal and Linearized Filters

n     Optimal nonlinear filter         Linearized filter
      (average covariance)             (average covariance)
1     [1.038 0.361; 0.361 0.757]       [2.180 -1.630; -1.630 2.628]
2     [0.184 0.036; 0.036 0.499]       [1.206 -0.465; -0.465 1.155]
3     [0.123 0.003; 0.003 0.254]       [0.487 -0.077; -0.077 0.523]
4     [0.108 0.075; 0.075 0.251]       [0.232 0.049; 0.049 0.349]
5     [0.117 0.039; 0.039 0.242]       [0.221 0.126; 0.126 0.623]
6     [0.309 0.097; 0.097 0.269]       [0.620 1.078; 1.078 3.712]
7     [0.469 0.180; 0.180 0.316]       [0.764 1.547; 1.547 6.339]
8     [0.234 0.124; 0.124 0.355]       [0.491 0.999; 0.999 5.790]
9     [0.119 0.053; 0.053 0.264]       [0.257 0.425; 0.425 4.850]
10    [0.142 0.063; 0.063 0.226]       [0.256 0.196; 0.196 3.878]
In reviewing these experimental results it must be pointed out that the confidence analysis of Bucy and Senne [4] is in error. The one-sigma gaussian confidence level is (2σ^4/N)^{1/2}, and not (3σ^4/N)^{1/2}, as given in the original paper (see Chapter IV). Using the results of Chapter IV we will now assess the accuracy of the above test results. First we convert the diagonal terms of the sampled covariances into their three-standard-deviation confidence limits, taken from Chapter IV. The results are given in Table A-3 for the predictor covariances of Table A-1. The nonlinear predictor results were based on N=500 Monte Carlos and the linearized predictor was run for N=2000.
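A short Python sketch of this conversion (an editorial reconstruction: the interval form s²/(1 ± 3·(2/N)^{1/2}) is inferred from the one-sigma level above, and reproduces the endpoints printed in Table A-3 to the stated precision):

    from math import sqrt

    # Convert a sampled variance s2 (a diagonal covariance term) into its
    # three-standard-deviation confidence limits on the true variance,
    # given N Monte Carlo runs.

    def three_sigma_limits(s2, n_runs):
        half_width = 3.0 * sqrt(2.0 / n_runs)   # three one-sigma levels
        return s2 / (1.0 + half_width), s2 / (1.0 - half_width)

    # Example: first coordinate of the optimal predictor at n = 1
    # (Table A-1), based on N = 500 Monte Carlo runs.
    lo, hi = three_sigma_limits(0.359, 500)
    print(f"{lo:.3f} -> {hi:.3f}")   # 0.302 -> 0.443, as in Table A-3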
Table A-3. Monte Carlo Confidence Intervals for Predictors

      Optimal Nonlinear Predictor          Linearized Predictor
n     First Coord.     Second Coord.       First Coord.     Second Coord.
1     0.302 -> 0.443   0.720 -> 1.058      0.589 -> 0.713   2.492 -> 3.014
2     0.123 -> 0.180   0.503 -> 0.739      0.366 -> 0.443   1.146 -> 1.387
3     0.110 -> 0.162   0.298 -> 0.437      0.203 -> 0.245   0.569 -> 0.688
4     0.107 -> 0.157   0.295 -> 0.433      0.144 -> 0.175   0.410 -> 0.496
5     0.108 -> 0.159   0.287 -> 0.422      0.142 -> 0.171   0.660 -> 0.799
6     0.149 -> 0.218   0.310 -> 0.455      0.233 -> 0.282   3.483 -> 4.213
7     0.182 -> 0.268   0.350 -> 0.513      0.266 -> 0.322   5.881 -> 7.114
8     0.137 -> 0.196   0.382 -> 0.562      0.204 -> 0.246   5.380 -> 6.507
9     0.109 -> 0.160   0.306 -> 0.449      0.150 -> 0.181   4.521 -> 5.469
10    0.114 -> 0.168   0.274 -> 0.402      0.150 -> 0.181   3.633 -> 4.395
Almost immediately we ascertain from comparing Table A-3 with Table A-1 that the "optimal" estimate cannot possibly be optimal for iterations 6, 7, and 8, since the entire 3σ confidence band lies above the zero-state result for those samples. In fact we observe a periodicity in the errors for both estimators with period approximately equal to the rotational period of the sensor (between 6 and 7 samples; at the sensor's rate of 1 radian per sample the period is 2π ≈ 6.3 samples). We could also have made a study of the zero-bias reliability of the estimates. But, owing to the dubious value of the data, we choose to proceed on to the more recent experiments.
Appendix B. More Recent Experimental Results: Point Masses versus Gaussian Sums

The early experiments reported in the previous appendix raised a number of questions which demanded more tests to resolve. What made the errors cyclic modulo 2π? Why was the linearized filter unstable?
Alspach and Sorenson [2] revealed another "linearized" filter which did not have instabilities. Thus it was imperative that we closely compare our results with theirs in order to explain this anomalous discrepancy. Using an estimate comparison test of the two linearized filters it was discovered that Alspach and Sorenson had modified the estimate update equation to the form

    x̂(n+1|n+1) = x̂(n+1|n) + A(n+1) {[z(n+1) - h(x̂(n+1|n)) + π] mod 2π - π}    (B-1)

instead of the original form (7).
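A minimal Python sketch of the modular bearing difference used in (B-1) (an editorial illustration; the function name is invented, and the gain A(n+1) and measurement function h are whatever the filter supplies):

    from math import pi

    # Wrap a bearing innovation z - h(xhat) into [-pi, pi), as in (B-1),
    # so a small angular error never shows up as a jump of nearly 2*pi.

    def wrap_innovation(z, predicted_bearing):
        d = z - predicted_bearing
        return (d + pi) % (2.0 * pi) - pi    # [d + pi] mod 2pi, minus pi

    # Example: true bearing just above -pi, prediction just below +pi.
    print(wrap_innovation(-3.1, 3.1))   # ~0.083 rather than -6.2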
We may intuitively rationalize the success of the modified measurement scheme by examining Fig. B-1. Suppose the working range of the sensor is [-π, π), with zero referring to the first coordinate axis. The figure shows two typical situations which might arise if the target is allowed to be inside the sensor orbit, as is the likely case if J₀ ~ N(0, I). Case 1 represents the situation for θ(n+1) ≈ 0. In this case h(x₁), the true bearing, is positive, while the bearing to the estimated position x̂₁ is negative. The result is a difference in bearing greater than π. On the other hand, if both the estimate and the target remain inside the sensor orbit, then some nonzero θ(n+1) (such as given in Case 2 of the figure) would result in a bearing and an estimated bearing of the same sign, so that the difference is always less than π. Thus the figure accounts for the modularity of the error performance discovered in the early experiments.

Fig. B-1. Typical Geometry of the "Old Problem" Illustrates Periodicity of Errors

Now the filter update equation (7) contains the term z - h(x̂), which in turn may be expressed as h(x) - h(x̂) + v, where the additive noise v is gaussian and may take on all values. The difference h(x) - h(x̂) has an effective range from -π to π, so that even if the noise magnitude is usually small compared with the magnitude of h(x) - h(x̂), it would be expected that the modulation of the difference z - h(x̂) as in (B-1) would lead to improved performance. The "fixed" linearized performance is in fact dramatically improved, as illustrated in Table B-1, which shows the Monte Carlo results for the "old problem."
Table B-1. Monte Carlo Averaged Sum-Squared Error Performance for Predictors - Old Problem

Sample   Mean-State   Iterated     "Fixed"      Gaussian Sum   Floating Grid
         Predictor    Linearized   Linearized   Predictor      Predictor
1        1.250        4.334        1.468        0.757          0.866
2        1.388        1.678        0.974        0.596          0.674
3        1.447        0.651        0.562        0.457          0.631
4        1.537        0.535        0.550        0.473          0.408
5        1.634        0.828        0.617        0.518          0.403
6        1.734        3.187        0.169        0.531          0.464
7        1.833        4.927        0.540        0.471          0.674
8        1.933        4.912        0.618        0.553          0.454
9        2.033        4.248        0.564        0.578          0.377
10       2.133        3.763        0.619        0.641          0.443
         Stable       Stable       Stable       Unstable       Unstable
What is shown in the table is the trace of the sampled covariance matrices for the Mean-State (ideal predictor), the Iterated-Linearized, the Fixed-Linearized, the Gaussian-Sum, and the Floating-Grid predictors. Table B-1 is taken from Senne [8], where it was printed incorrectly due to an editing error. The fixed-linearized predictor is seen in the table to be stable and, based on only 100 Monte Carlos, essentially equivalent to both the Gaussian-Sum and Floating-Grid predictors. It turns out, in fact, that the "old problem," as depicted in Fig. B-1, is far easier than originally intended. The sensor is given considerable new information with each sample if it orbits around the target at the high rate of 1 radian per sample. Thus it is not surprising to find that it is relatively easy to design an approximate estimator which performs very close to optimum. The straightforward linearized predictor performs miserably, it happens, as discussed above. We can conclude, however, that the old problem is not sufficiently difficult to demonstrate the differences in performance between the various estimators.

We return thus to the originally intended geometry, illustrated in Fig. B-2, where the initial density J₀ ~ N([3], 1), R = 0.01, and F = I, so that the target is undergoing pure random walk in both dimensions. The Monte Carlo performance summary for the "new problem" is given in Table B-2, where we may discern that the linearized predictor (in this case the "fix" is unnecessary) converges more slowly to steady state than the optimal estimates, but the accuracy of the Monte Carlo (100 sample paths) still precludes discrimination for steady-state operation. Thus we have come to the conclusion that the passive receiver with gaussian noises is very linearizable, in that the conditional densities are very nearly gaussian, at least in range-bearing coordinates.
Fig. B-2. "New Problem" Geometry without Periodic Errors
Table B-2. Monte Carlo Averaged Sum-Squared Error Performance for Predictors - New Problem

Sample   Mean-State   Linearized   Gaussian Sum   Floating Grid
         Predictor    Predictor    Predictor      Predictor
1        2.200        1.665        11.436*        0.955
2        2.400        1.928        1.723          1.020
3        2.600        2.187        1.495          1.006
4        2.800        2.229        1.321          1.083
5        3.000        2.190        1.208          1.118
6        3.200        2.145        1.45$          1.359
7        3.400        1.830        1.625          1.521
8        3.600        2.105        1.437          1.370
9        3.800        2.136        1.423          1.132
10       4.000        2.349        1.286          1.316
11       4.200        2.370        1.431          1.493
12       4.400        2.114        1.533          1.515
13       4.600        1.883        1.475          1.345
14       4.800        1.752        1.398          1.268
15       5.000        1.752        1.553          1.441
16       5.200        1.850        1.617          1.534
17       5.400        1.627        1.275          1.222
18       5.600        1.561        1.271          1.238
19       5.800        1.620        1.630          1.314
20       6.000        1.688        1.803          1.556
We note in passing that the first-sample estimate of the Gaussian sum predictor suffers from a transient at about Monte Carlo number 50, so that the number in the table is inaccurate. The filter recovers stability, however, so that this number may be ignored.
Appendix C. A Movie of Conditional Densities

In the previous two appendices the relative success of the linearized-type predictor can be related to the validity of the Gaussian approximation to the conditional density functions. The simplest way to destroy the validity of the Gaussian assumption is to provide a nongaussian initial density function. Consider, for example, the detector geometry illustrated in Fig. C-1. If there were a sequence of known reflecting ionospheric layers above the aircraft observer and we were given an a priori distribution on the transmitter's power, then it is conceivable that we might want to integrate over all elevations to maximize detectability, thus introducing a multimodal range ambiguity as shown in the figure. Probabilistically, the initial condition would be obtained by taking the product of the multimodal range ambiguity density with the bearing ambiguity density. The result might look as in Fig. C-2, where no particular scale is intended.
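As an illustration of how such a nongaussian initial condition can be formed, the following Python sketch (an editorial illustration; the grid, the mode locations, and the widths are invented, since the report's actual layer geometry is not recoverable here) builds a density on a range-bearing grid as the product of a multimodal range density and a unimodal bearing density:

    import numpy as np

    # Build a multimodal initial density J0 on a (range, bearing) grid as
    # the product of a range-ambiguity density (one mode per hypothetical
    # reflecting layer) and a bearing-ambiguity density.

    def gauss(u, mean, sigma):
        return np.exp(-0.5 * ((u - mean) / sigma) ** 2)

    r = np.linspace(0.5, 8.0, 200)          # range grid (illustrative)
    b = np.linspace(-np.pi, np.pi, 180)     # bearing grid

    # Four range modes, echoing the four-Gaussian initial density below.
    p_range = sum(gauss(r, m, 0.3) for m in (2.0, 3.5, 5.0, 6.5))
    p_bearing = gauss(b, 0.0, 0.5)          # bearing ambiguity about zero

    J0 = np.outer(p_range, p_bearing)       # product density on the grid
    J0 /= J0.sum()                          # normalize to a discrete density
    print(J0.shape, J0.sum())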
As soon as the aircraft's detector circuits obtain a reliable detection, the aircraft banks left into a circle of unit radius and activates a high-sensitivity receiver which is tuned to one elevation.

In order to study the evolution of the conditional densities a movie was made by choosing the parameters

    F = [ 1.001   0.025 ]
        [ 0.025   1.000 ],     R = 0.01,     Q = 0.01,

and J₀ was chosen as the sum of four Gaussian densities with one cross range