An Engineer's Guide to Building Nonlinear Filters

0 downloads 0 Views 11MB Size Report
4. 0a. CONTRACT OR GRANT NO ga. ORIGINATOR'S REPORT NUMBER(S) ..... severe (i.e. they don't have arbitrarily large derivatives), and ... tation is taken principally from the dissertation of Hecht [14]. Chapter ..... 12. pZ ( AJ=i) corresponds to the distribution P(4 412io-1) which corresponds to ...... radian, tests were me.
FRANK J. SElLE* IN11SE1ARC14 --LA**WAT0kY"

-SRL-TR-72-0004

DMY1V

-AIC

ANENGI NEER'S GUIDE TO BUILDING NONLINEAR FILTERS

VOLUME I Richard S ESucy Calvin Hecht Capt Kenneth DSenne

APPROVtO

ROJET 794ODSTRISIM~ON

FOR

PUBLIC

UNLIMITEO.

AIR FORCE SYSTEMS COMMAND UNITED~ STATES AIR rFORCE

RELEA SE;

UNCLASSIFIED Securit

Classification

DOCUMENT CONTROL DATA-

R &D

(Security classification of title, body of abstract and indexing annotationmust be entered when the overall report is claelled• I ORIGINATINC AC I VITY (Corporata author)

I2r.

Frank J. Seiler Research Laboratory (AFSC) USAF Academy, Colorado 80840

REPORT SECURITY CLASSIFICATION

UNCLASSTFIED 2"

GROUP

3. REPORT TITLE

An Engineer's Guide to Building Nonlinear Filters

4 DESCRIPTIVE NOTES (Type of report and Inclusive dates)

Final Project Report (1969-1972) 5. AUTHORIS) (Firet name, middle initial, last name)

Richard S. Bucy, Professor, U.S.C. Calvin Hecht, T.R.W. Systems -Kenneth D. Senne, Captain, USAF 6. REPORT DATE

'a.

TOTAL NO.

May 1972 0a.

CONTRACT OR GRANT NO

OF PAGES

7b. NO. OF REFS

5

4

ga. ORIGINATOR'S REPORT NUMBER(S)

SRL-TR-72-0004 b. PROJECT NO.

7904-00-17 C.

DRS 61 102F d. 10

9b. OTHER REPOR'

NO(•) (Any other numbers that may be assi•ned

this report)

BPAC 681307

AD-

OISTRIBUTION STATEMENT

Approved for public release; distribution unlimited. I1

SUPPLEMENTARY NOTES

12. SPONSORING MILITARY ACTIVITY

Frank J. Seiler Resear.ch.Laboratory (AFSC USAF Academy, Colorado 9O840 " 13,

AD!TRACT

A comprehensive treatment of numerical approaches to the solution of Bayes Law has been included, describing numerical methods, computationa] algorithms, two example problems, and extensive numerical results. Bayes Law is an lintegral equation describing the evolution of the conditional probability distribution, describing the state of a Markov process, conditioned on the pp;t noisy observations. The Bayes Law is, in fact, the general solution to the discretL. nonlinea: estimation problem. This research represents one of the first successful attempts to approximat the conditional probability densities numerically and evaluate the Bayes integral by quadratures. Tht methods of density representation studied most thoroughly include Orthogonal Polynomials, Point-Masses, Gaussian Sums, and Fourier Series. For example problems two secpnd-order systems have been studied. The first problem involves a passive (bearings-only) receiver with geometry similar to the AWACS. The second example involves the reconsLruction of a second order phaseprocess which is the message process for a phase-modulated communication system. The various forms of the nonlinear estimates are compared with the phase-locked loop de.iodulator and extensive Monte Carlo simulations are described to provide high confidence numerical comparisons. A chapter is devoted to elaborating on the Monte Carlo methods employed for the computer simulations and a general-purpose, high-quality random number generator is introduced which is exactly realizable on any binary computer for comparisons of the digital and hybrid computer architectures to the Bayes-Law algorithms.

DD I

NOV

.1473

.1-,

UNCLASSIFIED Securitv ClassiIIcation

UNCLASSIFIED Security Classification LINK A

14.

LINK 8

KEY WORDS

LINK C

-

WT

ROLE

WT

ROLE

ROLE

WT

Markov Processes Nonlinear estimation Filtering Prediction Time series Simulation Bayes Law Monte Carlo nethods Orthogonal Series Quadratures Numerical Curve Fitting Random Number Generators Parallel Processing Hermite Polynomials Gaussian Sums

Point Masses Passive Tracking Bearings-only Receivers AWACS Phase Modulation Demodulation Phase-Locked Loops

I

I

I

_

i

_ _i

I

UNCLASSIFIED Security Classification

.

An Engineer's Guide co Building Nonlinear Filters

by

Richard S.

Calvin Hecht,

Bucy,

U., 0.

T.R.W. Syotems

Kenneth D. Senne, Capt.,

U.S.A.F.*

Final Report Frank J. Seiler Research Laboratory Project 7904-00-37 May 1972

* After June 1972:

Staff Member, M.I.T. Lincoln Laboratory

ii

Publication Notice

The authors are submitting portions of the enclosed material to the IEE9 Transactions on Automatic Control, while other portions will be published in Stochastics.

ii

Foreword Ever since the publication of the papers by ',alman and Bucy in 1960 and 1961, the engineering community has made giant strides in applying their ideas concerning linear estimation to an essentially uncountable number of space and control problems.

The widespread and

almost instantaneous acceptance of the Kalman-Bucy filter was due in part to a number of factors:

the then common frequency domain

synthesis procedures were limited to time-invariant (steady-state) problems, the advent of the digital computer led to easy simulation of time functions, whereas frequency calculations were indirect and A

less convenient than "state-space",

and most important, the newly

accelerated man-in-space program led to an immediate real-time application, the most significant caLalyst for engineering developments. The fact that most applications were not linear systems did not diminish the enthusiasm for the new linear theory.

Almost over night,

under the pressures of ever-present engineering deadlines,

a host of

approximate approaches to the nonlinear problems were generated with many variations and extensions, but most of which were patterned to look much like the highly successful linear estimator of Kalman and Bucy. In some applications these approximations produced highly saLisfactory results, but in most situations a second design phase, less publicized but equally important, ano.molies, estimators.

was carried out in order to eliminate the

instabilities and -incxplained peculiarities of t*-.= A seemingly endless stream of technical papers dealing

with adaptive techniques and examples of clever but non-generalizable

iii

"engineering fixes" for the multitude of undesirable characteristics of the approximations have appeared during the lest 8-10 years,

resulting

in an apparently bottomless bag of engineering tricks for the estimation engineer.

In addition, the technical access to the original articles

was evidently unsatisfactory since, for a variety of reasons, many authors have devoted tutorial articles explaining how the same equations for estimation can be derived from a ,ariety of points of view, including Bayes rule (the original concept), likelihood and a host of others,

least squares, maximum

The results would fill

many vo.umes

of "handbooks" for the engineer while neither answering nor posing the questions - where do we go from here?

or - do we reall

understand

nonlinear estimation? On the other hand, it

is fair to say that nonlinear estimation

theory, as advanced and generalized as it has become, pub.-

has not been

'zed and advertised as a general panacea for the estimation

problems of the future.

A small but growing contingent of university

researchers has been working steadily on problems involving mathematically subtile concepts such as stochastic integrals, diffusion processes, etc.,

Ito calculus,

and a fe, elegant and mathematically

sophisticated theorems have been proved concerning general representations for nonliner estimators.

But the very difficulties which

frustrated the application of these "solutions" is their nature: estimates are determined to be related to the numerical solution of infinite-dimensional partial differential equo'tionis - a formidable task in the simplest of problems,

and thus, understandably,

the paths

of estimation application and nonlinear estimation theory have continued to diverge.

On the rare occasion that representatives of the two

iv

factions have met much needless antagonism has resulted.

While it

is

true that the questions most frequently asked by the applications engineers rarely deal with the optimality of estimates but merely with the computational tractability, and vice versa for the theorists, it appears that a partial reconciliation on both sides seems likely to provide mutual enrichment, Specifically, what is proposed here is that all interested parties should stand back for a moment and reflect on the question - what, exactly, are the characteristics of optimal, ;onlinear estimators?

The

answer to this and related questions provides the motivation for the work described in this paper.

We ask thac the applir,^.tions engineer

temporarily suspend his seemingly uLattainable computational constraints and that the estimation theorist pause briefly from his esoteric pursuits involving the subtleties of measure theory and look for awhile at some examples of optimal, discrete-time, nonlinear estimators. it

turns out, the discrete-time problem,

although far from trivial, is

at leasc tractable for solution on modern digital computers.) way, it

(As

In this

is expected that the benefits to both factions will be

considerable,

and perhaps their steady divergence may be curtailed.

In effect, it

is hoped that the engineer engaged in the daily

process of fitting old tricks into new computers will begin to ask whether or not the basic principle ot the trick is applicable to the specific application at hand - perhaps a simple modification or another approach would be much more satisfactory.

Such realiz.tions are

generally obtained at the expense of serious time delays and costly experimental failures - perhaps added experience with optimal nonlinear estimators could provide more effective and less expensive guidelines.

V

=.;.

In addition, it

is hoped that the theoretician can appreciate the

reality of concrete exampJes and thereby refuel his fire regarding theorems concerning the characteristics and asymptotic descriptions of optimal nonlinear estimates.

Finally, it

is intended that certain

applications of nonlinear estimators be considered for which the optimal solution can itself be considered practical, if not via present day realizations,

then perhaps by special purpose hardware designed

with optimal estimation in mind.

This paper has been written to

instill some enthusiasm in the reader for all of these expectations.

vi

Acknowledgements

The authors are deeply indebted to the generous and enthusiastic support of the U.S. Administration. the Frank J.

Air Force and to the National Aeronautics and Space

The research described herein has been performed at

Seiler Research Laboratory, Air Force Systems Command,

at

the University of Southern California Department of Aerospace Engineering (Air Force Office of Scientific Research Grant AFOSR-71-2141), conjunction with Electrac Corporation (NASA Grant NAS5-10789,

and in 1970).

Although many persons must be acknowledged regarding this research and the associated e):periments, Colonel Bernard S. Morgan,

perhaps the two most important figures are Jr., USAF,

and Major Allen D. Dayton, USAF,

who had the insight and intuition necessary to introduce the authors of this paper to each other in the first place. fortunate,

It would be very

indeed, if such technical interchanges and mutual cooperation

were to -ontinue to be encouraged among government sponsored and univer.iity researchers.

vii

Table of Contents

Publication Notice

ii

Foreward

iii

Acknowledgements

vii

Table of Contents

ix

Table of Figures

xiii

Table of Tables

I.

I

III.

xvi

l.nttroduction

1

References

6

Bayesian Estimation:

The Problem

9

References

21

Finite Dimensional Approximations

23

A.

Introduction

23

B.

Orthogonal Series

24

1. Least Square Polynomial Approximation, Scalar Case

25

2.

Least Square Polynomial Approximation, Multi-dimensional Case

31

3.

Gauss-HermitL Integration

34

4.

Application of Polynomial Expansions to a Two-

5.

Dimensional Filtering Problem

39

Applying the Hermii:e Expansion

50

Preceding page blank ix

Page

IV.

C.

The Point-Mass Approximation

64

D.

Non-Orthogonal Series - Gaussian Sums

68

E.

Other Computational Methods

74

1. Fourier Sezies Expansions

74

2.

75

Spline Functions

References

77

Monte Carlo Analysis Techniques

79

A.

Introduction

79

B.

Statistical Analysis

80

C.

An Example

85

D.

Conclusions

94

References Appendix.

V.

96 A Machine Independent Random Number Generator

A.

Introduction

B.

An Example

97 100

Parallel Computational Techniques

115

A.

Introduction

115

B.

Parallelism and Bayes Law

115

C.

Look-Ahead Processors

122

D.

Array Processors

124

E.

Associative Processo"s

125

F.

Pipe-Line Processor

125

G.

Hybrid Computer Methods

127

Refe.ences VI.

97

A Passive Receiver:

A.

129 Bearings-Only Tracking (AWACS)

131

131

Introduction

x

136

3B. The Linearized Estimator

C.

137

Application of Nonlinear Filtering

140

References Appendix A.

First Monte Carlo Experiments - Point Masses 141

Versus Linearized Appendix B.

More Recent Experimental Results:

Point

Masses Versus Gaussian Sums

Appendix C. A Movie of Conditional Densities VII.

Example:

Optimal

onlinear Phase Demodulation

155 183

A.

Introduction

183

B.

The Linearized Filzer

184

C. Application of Nonlinear Filtering

198

Referenced

205

Appendix A.

Numerical Experiments with the Phase207

Locked Loop Appendix B.

Numerical Experiments with The Hermite

213

Polynomial Expansion Appendix C.

Cyclic Point-Mass Experiments

Appendik D. A Fourier Series Experiment Appendix E. VIII.

147

A Movie of Conditional Densities

227 243 249 267

Conclusions

Bibliography

271

Resumes of the Authors

275

Richard S. Bucy

276

Calvin He~ht

21

Kenneth D. Senne

282

xi

//

Additional Appendix A.

A Two-Dimensional Point-Mass Program

for the Passive Receiver Problem Additional Appendix B.

A Gaussian-Sum Program for the Passive 355

Receiver Problem

Additional Appendix C.

A Gauss-Hermite Program for Implementing

the Two-Dimensional Phase Demodulator Additional Appendix D.

419

A Fourier Series Implementation of the

Cyclic Phase Demodulator Additional Appendix F.

379

A Point-Mass Program for Implementing

the Interpolating Version of the Cyclic Phase Demodulator Additional.Appendix E.

285

445

Some Unpublished Conference Papers

Referenced by this Report

465

R.S. Bucy, "Realization of Non-Linear Filters"

467

R.S.

M.J. Merri't, and D.S. Miller, "Hybrid

Bucy,

Computer Synthesis of Optimal Dis:rete Nonlinear 475

Filters" C. Hecht,

"Digital Realization of Non-Linear Filters"

K.D. Senne,



2i

"Computer Experiments with Nonlinear Filters"

505 513

Table of Figures

Page Chapter III Fig. 1.

Hermite Polynomial Bayes-Law Recursion

41

Fig. 2.

Coordinate Systems for Hermite Expansion

58

Chapter IV Fig. 1.

Example of a Questionable Monte Carlo Cumulative Average Sample Path

Fig. 2.

84

Probability Density and Distribution of the Asymptotic (N-.o)

Kolmogorov Statistic

91

Algorithm for Partitioned Uniform Generator

101

Fig. 1.

Serl -i Ev&" ýtion of a+b+c+d+e+f

117

Fig. 2.

Maximally Parallel Evaluation uf a+b+c4d+e+f

118

Fig. 3.

Combining Ser.'al and Parallel Computations

121

Fig. 1.

Typical Passive Receiver Geometry

133

Fig. B-1.

Typical Geometry of the "Old Problem" -

Fig. A-1. Chapter V

Chapter VI

Illustrates Periodicity of Errors

149

Fig. B-2.

"New Problem" Geometry without Periodic Errors

151

Fig. C-i.

Detection Geometry in the Presence of Multipath

156

Reception Fig. C-2.

A Priori Density Resulting from Multipath Detection Aimbiguicy

xiii

158

Fig.

C-'3.

A Typical Sample Path Resulting from the Multipath Detection Ambiguity

Fig.

C-4.

159

Absolute Error Performance of Optimal and Linearized Predictors for Multimodal Problem

181

Block Diagram of Linearized Phase Estimation

189

Fig. 2. SFig. 3.

Discrete P Error Variance Discrete P122 Error Variance

195 19

Fig.

4.

Discrete P12 Error Variance

197

Fig.

5.

Torus Interpretation of Doubly Cyclic State Space

201

Fig.

A-1.

MSE Performance Summary

209

Chapter VII Fig.

1.

Fig. A-2.

Fourth Moment Divided by Three Times the Squared Variance for the Phase-Locked Loop Error

211

Fig. B-1.

Hermite Expansion Error Summary

-

P(o) = 4P(-)

218

Fig. B-2.

Cumulative Statistical Variance

-

P(o) = 4P(-)

219

Fig. B-3.

Portion of Sample Function No. P

Fig. B-4.

1

(o)

Fig.

B-5.

B-6.

220 6 -

= 0.3025

221

Error Variance for P at P 1(o) = 4P

Fig.

-

= 0.3025

Error for Sample Function No.

P 1 (o)

6

(o)

-5.2 dB Starting

(o)

223

Cumulative Statistical Variance for P 1(o) = -5.2 dB

224

Fig. C-1.

Nonlinear Filter Summary (Enlarged)

231

Fig. C-2.

MSE Improvement of Nonlinear Filters over P.ase-Locked Loop

xiv

235

Page Fig. C-3.

MSE Difference from Ideal Linear Analysis

Fig. E-1.

A Typical Sample Path of Densities Evolving

237

250

in Time

xv

Table of Tables

Page Chapter II Table 1.

Model for Bayes Rule Conditional Density 18

Recursion Formula Table 2.

Conditional Density Recursion formulae for Bayesian Estimation

19

Outline of a Gaussian-Sum Recursion Procedure

70

Chapter III Table 1. Chapter IV Table 1.

1

The Normalized Standard Deviation 2

as a

function of P and N

82

Table 2.

Monte Carlo Moments of Gaussian Generator

88

Table 3.

Testing the Hypothesis of Gaussian Moments

88

Table 4.

Pr(KN < A) from Massey

90

Table 5.

Results of Kolmogorov Test

92

Table 6.

Sampled Correlation Function

93

Table 7.

Uncorrelatedness Test

94

Table A-i.

Partition Requirements for m=36 bit random numbers

102

Table A-2.

Partition Examples

103

Table A-3.

initial Sample Path - Sequence One

105

Table A-4.

Initial Sample Path - Sequence Two

106

xvi

Page Table A-5.

Repeat Characteristics of the Generator for 107

each Sequence Table A-6.

Table A-7.

FORTRAN-II Coding Examples Two-Piece Generator

108

Three-Piece Generator

109

Four-Piece Geiterator

110

Six-Piece Generator

ll

FORTRAN-II Coding of Gaussian Generator

113

Chapter VI Table A-i.

Monte-Carlo Performance of the Optimal and Linearized Predictors

Table A-2.

142

Monte-Carlo Performance of Optimal and Linearized Filters

143

Table A-3.

Monte-Carlo Confidence Intervals for Predictors

144

Tible B-i.

Monte-Carlo

Average Sum Squared Error

Performance for Predictors - Old Problem Table B-2.

149

Monte-Carlo Averaged Sum-Squared Error Performance for Predictors - New Problem

132

Chapter VII Table 1.

Summary of Continuous Linearized Kalman-Bucy 185

Filter Table 2.

a.'*Ie A-i. "ta*le B-i.

Summary of Discrete Linearized Kalman-Bucy Filter

191

Confidence Intervals for the Linearized Filter

212

Numerical Valuet3 for the Computer Simulation

216

xvii

Page Table C-i.

Monte Carlo Mod 2n Error Performance Data for the Cyclic Point-Mass Estimates

Table C-2.

Monte Carlo Imprcvements - Cyclic Point-Mass over Phase-Locked Loop

Table C-3.

233

234

Monte Carlo Difference Between Cyclic PointMass and Ideal Linear

234

Table C-4.

Timing Estimates

238

Table C-5.

n/m Constant

239

Table C-6.

n Constant

240

Table C-7.

m Constant

241

xviii

I.

Introduction

intended to serve as a record of specific experiments

This report is

with feasible rea]izations of optimal nonlinear estimators. it

In addition,

is expected that future research along the lines described herein will

continue to result in increased understanding of the behavior of optimal estimates and, possibly, in some guidelines for actual realizations in particular applications. The underlying thread of continuity connecting all segments of this research is Bayes-Law (See Chapter II),

the general solution to the

discrete-time nonlinear estimation problem.

Bayes-Law is in effect the

discrete "representation theorem" (see Bucy and Joseph [7]).

Some of the

earliest attempts to realize Bayes Law on the digital computer involved orthogonal series representations of the conditional densities employing Gram-Chalier series (23], however,

or Edgeworth expansion (22].

An early problem,

concerned the tendency for truncated expansions to become

predominantly negative resulting un unavoidable numerical instabilities. In 1969 Bucy [4] proposed a pcint-mass approximation to density functions which involved a selection of important points on a "floating grid" and the centering of mass on the selected points.

The point-mass

approximation was of course always positive, easy to implement, numerically stable.

The computational burden was unfortunately pro-

hibitive for high dimensional systems,

however, and many short-cuts and

simplificadions were -lade by Bucy and Senn. the computations tractabie. by Bucy,

Geesey,

and

[10],

[20] in order tc make

A ;assive receiver -,roblex was introduced

nA- Senne [6: to illustrate the concerts of point-rass

approximation and the associated problems.

:MT

2 In an independent effort simultaneous to the above work, Alspach and Sorenson [2],

[21] pioneered an approach based on a nonorthogonal

series of Gaussians densities,

originally chosen to be set down in such

a way so as to minimize an L criterion, but subsequently to be deterp mined via a 3imple approximation, again in order to provide a short-cut to the otherwise prohibitive computations. In 1970 at the Nonlinear Estimation Symposium in San Diego Bucy and Senne [9] and Alspach and Sorenson [2] each described their respective approaches to the Bayes-Law compucations,

thereby proviing the impetus

for a sizable new interest in the computations associated with nonlinear est:imation. During the last two years, a multitude of topics related to BayesLaw computation have emerged.

Edison Tse [24] has noted a link between

the previous two methods involving a Fourier transform translation theorem due to Wiener [26].

Julian Center [11] has observed the relation-

ship between generalized least-squares projection and seties expansions of density functions.

Hecht [13] [14] has taken a much closer look at

orthogonal polynomials - notably Gauss-Hermite expansions. and Miller [8],

Bucy,

Merritt,

[18] have discussed hybrid solutions to reduce the serial

computational burdens of Bayes Law by substituting the natural of hybrid computers.

parallelisms

Another promising approach to the approximation

problem involving generalized splines has been studied by deFigueiredo and Jan [12],

[15],

while Weinert and Kailath [25] have been relating

splines to least-squares approximation, projection.

Thus the subjects of

numerical methods and optimal nonlinear estimation are now firmly entrenched.

Meanwhile, still another practical application of Bayesian estimation has recently been studied by Mallinckrodt, Bucy, and Cheng [17], by Hecht

[14], by Bucy [5], and by Senne [19), and is

reviewed in the current report.

The new application involves demodulation

of phase-modulated signals observed in additive white noise.

Since the

nominal engineering solution to such problems involves th. well-known and reliable phase-locked loop, it appears that the demodulation problem will continue to provide an important comparison between moment series approximations and numerical density approximations. Moment series approximations are, of course, the most commonly encountered nonlinear estimates in engineering practice today.

The

simplest moment approximation has been referred to by the names "extended" or "linearized" Kalman-Bucy filter [7].

The appeal of such methods is

highly warranted in many problems, since the non'inearities are not severe (i.e. they don't have arbitrarily large derivatives), frequently the assumption of Gaussian noises is cdequate.

and

Whenever

either or both of the "well-behaved" assumptions It false, however, considerable controversy has resulted. order moment expansions,

Some have advocated higher-

[3], others have proposed 3daptive noise

tracking techniques [16], or finite-memory filters j16], but generally nobody seems to ask the most fundamental question:

What characteristics

of the filtering problem have led to the demise of tbe Aimple firstorder method?

Or, equivalently, how would an optimal estimate behave

in such a situation?

The answers to these and othe. iuz:.cions are

directly addressed in the present report.

4 The existing organization of the report was necessitated due to time constraints, and although many aifferent topics are addressed, there is occasionally some duplication.

The attempt was to assemble

a chronolog of the more significant resulrs of the authors during the past three years into one source,

thereby providing a focal point

for subsequent research in the field.

We apologize beforehand for any

unavoidable difficulties for the reader caused by the presentalion of the material.

The global organization of the chapters is as follows:

Chapter II contains a simplified summary of a derivation of the principal Bayes-Law formulas usAd throughout the report. tation is

The presen-

taken principally from the dissertation of Hecht [14].

Chapter III provides a background for the various proposed numerical representations of the conditional densities.

Covered in greatest detail

are the orthogonal series, exemplified by Gauss-Hermite and Fourier, and the point-mass representatitn of Bucy and Senne [10].

Discussed

in lesser detail are the nonorttnogonal series (such as Gaussian sums) and generalized spline functions. In Chapter IV the important topic of Monte Carlo simulation is treated in considerable depth. perimental confidence is

In particular,

the subject of ex-

treated in great detail and an example of the

analysis is given involving the testing of the gaussian random number generator (which is realizable on any binary computer) to determine its statistical properties.

Thus the reader -. s left with a complete

understanding of the experimental methods used for this report. Chapter V describes another important side-light of the current investigpcion - computer architecture.

The concepts of parallelism

and asynchronous computation are introduced and the Bayes estimation

5 problem is interpreted ii light of parallel digital architecture. At first the "ideal" machine is postulated for computation of Bayes law. Then, as allowance is made for technical feasibility and current computer architecture, obseivations are made concerning efficient use of such structures as array processors (like Illiac IV), machines (like CDC Star),

pipe-line

look-ahead machines (like CDC 6600 or i.;O),

associative processors (like Goodyear's), and multi-processorn the Burroughs D-Machines).

Finally, some consideration is

(like

given to

the currently available hybrid computer systems - examining their intrinsic parallelism. Chapters VI and VII provide the details concerning the two examples studied in this work.

The first example deals with a receive-only

tracking system or a passive tracking receiver, which attempts to locate a target on the basis of bearing information imbedded in additive noise.

The problem description is ver- similar to the Airborne Warning

and Command System (AWACS), for the Air Force.

which is currently under contract development

The other example deals with phase demodulation,

whereby a phase-modulated signei is observed in additive noise and it

is desired to retrieve thb original nessage process - at least

modulo-2ir. Chapter VIII contains a brief conclusion concerning the Bayes estimation research and indicates some paths for future developments. The appendices provide documentation on some of the computer programs used and some unpublished technical papers relevant to the current research.

6 References

1 1] D. L. Alspach, "A Bayesian Approximation Technique fnr Estimation and Control of Time Discrete Stochastic Systems," University of California, San Diego, 1970.

1 2]

Dissertation, Dh.L

D. L. Alspach and H. W. Sorenson, "Approximation of bDenIsty Functions. by a Sum of Gaussians for Nonlinear Bayesian .'Simation," Proc. Symp. on Nonlinear Estimation Theory and Its Appli-a'Ions., San Diego, Eept. 1970, 19-31.

3]

R. W. Bass, V. D. Norum, and L. Schwartz, "Optimal Multichanne.. Non-Linear Filtering," J. Math. Anal. Appl. 16 (1966), 152-164.

4]

R. S. Bucy, "Bayes Theorem and Digital Realizations for NonLinear Filters," J. Astro. Sci. 17 (1969), 80-94.

51

R. S. Bucy, "Realization of Non-Linear Filters," Proc. Second Symp. on Nonlinear Estimation Thpory and Its Applications, San Diego, Sept. 1971, 51-58.

6]

R. S. Bucy, R. A. Geesey, and K. D. Senne, "Passive Receiver Design via Nonlinear Fi-tering Theory," Proc. Third Hawaii International Conf. on System Sciences, Vol i, 1970, 477-480.

[ 7]

R. S. Bucy and P. D. Joseph, Filtering for Stochastic Processes with Applications to Guidance, Wiley Interscience, New York, 1968.

[ 8]

R. S. Bucy, M. J. Merritt, and D. S. Miller, "Hybrid Computer Synthesis of Optimal Discrete Nonlinear Filters," Proc. Second Symp. on Nonlinear Estimation Theory and Its Applications, "San Diego, Sept. 1971, 59-87.

.

9]

R. S. Bucy and K. D. Senne, "Realization of Optimum DiscreteTime Nonlinear Estimators," Proc. Symp. on Nonlinear Estimation Theory and Its Applications, San Liiego, Sept. 1970, 6-17.

[10)

R. S. Bucy and K. D. Senne, "Digital Synthesis of Nonlinear Filters, "Automatica 7 (1971), 287-298.

[Il]

J. L. Center, "Practical Nonlinear Filtering of Discrete Observatio.is by Generalized Least Sqiiares Approximation of the Conditional probability Distribution, "Proc. Second Symp. on Nonlinear Estimation Theory and Its Applications, San Diego, Sept. 1971, 88-99.

[12]

R. J. P. deFigueireio and Y. G. Jan, "Spline Filters, "Proc. Second Symp. on Norlinear Estimation Theory and Its Applications, San Diego, Sept. 197j, 127-138.

7 References (Cont) [13]

C. Hecht, "Digital Realization of Non-linear Filters, " Prc. Second Symp. on Nonlinear Estimation Theory and Its Applicadions, San Diego, Sept. 1971, 152-158.

[14]

C. Hecht, "Synthesis and Realization of Nonlinear Filters," Ph.D. Dissertation, University of Southern California, 1972.

[15]

Y. G. Jan, Ph.D. Dissertation, Rice University,

[16]

A. H. Jazwinski, Stochastic Processes and Filtering Theory, Academic Press, New York, 1970.

[17]

A. J. Mallinckrodt, R. 3. Bucy, and S. Y. Cheng, "Final Project Report for a Design Study for an Optimal Non-Linear Receiver/ Demodulator,'NASA Contract NAS5-10789, Goddard Space Flight Center, Maryland, 1970.

[18]

D. S. Miller, "Hybrid Synthesis of Optimal Discrete Nonlinear Filters," Ph.D. Dissertation, University of Southern California, 1971.

[19]

K. D. Senne, "Bayes Law Implementation: Phase Estimation,, Proc. SWIEEECO Conf.,

[20]

K. D. Senne and R. S. Bucy, "Digital Realization of Optimal Discrete-Time Nonlinear Estimators", Proc. Fourth Annual Princeton Conf. on System Sciences, Princeton, March 1970, 280-284.

[211

H. W. Sorenson and D. L. Alspach, "Recursive Bayesian Estimation using Gaussian Sums," Automatica, 7 (1971), 465-479.

[22]

H. W. Sorenson and A. R. Stubberud, "Non-Linear Filtering by Approximation of the A Posteriori Density," International J.

*

Control, 8 (1968), [23]

1971.

Optimal Discrete-Time Dallas, April 1972.

33-51.

K. Sri-,;vasan, "State Estimation by Orthogonal Expansion of Probability Distributions," IEEE Trans. Auto. Control, AC-15

(1970), 3-10. L24]

E. Tse,

"Parallel Computation of the Conditional Mean State

Estimate for Nonlinear Systems," Proc. Second Symp. on Nonlinear Estimation Theory and Its Applications,

San Diego,

Sept. 1971, 385-394. [25]

H. L. Weinert and T. Kailath, "Recursive Spline Interpolation and Least-Squares Estibation," submitted to Amer. Math. Soc., 1971.

[26]

N. Wiener, The Fourier Integral and Certain of Its Applications, Cambridge, Cambridge University Press, 1933 (Also: New York,

Dover, 1958).

9

II.

Bayesian Estimation:

The Problem

Although the equations for Bayesian estimation are relatively well known, having been derived for example by Bucy[i], derivatior. is

[2],

a modified

included in this chapcer for the sake of introducing

relevant notation and to make the present exposition as self-contained as possible.

This presentation is

taken from Hecht [3].

The discrete-time process and measurement equations may bh written as -n

n-1

Z

h(x) + v

-n

-

n-l (X-n-

)n-i

(i)

---

n

(2)

-n%

Equation (1)

represents a discrete time signal process with

x

sequence of

d-dimensional random vectors; the subscript

refers to

time.

(xn)

R

d

to

is x

r

independent

a function from

-n-l-n-

matrices.

Rd

to Rd

The process

o(xn)

{u n

a function from is

r-dimensional random vectors with density

ran-om vector

c

and has density

is

d-dimensional,

n

a

independent of the

a set of p

Un

u

(w).

The

process,

p (x).

c

Equation (2) represents the observation process, with z a sequence of s-dimensional random vectors, h(x ) a function from to Rs

and

{v n

a set of independent

-n n=1,... vectors with density pv (0) : independent of

s-dimensional random

c

and the

u

process.

n The filtering problem consists of determining the conditional density, given as

Preceding page blank

Rd

10 Jnit(y)dy

=

P(x•edylj

F

z

(3)

Results are stated in the same notation as Bucy [2).

The following

notation will be used in discussing the derivation of the conditional densities. Underlined lower-case Latin letters denote the name of random variables or random vectors, and related Greek letters represent the dummy argument associated with the density or distribution functions. PH)= probability density function P(.) = probability distribution function (The above functions are referred to briefly as "density" and "distribution" respectively.)

Thus, for example, x•

xn, with

the density function of the random vector

=

n

as dummy argument.

F Px (Jz• n1zn n

=

the conditional density of the random vector given the random vector

z

xn

has taken on the value

n-PX (E

zZn

n

) is a function of



The above is frequently abbreviated to with the argument

C

px (in'zn) n n'-n--n1) xn p aX a1 n•,

implied.

9

of the random vectors

x -n

and

and

ý

.

= Px (4nlFn' n the joint density

x_..

-n-Vl

Using the ;oove notation (3)

is

Jnlt(-)dJ = px (ýnIKt=Lt

.. z =_1o)

It is noted in Bucy's original work that the signal process is process with transition density

a Markov

11

Px ri-F

(lir+PjIýJ,&j

5)

*

The required recursion relations for the conditional density Jn~n

are

given by the following equations.

n Flfdfp P.(n-h)(i

-V(_L

n

(n4l -) d•_l,

(6)

n

with

Pv 0 [_C-.h(o)]p(x) W

=

K(o) = pz0(L_)

(7) (8)

,

K(n) = p z (Q-n 1-7n-l''•

(9)

n

J

n-n+l' =

n)

(10)

r(fdjf-r!'-in-n)Jnjn(-n)d4n

x

n+r n L+r

J

ou

n+l-(n) ]Pv

-

]nI n-ih(Yn

) dL&n (11)

where previously undefined symbols have the following meaning: ff

)LX=fdf.-)dx

and pOUnl(o)

,..dxd

=

d integrations

=

thq density of the d-dimensional random vector

G(x)n-

_1.un-

The derivation o0 (6: threugh (11) follows:

J0 1 0 ()d

by Bayes rule.

0...

0

(12)

-m -lm

-m'-

A~

X.

12

pZ

(

AJ=i) corresponds to the distribution P(4

412io-1)

which corresponds to Pv (4-hýI(j

3

)],

or

pz0(LjXO=')=Pv [:0 -h(i.%)

-

(13)

Substituting (13) into (12) gives

which is (7).

J njnnýýdd-n

Pv [.4-x(.%)Ipx (JO)

1

"Px0

Next, using relations of conditional and joint densities,

n(ýnI.4n"**1.E

=Px

PZ, ...

-Z

5-

z0 (ý'

,

Z-l

K(n) JP

-'4) I.-.

n ) zn x

Now operating on the integrand,

l0 n-l, ...'. 0

1

n-l(ý-n'Lnllzn-l'

.....

'4)d~n-1

13

P•,~

zn_(-1''--)

~

n-i

xi

u Next, by indepen'dence of the -n

and v"-n processes and the Markov

property, nx-n Pz (;-

Pz (; 1% 'x-11-V-•-l'"-vO)

(16)

n

n

which may be manipulated using the corresponding distribution function. .n Xn4 P(zn

ii/

t4

U

19

'Ve

x.x

w 0 oq w

I

I

-

P6

E-4

H

4.)o

r.-II I

0 VI

C:)-

04

c-ap

I

-

20

4$4

44

4

0 0

*

00

0

0

0-I

>1

••1 0•*r-l

o

•o

44

>

0U4 •

0•

Co

4-)

€..4

4,.)

0

H"'

H,-

I

4J

4.)

0 04

14

o



4.

0c 2 00~

4J

0

rU

0

co

o



04J •-

I.)

*,

0

(

o

.l041 04



o4-

-

4J*

I1

I

00 00C

7-4

0

. 0]

C

Co.0

0 .

,,0 0

n

4,

'-ý

21 References [1]

R.S. Bucy, "Bayes Theorem and Digital Realizations for Nonlinear Filters," J. Astro. Sci. 17 (1969), 80-94.

[2]

R.S. Btcy and P.D. Joseph, Filtering for Stochastic Processes with Applications to Guidance, Wiley Interscience, New York, 1968.

[3]

C. Hecht, "Synthesis and Realization of Nonlinear Filters," Ph.D. Dissertation, University of Southern California, 1972.

23 I

Finite Dimensional Approximations A.

Introduction reviewed in detail in the

The formal Bayes-Law calculations, previous chapter,

are functionally simple but their implications to

computation are formidable.

To begin with, the representation of

densities is in general an infinite-dimensional problem.

Although

there are many examples of density families characterized by a finite number of parameters,

the Bayes-law computation rarely reproduces

another member of the same family.

The most well-known exception is

the Gaussian family, which is reproduced under linear transformations, resulting in the widely used Kalman-Bucy filter, which optimally describes the Bayes-law computation in terms of linear differenLial covariance of the Gaussian

(difference) equations fcr the mean and

conditional densities, provided that the underlying physical system is described by linear differential (difference) equations with additive Gaussian inhomogeneities.

If

the physical plant is not

linear or the disturbances are not Gaussian, however, the Bayes-law rarely leads to a reproducing family of densities.

Thus an improvised

or unnatural parameterization of the densities is required in order to an infinite number of parameters is

implement Bayes-jaw and, in general, required for exact representation. then is simply staced:

The numerical approximation problem,

how do we choose an appropriate (finite)

subset of the para.c'ers to represent a given collection of conditional densities?

Since the answer to this fundamental question depends

heavily on the densities in question, it

turns out that little

can

be said about the applicability of a given approximation without dis'ussing specific example problems.

Preceding page blank

Even for a given problem there

24 may be several different appropriate numerical approximation schemes, depending upon the problem parameters. will be discussed in In

Exampl•.s

of these dependences

Chapters VI and VII.

the present paper we will discuss some representatives from

several categories of density approximations as well as their Bayes-law implementations and associated difficulties. discuss orthogonal series,

First, we

covering a candidate with unbounded support

(Gauss-Hermite polynomials - Section B) and also a candidate with compact support (Trigonometric series - Section E). Next, we discuss an approach involving nonorthogonal polynomials which is

intended to provide positive density approximations for any

finite number of terms (Section D).

Thirdly, we discuss an intuitively

simple approach to density approximation involving point masses (Section C),

suitably distributed so that most of the probability is

covered by a small number of discrete points.

adequately

Finally, we describe a

relatively recent additional approach to the problem using numerical spline functions (Section E).

The presentation of this paper is

only to be representative and not exhaustive,

intended

since there are many approaches

to numerical approximation which have yet to be considered. B.

Orthogonal Series

The general theory of orthogonal series may be found in many places in the literature (see,

for example,

[10]).

intRend only to provide two contrasting examples, thr. significance tion at hand.

If

In

this section we

whereby we illustrate

of the type of state-space required for the applicathe state vector can take on all values in

with positive probability,

then orthogonal polynomials must be used

to provide the necessary approximation.

On the other hand,

if

25 the state-space actually is or can be approximated as compact then a periodic function (such as trigonometric) might profitably be employed as a basis for an orthogonal series.

It may happen that a

given problem may be interpreted in either way (see, for example, Chapter VII),

so that more than one orthogonal series may be

appropriate, depending on the performance desired.

1.

Least Square Polynomial Approximation, Scalar Case

The theory given here follows Hildebrand [10], outline giving key results.

and is only an

Detailed proofs may be found in the

reference. We wish to approximate a function f(x) with a series of polynomials y(x),

as follows: n

(1)

y(x) =-0 ak~k(X)

£(x)

where 0o(X),...,n (x)are the required polynomial functions.

The

approximation is to be thebest in the sense that f bw(x) (f(x)-y(x)] 2 dx a

bf a

w(x)[f(x)-_

n

•k()]2d 0

K=O

(x W2k

= Minimum

(2)

with w(x) a specified weighting function which is assumed non-negative on the interval (a,b).

26 Equation (2)

imposes the condition on the coefficients ak,

b aar

J'w(x)[f(x)-

=

k=0 ad

(3)

(r2,l,...,n)

0

from which

b

n -0 ak

w(x)or(X)ok(X)dx =Jr

(X)f(x)dx

Wx)

(4)

(r=0,1,...,n)

The coordinate functions are chosen to be orthogonal to each other over the interval (a,b) with respect to the weighting function w(x).

(5)

r#k

0

iaW(X)0r(x)Ok (x)dx =

The "uncoupled" equations reduce to (omitting the argument x)

b

b 2

a

w0o dx

wO fdx

or

(6) b

a

a

=

r

aWOrfdx JWrfd

fbw2rdx

To construct the polynomial functions 0o(x),0j (x),

... ,0 r(X),

it

is required that the polynomial 0r(x) be orthogonal to all polynomials of degree inferior to r,

over the interval (a,b) with respect to

the weighting function w(x).

27

(x) qr-1l(x)dx

a

where q r

1 is

notation is

=0

an arbitrary polynomial of degree r-I or lese..

(7)

The

introduced

i

dr U (x)

r •

~dxr

r (r w

r

so that (7) becomes •

b0

_•U

a r

(r)q

xqr-1

(x)dx

=0

After r integrations by parts

[U (r-l) q -(r-2) r rl-

(,r-l , q(r-l)]b r-l) Ur_ a

The requirement that 0 (x) be a polynomial of degree r implies

r

that U (x),

r

I L2

from (8),

satisfy the differential equation

dr+l

1

drU (x)1

--

dxr1w(x)

J -

dx r

0(10)

(9)

S

28 in

(a,b) whereas the requirement (9) qr- (b),

qr-1(a),

qr_l(a),

q'r-1 )(b),

be satisfied for any values of etc. leads to the boundary

conditions r

UrI (a)

U(a)

Ur (b)

=

U (b)

For each integer r, ary conditions (11), is tions,

0

r (x),

Uri (a.

=

..

..

Ur(r-1) (a)

....

U(r-1)(b)=

r

0

a solution of (10) uhich satisfies the boundthe r thmember of the set of polynomial func-

given by

i drUr (x) r

w(x)

dxr

(12)

The numerator to compute the coefficient ar, given by (6)pis a function of f(x).

The denominator,

designated yr' is independent of

f and need be computed only once.

fr= br W

where Ar is

2(x) = (-l)r!rArbaUr(x)dx

the leading coefficient of 0 r(x).

Or(x) = Arxr + A

+rxr-1 ... + Ao

(13)

29 It is

shown, for use in the integration formulas, that if w(x)

does not change sign in (a,b), the polynomial

0 r(X)

possesses r

distinct real zeros, all of which lie in the interval (a,b). For application to the problem of Chapter VII of this paper we want to approximate n

f(x)

=

~ v(x) E br~r(x) r=O

y(x)

(14)

such that

L0-' 1_ý(X) !W-0f(x)

dx

b0rr(x)

minimum

(15)

which leads to the result

b br

which is

fOrdx Wf

-

(16)

2 equivalent to minimizing the squared error (f-y) with re-

In the application we let

spect to the weighting function w__2. v

22 (17)

w = v Me and the interval (a,b) is

,

which leads to the Hermite formulas.

For the above choice of w(x)

x drUr a 22 dxV

30 where Ur satisfies (equation (10))

dr+l

a•2 x2 drýUr

dx

22 dUr 1= 00

(18)

dxdxr and from the boundary conditions,

(11) requires U and its first r-l

derivatives to tend to zero as x++The function

(19)

x

Ur (x) = Cre

has the property that its rth derivative is the product of itself and Ai

a polynomial of degree r.

It therefore satisfies (18) and the boundary

conditions.

r (e-a x 2 dx

0 W)Creadx

(20)

The Hermite polynomial of degree r is defined by taking

Cr = (_l)r and

2 = 1.

2dr

H (x) Hr

e_2

(-l) rex 2d"(edxr

)

(21)

For C rr r 0r (X) = Hr (ax)

(22)

31 ttne Hermite polynomials are determined from the recurrence formula H Wx) = 1 0

HI(X) = 2x and Hr+1 (x) = 2xHr (x)

-

(23)

2rHr1 (x)

Equation (14) takes the form

y(x)

with a and b

22n -a ' x b H (ax) r=O r r

(24)

(from (13) and (16))given by 2r

SYr

2a

1Frr

(25) br

2 rr,

f f_

f(x)H (ax)dx r

by using (23) for Ar and (19) for Ur, with Cr as given above. 2.

Least Square Polynomial Approximation, Multi-dimensional Case We extend the above theory to multi-dimension functions.

To be

specific, and in accordance with the requirements of this paper, the theory is shown for a function of two variables.

Generalization to

higher dimension is straightforward (although messy, requiring double and triple indices to avoid awkward expressions). The approximation Is

32 ~

f(x 1 x 2 )

ml•=

mkm 22=ak•lm~

2

Y(xlx2 ) = P)

)

1

2

(26)

=OLl

with the requirement a2Wl(Xl)W2 ( x2 )[f(xlx )-Y(Xlx 2 2 )]2dx 1 dx 2 = minimum

(27)

Differentiating with respect to akik2 and setting the result to zero leads to in

m

1222

2

b1 b 2 =f Ia w (X1 )W2(X2) r (x1 ) 0 (X2 )f(xMx)dxldx2 2 1 r2 12 122

1

2

(28)

Using orthogonality properties identical to thcte for the scalar case

J a1 Wl0k 0r dx 1 1kr W2 A a2

gives for the coefficients

r dx2 2

2

0

=

k10 r1

0

k2Or

0

2

33

a

af2wlw 20rIr 2fdxldx2

r1r2lwl•2

dx a1

I1

2w2

(29)

" a2

2 2

(r 1 r 2 =O0,01,10,11,02,... , mm ) 2

The polynomial functions are the same ones used for the scalar case. The denominator of (29) is given by yr Yr2 where y my r as given by (13). The two-dimensional approximation that is needed is

f(

1xl 2)=

Y(XlIX 2)

V1(xI)V 2 (x 2)E =

~

br r

(x1)Or (X2)

r 01i2r

0 rl=U

2 =0

12

2

where, analogous to the scalar case,

b

I 1 2 r1

S~Letting

-aI =1

w2

v2

2 2 e -a2x2

a,

a2

-•

0

=b

b 1

2

bl

b2

wa/ fO

a2

f

0 d

d

(30)

34 and using the results of the scalar case,

22

Y(XX

2)

22_

2-a2 x2Th -a2x e -12 r1 r2 br r Hr (ai e l 2=0 12 1

Hr (a 2 x 2 ) 2

(31)

(-CO

0

in

pp 2.

f

x2kx 2 ke' 12X2e' le 2 Xdxjdx2 exist for all nonnegative integral

values of k.

L

- - --

- - -

48 3.

The integral in Equation (64) exists. =

-1- and a'

For the choice of a

----

2a 2

2a2

with a, and a2

the characteristic values of the covariance matrix, the first two conditions are obviously always true, and if decaying density function,

condition 3 is

of this section we associate f(x function in the form of (51)

1

f(xIx 2 ) is

true.

In

x 2 ) with an approximating density

in

(51)

Since the

is bounded for all values of

the arguments (and from reviewing the form of (62)), condition 3 is

the developments

with Jnjn-1 given by (62).

coefficient function of JnjnI

an exponentially

it is

clear that

true even for the approximating density function for any

value of m. Let f(xIx 2 ) be the density function we wish to approximate, let m=O.

Then

00

i

" f(xlx 2 ) dx 1 dX 2

H

*..2x

ala2 y (xlx2 )

=

-

Let al and a2

2 -A

I

e

e

be the variances of f(x

22

(66)

-

1O2 _x220 2 x2 -a•?X 2X2

~21 a

1%

and

(67)

1 x 2 ),

and let

49 E4uation (67) is

y(xlxx2 )

e

X.i

2 X2

e

(68)

We see that by taking only the zero'th term of the expansion we can get the best gaussian fit to the true density function, and also a verificatiun on the form of

the coefficient equation (64).

In addition to the above assumptions (except let m>O), let nj and the means of f(xIx 2 ), and let

be the i-jth central moment.

j

Then 4,j

(X1-nd)i (X2-n2)j f(x

=ff

x )

dx dx

(69)

The product of the Hermite Polynomials when expanding about the expected value can be written

HI ra 1 (4 1 -n\ HjaIQ(X2,-nl~ 2 ) = -r,'y•rz 1ai11a(X l-nl

j(xflnj

(70)

with rkaij a function of r, £, i and J.

Equation (64) is then rewritten as

br-

N.

2r r'21a2

-jj -C f fx2

a1a2

t

2rr!2 L£1w

i=O j=O

r

2 =O0 re a- ai J•(xl-ni)

a ij rk i ajal2Ii j

(x 2 -n 2 ) dx 1 dx 2

(71)

50 Comparing (63) with (60), we see the approximating polynomial expansion was accomplished by letting

2

2

'

2

1 A22

0

,

and using (71)

to compute the coefficients,

r~aii•

and (71)

in (70)

Hermite Polynomials,

V1

n1

brR.

,

12

v2

The cuefficients,

are functions of the coefficients of the

and can be computed one time in advance and stored.

The formula for generating the Hermite coefficients is given in the previous section. raj

Designating cmn as the coefficient of x

in Equation (70)

m

is determined for each r and for each k from rraiij

5.

in H (x),

=

ri Cj

Applying the Hermite Expansion

The general formulas of the previous section are specialized here for the case when the system equations are those of the phase demodulacor (see Chapter VII).

This section describes the techniques that were used

to mechanize the given equations.

In particular,

vi, the characteristic values A., and vectors,

t

the conditional means of the covariance

matrix, and the coefficients of the Hermite approximation, br2 are described in detail for this specific example. To perform the integrations to obtain the means and central moments (to compute the quantities in the above paragraph) Equations (51) (62)

are combined into one equation, omitting temporarily the normalizing

constant.

L

and

51

-"

JnIn (Y2

S (yly 2 ) Jnn_ 1-

(73)

,

2

where

S(yJy 2)

and Jnjn-,

expj

=

-

Ijjz1 -COS

Y1 )2+ (z.-sin

yl)2]1

(74)

given by (62), consists of two parts, an exponential

function and a polynomial function of the arguments y1 and y2 . The exponent of the ex[,onential part was a quadratic form; i.e., the form (-a jjx-x (73)

II

). To perform integrations of (73)

of

and Equat! n

times ylyl, the combined exponential function is put into the

following form in order to use the Gauss-Hermite formula:

1 r(Yl-ml)2 exp

[0--

-

2-

2P(y1-md)(y 2 -m2 ) -

with the exponent of S(y 1 y 2 ) of (73)

(y -1 2 )2 +

22F(yy

2)

(75)

(as given by (51)), linearized,

and the linear terms combined with the linear exponential terms of Jnjn(y1 y 2 ). 1

The error between the linear and nonlinear parts of the

exponent were combined with the polynomial part of Jnln-i to form F(yly 2 ).

This technique was suggested in Bucy, Geesey, and Senne [ 5].

52 The terms of (75)

mi

A =

E(yl)

m2

=

E (y 2 )

are

E [(y M1 )2 1 -.

01

or2

=

E [Y2-m2)

P

=

c'rrelation coefficient

E[ (Yl-md)(Y2-m2) ] =

~01

We operate on the exponential part of (74) [(zl- cos Yl)2+ (z2-

-

LL2r

-sin

(zlz2r COS Yj-

+

y,

as follows.

sin yl)2

Ccos- j + Co s y

sin y

sij

=

cos Y

+

Y

siny 1

sin y-

y1 sin y

31 Cos y +y 1 cos y

I

I,

iiEf n

z(

1

sin y l

()2 (76)

with

cos Y

(z

,

z(

)

53

exp

A LYIc 1-O

exp

(z2 -sin yI) 2]

~2+

L

(z


2.

b = 0

The multiplier

and r

99 need only be chosen so that the number of significant bits in is

in+1

approximately

m.

will force

into the next cycle, leaving little or no correlation between

adjacent numbers. any

In this way each application of (A-i)

r2S+l

For this article we shall choose

a = 8z - 1, or

m-bit number with the last three bits equal one.

left with the choice of

m.

The common choice for the word-length of the

word length.

Thus we are

1

is

the machine

n Such a choice automatically makes the generatoL machine

dependent, since the modular arithmetic contained in (A-I) will usually result in arithmetic overflows, the result of which is not predictable in general for all machines.

Consequently we will consider segmenting

the numbers into the form (m is evenly dividable by q) m(q-l) 1

= 11

m(q-2) 12 2 q + ...

2q

If we then segment

+ iq 2°

a

(A-2)

n

n

n

n

into \'he analogous pieces a_(q-1) + a q 20(A-3)

+ ...

a = al 2q

we may perform the operation (A-1) as follows:

First we compute

2in(q-i) a 1

= al.1 n

1

2

q

n

2in(q-2)

"+ (a

1

12 + a n

2

1 1) 2

q

+

n

2m

"+ (aq-1 1q n

+ aq

1q-1) 2 q + aq 1q 20 n

* See [1] for an equivalent discussion for decimal proposed techn'que applies only to binary machines.

(A-4)

n

machines.

The

%twi Z1

S

'74

!-1!b4

!100 Next we observe that a.l but the last half of the center term and the rest of the lower order terms are eliminated when we take the modulus leaving

2 m,

relative to

i (al lq + a2

1 q1

+ ...

+ aq 11) mod 2q

2

q2

n

n

n

n+l

Lm( f1--2) 2

+ aq 12) 2 q n

3 + (a5 iq + a iq-l + n n

(A-5)

+ .-. + aq Iq 20

the usual bracket (integer part) function.

where [.] is

see that the last half of the last term in (A-5) is

Finally, we

V iq 1

.he

remainder of the last term plus the last half of the next to last term is

iq-l

etc.

pieces of

i+l

We summarize the algorithm involved to identify the is

in Fig. A-I, where it

assumed that

ak

1in

and

are defined froin the previous operation or initially by a suitable seed. We see from studying (A-5) accomplish the update is only q

bits to add up

that the maximum precision required to plus the necessary carry-over

2q-bits

nuimbers of length

2m q

and q m of length M m one number

p = (log 2{q( 2 q - 1)2+ 2

This exact number of bits is

- _ 1}]

+

1,

where the bracket function has again been used again to denote the largest integer.

B.

An Example

In this section we introduce a specific example of the generator for

m = 36,

which is chosen since it

evenly dividable by

q = 2,

3, 4, 6,

is sufficiently long, and yet 9, etc.

Thereby providing

101

+ E_ 0-c:

0

c\J4

0

CS-

+)

z

~

4-4

X

Ai

14 00

C~J

ar+ 'Aj

102 The choice of

significant flexibility.

q*

is

determined primarily so

that the operation (A-5) does not lead to integer overflows on the machine concerned.

Table A-l summarized the hardware requirements

for several combinations.

Table A-i. for m

q

=

No.

Partition Requirements 36 bit random numbers

m= Bits per Piece q

Bits/(positive) integer

1

36

72

2

18

37

3

12

26

4

9

20

6

6

15

9

4

11

12

3

10

18

2

8

36

1

6

of Pieces

The number of carry bits increases at such a rate that it is not very efficient to divide the generator into more than about six or nine parts, but the principle remains the same regardless. In order to initialize the generator it

is nece:sary to provide

q

pieces of a suitable seed (i.e., one that leads to a rel.iable sequence). Table A-2 lists the multiplier

a = 7357764655278

a - vhe initial

* Of course the individual pieces need not necessarily be the same

length but this choice makes the algorithm simplest to co,;i.

103 numbers



a

(Sequence One and

10

3110375524218

ZSequence Two)

for several dJfferent dichotomizations,

Table A-2. No. of Pieces 2

3

4

Partition Examples

a(l0 for Sequence One) 7357768

24473410

10 for Sequence Two 3110378 = 102943 8 10 5524218 = 18561710

4655278

=

15855110

73578

=

382310

31108 = 160810

76468 = 400610

37558 = 202910

55278 = 290310

24218 = 129710

7358 = 47710

3118 = 20110

7768 = 51010

0378 = 03110

4658 = 30910

5528 = 36210

5278 = 34310

4218 = 27310

738 = 5910

318 = 2510

578 W 4710

108 = 08]0

768 = 6210

378 = 3110

468 = 3810

558 = 4510

55q = 45

24

9 27

10

2310

= 20 8

10

21= 1710

Table A-3 contains the first 250 octal numbers resulting from Sequence One (equivalent to the seed 1o = 1 vith one number discarded)

104 and Table A-4 contains the initial 250 octal numbers from Sequence Two. The statistics for these two sequences is given in the main part of this article.

The cycle length of the generator may be calculated theoreti-

cally as follows:

The next to last octal digit repeats every 8 steps,

the third from last every 64 steps, etc., so that the first digit repeats every 811= 2

8.59x10 9 .

In practice, however,

only be obtained for certain seeds,

thus a check must be made rt

guarantee that the cycle is sufficiently long. run on the two given starting numbers, greater than 107 for both.

the maximum cycle can

Such a test 'as been

confirming a cycle length of

The repeat characteristics for the initial

segment Qf each sequence is given in Table A-5.

105 Table A-3.

0/ 735776465527 546011324661 10/ 6050071C5407 124027314201 20/ 176553134667 767051140121 30/ 003422023547 717370070441 40/ 004604602027 051575575361 50/ 715232077707 120347726701 60/ 372342545167 337314554621 70/ 122324612047 001636347141 80/ 336670306327 605537556061 90/ 221123262207 316227251401 100/ 226525145467 317604301321 110/ 142603170347 135622135641 120/ 524644402627 467414246561 130/ 700055634507 132422504101 140/ 373255535767 106715136021 150/ 376370336647 776340234341 160/ 364024067127 651040447261 170/ 404344377007 750206646601 180/ 201227116267 403544102521 190/ 166317275147 470725643041 200/ 60662214342? 630171357761 210/ 333362331307 370540721301 220/ 312174466567 255106157221 230/ 272143023T!7 304377561541 240/ 254151607727 677064000461


310473353621 742044723047 133551226141 037341537327 434502115061 101236233207 655044670401 646162236467 613250400321 671013001347 455020314641 152627333627 227000105561 116564305507 374157423101 775630326767 015736535021 510277647647 706711713341 633530520127 633535606261 741056550007 032153065601 375327407267 0422:3001521 365736306147 401342622041 641260274427 333470016761 252510202307 353204440301 613632457567 50fL332356221 406:301534447 726750040541 177751440727 303653737461 210374024607 711550723001 260414520067 470332042721 665204552747 213646567241 750417175227 445426370161 23P505037107 C5uC05115501 014030E50367 3(,6426637421 515002361247


167041756507 640355342101 350232117767 4743411340'ýl 430116160647 560164372341 716424151127 445053745261 515637721007 575260304601 045756700267 431002700521 355564317147 775360601041 707405425427 612307455761 570205053307 005511157301 266117450567 550337555221 465147245447 467421317541 327540271727 520464676461 110414375607 666544742001 445347211067 607664541721 277101763747 545643346241 525057526227 673720627161 223461110107 553760434501 371240741367 614376636421 252737272247 401663704741 492476352527 334470267661 711256012407 207031037201 510047461667 353573043121 176732370547 737217553441 440027567027 467110440361 414376104707 427513351701

473777560041 062221556427 743450114761 3*, 0724307 2•6C67630• 04Z2133441567 2761e5754221 15753I56447 265113'76541 45431612ý2727 1443166354E1 6663037466%'k 27110176100. 720430702067 307500240721 367006174747 777441125241 151007057227 775734066161 3306u4161107 205774753501 351100132367 113327635421 721203203247 527374763741 272007403527 763355026661 631244563407 442014656201 207274352667 721731342121 631546001547 023534132441 052132320027 674336477361 517640355707 777736470701 057370563167 653502156621 652411370047 714713611141 721112624327 254315660061 136662340207 450437213401 146437763467 231117103321 064610546347 606030577641 515343520527

712641373067 526574737721 305721405747 350037704241 757225410A27 652470325161 563076232107 353052272501 124366323367 042241634421 150756114247 046407042741 550107434527 241462565661 434708334407 326541475201 146450243667 34055064A121 17017041•5$A7 264051511441 543324051027 115505536361 015051626707 230422607701 ?12242154167 265316755621 642713501047 554424470141 627606055327 265216217061 463257311207 53435263240! 611617054467 364021202321 657002357347 125624756641 610550451627 443151407561 702116163507 314326565101 207731744767 710254537021 62!370025647 707667555341 336506436127 716164310261 156701226007 470403427601 030455625267 336136203521

Table A-4. Initial Sample Path - Sequence Two

0/ 311037552421 473364616247 10/ 542451520741 547717776527 20/ 675431003661 053565536407 30/ 502132453201 106377305667 40/ 473763357121 0236743111547 50/ 663364767441 416435613027 60/ 365060554361 505142230707 70/ 517054365701 547002416167 80/ 260526273621 356334603047 90/ 764170546141 317441017327 100/ 676556035061 760375113207 110/ 376025210401 621410516467 120/ 460605320321 253160661347 130/ 300601634641 050344613627 140/ 733056025561 302101165507 150/ 117001743101 042774606767 160/ 43577545F'21 731123527647 170/ 207035233341 573464000:27 180/ 567015526261 111351430007 190/ 550037405601 551211667267 200/ 732171721521 316040166147 210/ 257230142041 650231554427 220/ 323351736761 050561062307 230/ 275332760301 473032737567 240/ 15737b276221 676461414/147



222760501707 332563504701 731701007167 273546072621 234703714047 770664425141 610621250327 335621374061 632477064207 676663627401 333714607467 514612417321 124517472347 273261013641 431336544627 045242664561 473076636507 667257662101 07651L377767 114060054021 642462040647 103367712341 330477431127 577013665261 370652601007 615224624601 5r4761160267 45C423620521 361406177147 174326121041 712476705427 436651375761 652775733307 027717477301 472437730567 32106-475221 3130471254117 463330637541 510447551727 264630616461 223563255607 144415262001 554005471067 040711461721 75"657643747 47' 71466621! 2)405006227 4(`-C6654716: 1L,'(5770107 736L7!754501

553141172641 566217475627 410550523561 006643307507 341176601101 113267170767 320523453021 512727351647 377623371341 551702062127 154633024261 061222752007 542553043601 466057451267 411734517521 04616j210147 534025100041 341433036427 344472034761 544561604307 731145216301 053673721567 062330674221 223143636447 2201631165.1 633345402727 723142555461 130172626607 565032301001 056207162067 312205160721 747304054747 566572445241 602454337227 577562006161 605651041107 604567273501 567574412367 7115365554P1 007157063247 534070303741 106672663527 275404746661 102267443407 634651176201 0213506632667 511042262121 575777661547 246771452441 365033600027

6115504742267 113326416521 125147221147 135125057041 446056167427 143633473761 336714455307 160233735301 747756712567 062360073221 077747347447 2511163755zli 024232233727 107475514461 103451177607 632010320001 357537653067 222761657721 445737265747 314251224241 603012670227 444776245161 230663112107 505724612501 274202603367 670130554421 700451774247 526162362741 007112714527 402172505661 4b0445214407 253456015201 536002523667 315341561121 660142272547 660367031441 722345331027 731617456361 557372506707 460601127701 616112434167 2062116756e1 623743361047 506704010141 113445335327 502132137061 065156171207 221173152401 b0660533'.467 706216122321

421051453461 436576550607 002327337001 311217344067 732017356721 024401476747 717531003241 522640221227 703733504161 672044163107 421123131501 205237774367 655503553421 104253705247 1505554417111 677121715527 4552012zj4661 263271765407 123023634201 621225414667 051322060121 077113703547 004765410441 o0774606RO27 07236751536! 674272757707 220606246701 353333025167 323667474621 '7710741172047 523175667141 4706475663?7 465133476061 021542142207 251127571401 422633425467 443461221321 275231050347 754323455641 352241662627 553012166561 622652514507 671165024101 237302015767 44,727405602). 505474216647 5734035543411 131637347127 325640367261 732117257007

Table A-5. Repeat Characteristics of the Generator for Each Sequence
(all entries octal)

Cycle    Sequence One      Sequence Two
8^0      735776465527      311037552421
8^1      062221556427      113326416521
8^2      650107434527      655503553421
8^3      617512155527      514633562421
8^4          ...65527          ...52421
8^5         ...465527         ...552421
8^6        ...6465527        ...7552421

The entry at step 8^k agrees with the seed in its last k+1 octal digits; for 8^4 through 8^6 only the repeating trailing digits are shown.

The final concern in describing the generator is to give examples of coding the generator. These examples, written in FORTRAN II-compatible code, assume no special hardware characteristics except that only the applicable number of pieces has been selected. It is further assumed that the subroutine remains core-resident so that all locally defined variables remain unchanged between calls. If the subroutine must be dynamically reloaded on call (either because of load-on-call restrictions or virtual core) then the locally defined variables must be stored globally in COMMON. We leave this modification to the user. Table A-6 contains coding examples for two-, three-, four-, and six-piece generators. No optimization has been done.

Table A-6. FORTRAN-II Coding Examples

Two-Piece Generator

      FUNCTION XRAND(NS)
C     TWO-PIECE GENERATOR, M = 36, Q = 2 (SEQUENCE ONE PIECES, TABLE A-2)
      DIMENSION NS(2)
   10 M1=244734
      M2=158551
      N1=NS(1)
      N2=NS(2)
      T2=2.**(-36)
  100 DO 200 I=1,2
      GO TO (110,120),I
  110 K=M2*N2
      GO TO 190
  120 K=M1*N2+M2*N1+ID
  190 ID=K/262144
      GO TO (191,192),I
  191 L2=K-ID*262144
      GO TO 200
  192 N1=K-ID*262144
      N2=L2
  200 CONTINUE
      NS(1)=N1
      NS(2)=N2
      XRAND=(FLOAT(N1)*262144.+FLOAT(N2))*T2
      RETURN
      END

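The two-piece arithmetic is compact enough to restate as a sketch in modern Python (for illustration only; Python's unbounded integers allow the piecewise update to be checked against the direct 36-bit product):

M1, M2 = 0o735776, 0o465527    # multiplier pieces (Sequence One, Table A-2)
BITS = 18
MASK = (1 << BITS) - 1         # low 18 bits

def step(n1, n2):
    # One update x -> a*x mod 2**36 on the two 18-bit pieces of x.
    lo = M2 * n2                       # low partial product
    carry = lo >> BITS                 # carry into the high piece
    hi = M1 * n2 + M2 * n1 + carry     # fits in 37 bits, as in Table A-1
    return hi & MASK, lo & MASK

a = (M1 << BITS) | M2
x = a                                  # Sequence One starts at a itself
n1, n2 = x >> BITS, x & MASK
for _ in range(1000):
    x = (a * x) % (1 << 36)
    n1, n2 = step(n1, n2)
    assert (n1 << BITS) | n2 == x      # piecewise result matches

The carry term is also why Table A-1 charges 37 bits of positive-integer arithmetic to the q = 2 case: the high-piece accumulation M1*N2 + M2*N1 + carry can exceed 36 bits, which is precisely the overflow consideration governing the choice of q.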

... expressions is given by Baer and Bovet [2]. For a definition of maximal parallelism and a development of the associated theory, see Keller [9]. Now that a note of caution has been inserted we proceed to a discussion of highly structured parallel computations, where we use Bayes Law as an example. Bayes Law may be expressed operationally in the form

J_{n+1}(y) = (1/C_{n+1}) ∫ T_n(y,x) J_n(x) dx                              (1)

where T_n(y,x) is a spatially varying kernel which is a function of the product of probability densities of the noises and the new measurement z_n, and C_{n+1} is the normalizing constant (i.e., the integral of the right-hand side of (1) over all y).

It should be clear from inspecting (1) that there is implicitly a form of parallelism called for, in that for every y in J_{n+1} a d-dimensional convolution integral must be computed. The explicit form of parallelism will be a function of the nature of the (finite-dimensional) algorithm chosen to implement (1). The point-mass approach of Bucy and Senne [5] maps (1) into an equivalent matrix multiplication. The least-squares series expansions of Hecht [8], Alspach [1], and Center [6], for example, replace (1) by an equivalent update for expansion coefficients. The exact implications on the structure of the computer will of course be dependent upon the form of the algorithm. To be specific, then, since many of the associated problems are similar, we will discuss the matrix multiply analogy to (1), where symbolically we write

J'_{n+1}(y_i) = Σ_{j=1}^{M} T_n(y_i, x_j) J_n(x_j),    i = 1,...,M         (2)

C_{n+1} = Σ_{i=1}^{M} J'_{n+1}(y_i)                                        (3)

J_{n+1}(y_i) = J'_{n+1}(y_i) / C_{n+1},    i = 1,...,M                     (4)

x̂(n+1|n) = Σ_{i=1}^{M} y_i J_{n+1}(y_i)                                    (5)

The overall structure of (2) - (5) (where we choose to discuss the conditional mean estimate as a specific example) illustrates both purely parallel computations, (2) and (4), and essentially serial ones, (3) and (5). By analogy with the example in Figs. 1-2 it is obvious that some parallelism could be built into (3) and (5), while it is also clear that a processor which is optimized to perform the totally parallel operations (2) and (4) would be mostly wasted on the scaling and estimation integrals of (3) and (5). Thus it is clear that even in this highly structured problem a combination of computer architectures would be necessary to optimize computational speed with minimum overhead (i.e., idle functional units or processors).
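As a concrete illustration of where the parallelism lies, the finite-dimensional update (2) - (5) can be sketched in modern Python with NumPy (the Gaussian kernel below is merely a placeholder for the problem-dependent T_n):

import numpy as np

def bayes_step(T, J):
    # T: M x M kernel on the grid; J: current point masses.
    Jp = T @ J            # (2): fully parallel over the grid points
    C = Jp.sum()          # (3): a serial (reduction) operation
    return Jp / C, C      # (4): fully parallel scaling

def cond_mean(y, J):
    return y @ J          # (5): another serial reduction

y = np.linspace(-3.0, 3.0, 101)                   # toy scalar grid
J = np.full(101, 1.0 / 101)                       # flat prior masses
T = np.exp(-0.5 * (y[:, None] - y[None, :])**2)   # placeholder kernel
J, C = bayes_step(T, J)
xhat = cond_mean(y, J)

The matrix product in (2) and the scaling in (4) parallelize across grid points, while (3) and (5) are reductions; this is exactly the split between parallel and serial work discussed above.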

Ideally, then, we might envision a parallel computer simultaneously evaluating T_n(y_i, x_j) J_n(x_j) and accumulating the sum (2) in J'_{n+1}(y_i) for i = 1,...,M, successively for j = 1,...,M. (If we had enough processors and temporary storage we could even imagine tree-structuring the latter computation in the form of Fig. 2.) Imagine, further, that we only had J'_n to use in (2) instead of J_n, which is off by a factor of C_n, so that simultaneous to the evaluation of (2) an auxiliary serial machine evaluates x̂(n|n-1), and a second serial machine evaluates C_n (modulo the use of J'_n instead of J_n). The resulting computation timing could take the form in Fig. 3.
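The arrangement works because normalization can be deferred: by (4), J_{n+1} = J'_{n+1}/C_{n+1}, so the estimate satisfies

x̂(n+1|n) = Σ_{i=1}^{M} y_i J_{n+1}(y_i) = (1/C_{n+1}) Σ_{i=1}^{M} y_i J'_{n+1}(y_i),

and both C_{n+1} and the estimation sum can be accumulated from the unnormalized masses, with a single scalar division at the end.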


... since the dynamics are linear. The results are shown in Table A-2, also taken from Bucy and Senne [4].

Table A-1. Monte Carlo Performance of the Optimal and Linearized Predictors
(average error vectors and average covariance matrices; the zero-state column is the theoretical covariance)

n = 1:  optimal error (-0.094, 0.078), covariance [0.359 0.230; 0.230 0.857];
        linearized error (-0.371, 0.477), covariance [0.645 -0.765; -0.765 2.728];
        zero-state covariance [0.350 0.050; 0.050 1.100]
n = 2:  optimal error (-0.008, 0.042), covariance [0.146 0.068; 0.068 0.599];
        linearized error (-0.277, 0.081), covariance [0.401 -0.183; -0.183 1.255];
        zero-state covariance [0.188 0.075; 0.075 1.200]
n = 3:  optimal error (-0.004, 0.026), covariance [0.131 0.051; 0.051 0.354];
        linearized error (-0.158, -0.041), covariance [0.222 0.012; 0.012 0.623];
        zero-state covariance [0.147 0.088; 0.088 1.300]
n = 4:  optimal error (0.007, 0.003), covariance [0.127 0.088; 0.088 0.351];
        linearized error (-0.081, -0.084), covariance [0.158 0.074; 0.074 0.449];
        zero-state covariance [0.137 0.094; 0.094 1.400]
n = 5:  optimal error (0.015, -0.022), covariance [0.129 0.069; 0.069 0.342];
        linearized error (-0.069, -0.106), covariance [0.155 0.113; 0.113 0.723];
        zero-state covariance [0.134 0.097; 0.097 1.500]
n = 6:  optimal error (0.038, 0.025), covariance [0.177 0.099; 0.099 0.369];
        linearized error (-0.128, -0.315), covariance [0.255 0.589; 0.589 3.813];
        zero-state covariance [0.134 0.098; 0.098 1.600]
n = 7:  optimal error (0.075, 0.065), covariance [0.217 0.140; 0.140 0.416];
        linearized error (-0.188, -0.362), covariance [0.291 0.824; 0.824 6.439];
        zero-state covariance [0.133 0.099; 0.099 1.700]
n = 8:  optimal error (0.063, 0.093), covariance [0.159 0.112; 0.112 0.455];
        linearized error (-0.186, -0.316), covariance [0.223 0.549; 0.549 5.890];
        zero-state covariance [0.133 0.100; 0.100 1.800]
n = 9:  optimal error (0.007, 0.043), covariance [0.130 0.076; 0.076 0.364];
        linearized error (-0.141, -0.321), covariance [0.164 0.262; 0.262 4.950];
        zero-state covariance [0.133 0.100; 0.100 1.900]
n = 10: optimal error (0.026, 0.036), covariance [0.136 0.081; 0.081 0.326];
        linearized error (-0.146, -0.428), covariance [0.164 0.148; 0.148 3.978];
        zero-state covariance [0.133 0.100; 0.100 2.000]

Table A-2. Monte Carlo Performance of Optimal and Linearized Filters
(average covariance matrices)

n = 1:  optimal [1.038 0.361; 0.361 0.757];  linearized [2.180 -1.630; -1.630 2.628]
n = 2:  optimal [0.184 0.036; 0.036 0.499];  linearized [1.206 -0.465; -0.465 1.155]
n = 3:  optimal [0.123 0.003; 0.003 0.254];  linearized [0.487 -0.077; -0.077 0.523]
n = 4:  optimal [0.108 0.075; 0.075 0.251];  linearized [0.232 0.049; 0.049 0.349]
n = 5:  optimal [0.117 0.039; 0.039 0.242];  linearized [0.221 0.126; 0.126 0.623]
n = 6:  optimal [0.309 0.097; 0.097 0.269];  linearized [0.620 1.078; 1.078 3.712]
n = 7:  optimal [0.469 0.180; 0.180 0.316];  linearized [0.764 1.547; 1.547 6.339]
n = 8:  optimal [0.234 0.124; 0.124 0.355];  linearized [0.491 0.999; 0.999 5.790]
n = 9:  optimal [0.119 0.053; 0.053 0.264];  linearized [0.257 0.425; 0.425 4.850]
n = 10: optimal [0.142 0.063; 0.063 0.226];  linearized [0.256 0.196; 0.196 3.878]

In reviewing these experimental results it must be pointed out that the confidence analysis of Bucy and Senne [4] is in error. The one-sigma gaussian confidence level is (2σ⁴/N)^(1/2), and not (3σ⁴/N)^(1/2), as given in the original paper (see Chapter IV). Using the results of Chapter IV we will now assess the accuracy of the above test results.

First we convert the diagonal terms of the sampled covariances into their three-standard-deviation confidence limits, taken from Chapter IV. The results are given in Table A-3 for the predictor covariances of Table A-1. The nonlinear predictor results were based on N = 500 Monte Carlos and the linearized predictor was run for N = 2000.
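For reference, the corrected level follows from the variance of the sample second moment of zero-mean gaussian samples: for σ̂² = (1/N) Σ x_i²,

Var(σ̂²) = (E[x⁴] - σ⁴)/N = (3σ⁴ - σ⁴)/N = 2σ⁴/N,

so the one-sigma level is (2σ⁴/N)^(1/2); using E[x⁴] = 3σ⁴ directly, without subtracting σ⁴, produces the erroneous (3σ⁴/N)^(1/2).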

Table A-3. Monte Carlo Confidence Intervals for Predictors

          Optimal Nonlinear Predictor        Linearized Predictor
 n        First Coord.    Second Coord.      First Coord.    Second Coord.
 1        0.302 - 0.443   0.720 - 1.058      0.589 - 0.713   2.492 - 3.014
 2        0.123 - 0.180   0.503 - 0.739      0.366 - 0.443   1.146 - 1.387
 3        0.110 - 0.162   0.298 - 0.437      0.203 - 0.245   0.569 - 0.688
 4        0.107 - 0.157   0.295 - 0.433      0.144 - 0.175   0.410 - 0.496
 5        0.108 - 0.159   0.287 - 0.422      0.142 - 0.171   0.660 - 0.799
 6        0.149 - 0.218   0.310 - 0.455      0.233 - 0.282   3.483 - 4.213
 7        0.182 - 0.268   0.350 - 0.513      0.266 - 0.322   5.881 - 7.114
 8        0.137 - 0.196   0.382 - 0.562      0.204 - 0.246   5.380 - 6.507
 9        0.109 - 0.160   0.306 - 0.449      0.150 - 0.181   4.521 - 5.469
10        0.114 - 0.168   0.274 - 0.402      0.150 - 0.181   3.633 - 4.395
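The tabulated limits are consistent with bounds of the form σ̂²/(1 ± 3(2/N)^(1/2)). As a check, the following sketch (modern Python; the formula is inferred from the tabulated values rather than quoted from Chapter IV) reproduces the n = 1 row from the sampled covariances of Table A-1:

import math

def limits(s2, N):
    # Three-sigma confidence limits of the form s2 / (1 +/- 3*sqrt(2/N)).
    d = 3.0 * math.sqrt(2.0 / N)
    return s2 / (1.0 + d), s2 / (1.0 - d)

print(limits(0.359, 500))    # optimal, first coordinate:  ~(0.302, 0.443)
print(limits(0.857, 500))    # optimal, second coordinate: ~(0.720, 1.058)
print(limits(0.645, 2000))   # linearized, first:          ~(0.589, 0.713)
print(limits(2.728, 2000))   # linearized, second:         ~(2.492, 3.014)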

Almost immediately we ascertain from comparing Table A-3 with Table A-1 that the "optimal" estimate cannot possibly be optimal for iterations 6, 7, and 8, since the entire 3σ confidence band lies above the zero-state result for those samples. In fact we observe a periodicity in the errors for both estimators with period approximately equal to the rotational period of the sensor (between 6 and 7 samples).

We could also have made a study of the zero-bias reliability of the estimates. But, owing to the dubious value of the data, we choose to proceed on to the more recent experiments.

Appendix B. More Recent Experimental Results: Point Masses versus Gaussian Sums

The early experiments reported in the previous appendix raised a lot of questions which demanded more tests to resolve. What made the errors cyclic modulo 2π? Why was the linearized filter unstable? Alspach and Sorenson [2] revealed another "linearized" filter which did not have instabilities. Thus it was imperative that we closely compare our results with theirs in order to explain this anomalous discrepancy. Using an estimate comparison test of the two linearized filters it was discovered that Alspach and Sorenson had modified the estimate update equation to the form

x̂(n+1|n+1) = x̂(n+1|n) + A(n+1){[z(n+1) - h(x̂(n+1|n)) + π] mod 2π - π}      (B-1)

instead of the original form (7).
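The bracketed operation in (B-1) is ordinary angle wrapping, which maps the raw bearing innovation into [-π, π); a minimal sketch in modern Python, with made-up sample values:

import math

def wrap(residual):
    # Map a bearing residual into [-pi, pi).
    return (residual + math.pi) % (2.0 * math.pi) - math.pi

z, h = -3.1, 3.1            # true and predicted bearings straddle the cut
print(z - h)                # raw innovation: -6.2
print(wrap(z - h))          # wrapped: ~0.083, the small physical difference

Without the wrap, two bearings on opposite sides of the ±π branch cut produce an innovation of nearly 2π even though the physical difference is small.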

We may intuitively rationalize the success of the modified measurement scheme by examining Fig. B-1. Suppose the working range of the sensor is [-π, π) with zero referring to the first coordinate axis. The figure shows two typical situations which might arise if the target is allowed to be inside the sensor orbit, as is the likely case if J₀ ~ N(0,I). Case 1 represents the situation for θ(n+1) ≈ 0. In this case h(x₁), the true bearing, is positive, while the bearing to the estimated position x̂₁ is negative. The result is a difference in bearing greater than π. On the other hand, if both the estimate and the target remain inside the sensor orbit, then some nonzero θ(n+1) (such as given in Case 2 of the figure) would result in a bearing and an estimated bearing of the same sign, so that the difference is always less than π. Thus the figure accounts for the modularity of the error performance discovered in the early experiments.

Fig. B-1. Typical Geometry of the "Old Problem" Illustrates Periodicity of Errors

Now the filter update equation (7) contains the term z - h(x̂), which in turn may be expressed as h(x) - h(x̂) + v, so that even if the additive noise v is gaussian and may take on all values, h(x) - h(x̂) has an effective range from -π to π. Thus, if the noise magnitude is usually small compared with the magnitude of h(x) - h(x̂), it would be expected that the modulation of the difference z - h(x̂) as in (B-1) would lead to improved performance. The "fixed" linearized performance is in fact dramatically improved, as illustrated in Table B-1, which shows the Monte Carlo results for the "old problem".

Table B-1. Monte Carlo Averaged Sum Squared Error Performance for Predictors - Old Problem

Sample   Mean-State   Iterated     "Fixed"      Gaussian Sum   Floating Grid
         Predictor    Linearized   Linearized   Predictor      Predictor
  1      1.250        4.334        1.468        0.757          0.866
  2      1.388        1.678        0.974        0.596          0.674
  3      1.447        0.651        0.562        0.457          0.631
  4      1.537        0.535        0.550        0.473          0.408
  5      1.634        0.828        0.617        0.518          0.403
  6      1.734        3.187        0.169        0.531          0.464
  7      1.833        4.927        0.540        0.471          0.674
  8      1.933        4.912        0.618        0.553          0.454
  9      2.033        4.248        0.564        0.578          0.377
 10      2.133        3.763        0.619        0.641          0.443
         Stable       Stable       Stable       Unstable       Unstable

What is shown in the table is the trace of the sampled covariance matrices for the Mean-State (ideal predictor), the Iterated-Linearized, the Fixed-Linearized, the Gaussian Sum, and the Floating Grid Predictors. Table B-1 is taken from Senne [8], where it was printed incorrectly due to an editing error. The fixed-linearized predictor is seen in the table to be stable, and, based on only 100 Monte Carlos, essentially equivalent to both the Gaussian-Sum and Floating-Grid predictors. It turns out, in fact, that the "old problem," as depicted in Fig. B-1, is far easier than originally intended. The sensor is given considerable new information with each sample if it orbits around the target at the high rate of a = 1 radian per sample. Thus it is not surprising to find that it is relatively easy to design an approximate estimator which performs very close to optimum. The straightforward linearized predictor, as it happens, performs miserably, as discussed above. We can conclude, however, that the old problem is not sufficiently difficult to demonstrate the difference in performance between the various estimators.

We return thus to the originally intended geometry, illustrated in Fig. B-2, where the initial density is J₀ ~ N([3], I), R = 0.01, and F = I, so that the target is undergoing pure random walk in both dimensions. The Monte Carlo performance summary for the "new problem" is given in Table B-2, where we may discern that the linearized predictor (in this case the "fix" is unnecessary) converges more slowly to steady state than the optimal estimates, but the accuracy of the Monte Carlo (100 sample paths) still precludes discrimination for steady-state operation. Thus we have come to the conclusion that the passive receiver with gaussian noises is very linearizable in that the conditional densities are very nearly gaussian, at least in range-bearing coordinates.

Fig. B-2. "New Problem" Geometry without Periodic Errors

Table B-2. Monte Carlo Averaged Sum-Squared Error Performance for Predictors - New Problem

Sample   Mean-State   Linearized   Gaussian Sum   Floating Grid
         Predictor    Predictor    Predictor      Predictor
  1      2.200        1.665        11.436*        0.955
  2      2.400        1.928         1.723         1.020
  3      2.600        2.187         1.495         1.006
  4      2.800        2.229         1.321         1.083
  5      3.000        2.190         1.208         1.118
  6      3.200        2.145         1.455         1.359
  7      3.400        1.830         1.625         1.521
  8      3.600        2.105         1.437         1.370
  9      3.800        2.136         1.423         1.132
 10      4.000        2.349         1.286         1.316
 11      4.200        2.370         1.431         1.493
 12      4.400        2.114         1.533         1.515
 13      4.600        1.883         1.475         1.345
 14      4.800        1.752         1.398         1.268
 15      5.000        1.752         1.553         1.441
 16      5.200        1.850         1.617         1.534
 17      5.400        1.627         1.275         1.222
 18      5.600        1.561         1.271         1.238
 19      5.800        1.620         1.630         1.314
 20      6.000        1.688         1.803         1.556

We note in passing that the first sample estimate of the Gaussian sum predictor suffers from a transient at about Monte Carlo number 50, so that the starred number in the table is inaccurate. The filter recovers stability, however, so that this number may be ignored.


Appendix C. A Movie of Conditional Densities

In the previous two appendices the relative success of the linearized-type predictor can be related to the validity of the Gaussian approximation to the conditional density functions. The simplest way to destroy the validity of the Gaussian assumption is to provide a nongaussian initial density function. Consider, for example, the detector geometry illustrated in Fig. C-1. If there were a sequence of known reflecting ionospheric layers above the aircraft observer and we were given an a priori distribution on the transmitter's power, then it is conceivable that we might want to integrate over all elevations to maximize detectability, thus introducing a multimodal range ambiguity as shown in the figure.

Probabilistically, the initial condition would be obtained by taking the product of the multimodal range ambiguity density with the bearing ambiguity density. The result might look as in Fig. C-2, where no particular scale is intended.
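An initial density of this product form is easy to construct on a grid; in the sketch below (modern Python; the mode locations, weights, and widths are invented purely for illustration) a four-mode range ambiguity multiplies a bearing ambiguity density:

import numpy as np

def gaussian(x, mu, sig):
    return np.exp(-0.5 * ((x - mu) / sig) ** 2) / (sig * np.sqrt(2.0 * np.pi))

r = np.linspace(0.0, 10.0, 201)       # range grid
b = np.linspace(-np.pi, np.pi, 181)   # bearing grid
# One range mode per assumed ionospheric layer (illustrative values).
range_amb = sum(gaussian(r, mu, 0.4) for mu in (2.0, 4.0, 6.0, 8.0)) / 4.0
bearing_amb = gaussian(b, 0.0, 0.5)
J0 = np.outer(range_amb, bearing_amb) # product density on the grid
J0 /= J0.sum()                        # normalize to unit mass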

As soon as the aircraft's detector circuits obtain a reliable detection, the aircraft banks left into a circle of unit radius and activates a high-sensitivity receiver which is tuned to one elevation. In order to study the evolution of the conditional densities a movie was made by choosing the parameters

Q = [0.050  0.025]
    [0.025  0.025],    F = I,    R = 0.01,

and J₀ was chosen as the sum of four Gaussian densities with one cross range ...
