ITERATIVE ESTIMATORS OF PARAMETERS IN A LINEAR MODEL WITH PARTIALLY VARIANT COEFFICIENTS



HU SHAOLIN†*, MEINKE KARL‡, CHEN RUSHAN† and HUAJIANG OUYANG§

† Nanjing University of Science and Technology, 210071, Nanjing, China; [email protected]
‡ School of Computer Science and Communication, Royal Institute of Technology, Stockholm, 100-44, Sweden
§ University of Liverpool, Liverpool, United Kingdom; [email protected]

A new kind of linear model with partially variant coefficients is proposed and a series of iterative algorithms is introduced and verified. The new generalized linear model includes the ordinary linear regression model as a special case. The iterative algorithms effectively overcome some difficulties in computation with multidimensional inputs and iteratively appended parameters. An important application is described at the end of this article, which shows that this new model is reasonable and applicable in practical situations.

Keywords: linear model, parameter estimation, iterative algorithms, variant coefficients

1. Introduction

Over the last century, many statisticians and mathematicians have considered the following kind of linear regression model [1,2]:

$$Y_i = B_i \beta + \varepsilon_i \tag{1}$$

where $Y_i \in R^p$, $B_i \in R^{p \times r}$ $(i = 1,2,\ldots,n)$, the vector $\beta \in R^r$ is a constant parameter vector to be estimated, and $\varepsilon_i \in R^p$ are errors arising from measurement or stochastic noise from disturbances. Many excellent theoretical and practical results have been published on statistical inference and stochastic decisions using this model, and the model has been successfully applied to many kinds of practical engineering problems (see Draper and Smith, 1981; Frank and Harrell, 2002; Graybill and Iyer, 1994; Hu and Sun, 2001).

Further research on this model shows that the restriction to constant coefficients in model (1) is quite strong: there are practical situations in which this linear model cannot be applied (see Brown, 1964; Hu and Sun, 2001). Although there has been a lot of further research to generalize or adapt the linear model (1) (see e.g. Fahrmeier and Tutz, 2001; Dodge and Kova, 2000), the constraint of constant coefficients has so far not been essentially relaxed.

* Corresponding author: Hu Shaolin. Xi'an City, P.O. Box 505#16, Shaanxi, China; Email: [email protected]

In order to overcome this limitation on the generality of model (1), we set up a new linear model with partially variant coefficients as follows:

$$Y_i = A_i X_i + B_i \beta + \varepsilon_i \tag{2}$$

where $Y_i \in R^p$, $A_i \in R^{p \times q}$, $B_i \in R^{p \times r}$, and $\{X_i \in R^q\}$ is a variant vector series. Generally, the dimension p of the measurement output must be larger than the dimension q of the variant coefficients, i.e. p > q, in order to ensure that the structure of the time-variant multidimensional linear system is identifiable. Obviously, the ordinary linear regression model (1) is just a special case of the generalized model (2): if there are no time-variant components, i.e. q = 0, then model (2) reduces to the ordinary linear model (1).

In Section 2, we consider estimating all of the model coefficients in (2) under the Gauss-Markov assumptions (as in Radoslaw and Krzysztof, 1988). A series of iterative algorithms is introduced that allows us to estimate the coefficients, which include the constant parameter vector β ∈ R^r and the variant vector series {X_i ∈ R^q}; the proof of the main result is given in Appendix 1. In Section 3, a practical application is described and some computational results are presented, which show that this new model is useful in practice.
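To fix ideas, the following is a minimal simulation sketch of model (2); the dimensions, noise level and all variable names are our own illustrative assumptions, not values from the paper. Note that p > q, as required for identifiability.

```python
import numpy as np

# Minimal simulation sketch of model (2): Y_i = A_i X_i + B_i beta + eps_i.
# Dimensions and the noise scale are illustrative assumptions only.
rng = np.random.default_rng(0)
p, q, r, n = 4, 2, 3, 60                           # p > q for identifiability

beta = rng.normal(size=r)                          # constant parameter vector
A = [rng.normal(size=(p, q)) for _ in range(n)]    # known variant design blocks
B = [rng.normal(size=(p, r)) for _ in range(n)]    # known constant-part blocks
X = [rng.normal(size=q) for _ in range(n)]         # variant coefficient series
Y = [A[i] @ X[i] + B[i] @ beta + 0.1 * rng.normal(size=p) for i in range(n)]
```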

2. Iterative Estimators of Variant Coefficients

In order to keep the results of this section general, we first assume that the coefficient series elements {X_i ∈ R^q} are unrelated at different sampling points. To simplify the notation, we write $\Phi_n = (\beta^\tau, X_1^\tau, \ldots, X_n^\tau)^\tau \in R^{r+nq}$, where the superscript τ denotes the transpose of a matrix or a vector, and define the matrix

$$H_n = \begin{pmatrix} B_1 & A_1 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ B_n & 0 & \cdots & A_n \end{pmatrix} \in R^{np \times (r+nq)} \tag{3}$$
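A short sketch of how the block matrix (3) can be assembled, continuing the simulation sketch above (the helper name build_H is ours):

```python
# Sketch: assemble H_n of equation (3). The first r columns carry the B_i
# blocks (multiplying beta); the following n blocks of q columns carry the
# A_i blocks (multiplying X_1, ..., X_n respectively).
def build_H(B_blocks, A_blocks, r, q):
    n, p = len(B_blocks), B_blocks[0].shape[0]
    H = np.zeros((n * p, r + n * q))
    for i in range(n):
        H[i*p:(i+1)*p, :r] = B_blocks[i]
        H[i*p:(i+1)*p, r + i*q : r + (i+1)*q] = A_blocks[i]
    return H
```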

We adopt the following three well-known Gauss-Markov assumptions (cf. Radoslaw and Krzysztof, 1988, or Rencher, 2000) on the random errors {ε_i ∈ R^p}: (i) each error ε_i has expected value 0; (ii) the error series values {ε_i, i = 1,2,...} are uncorrelated; and (iii) the error series values {ε_i, i = 1,2,...} are homoscedastic, i.e. they all have the same variance. Under these assumptions, the least squares (LS) estimator of the coefficients in model (2) is

$$\hat\Phi_n^{LS(n)} = \arg\min_{\beta \in R^r,\ \{X_i \in R^q\}} \sum_{i=1}^{n} \left\| Y_i - (A_i X_i + B_i \beta) \right\|^2 \tag{4}$$

and we can directly deduce a compact formula, very similar to the LS estimator of model (1):

$$\hat\Phi_n^{LS(n)} = (H_n^\tau H_n)^{-1} H_n^\tau \mathbb{Y}_n \tag{5}$$

where $\mathbb{Y}_n = (Y_1^\tau, \ldots, Y_n^\tau)^\tau \in R^{np}$. In order to ensure that the matrix $H_n^\tau H_n$ is invertible, the dimensions of model (2) must satisfy the restriction p > q and the sample size must satisfy n > r/(p − q); otherwise, the inverse $(H_n^\tau H_n)^{-1}$ in formula (5) must be replaced by the Moore-Penrose generalized inverse $(H_n^\tau H_n)^{+}$.

Theorem 1 Suppose that the number of sampling points satisfies n > r/(p − q). Then the LS estimators of the coefficients in model (2) can be expressed iteratively by

$$\begin{cases} \hat\beta^{LS(n+1)} = \hat\beta^{LS(n)} + (L_n + B_{n+1}^\tau B_{n+1})^{-1} B_{n+1}^\tau (R_{n+1}^{-1} - \Omega_{n+1}) R_{n+1} (Y_{n+1} - B_{n+1} \hat\beta^{LS(n)}) \\[1mm] \hat X_i^{LS(n+1)} = \hat X_i^{LS(n)} + (A_i^\tau A_i)^{-1} A_i^\tau B_i (\hat\beta^{LS(n)} - \hat\beta^{LS(n+1)}) \quad (i = 1,2,\ldots,n) \\[1mm] \hat X_{n+1}^{LS(n+1)} = (A_{n+1}^\tau R_{n+1} A_{n+1})^{-1} A_{n+1}^\tau R_{n+1} (Y_{n+1} - B_{n+1} \hat\beta^{LS(n)}) \end{cases} \tag{6}$$

where

$$\begin{cases} L_n = \sum_{i=1}^{n} B_i^\tau [I - A_i (A_i^\tau A_i)^{-1} A_i^\tau] B_i \in R^{r \times r} \\[1mm] R_{n+1} = I - B_{n+1} (L_n + B_{n+1}^\tau B_{n+1})^{-1} B_{n+1}^\tau \in R^{p \times p} \\[1mm] \Omega_{n+1} = A_{n+1} (A_{n+1}^\tau R_{n+1} A_{n+1})^{-1} A_{n+1}^\tau \in R^{p \times p} \end{cases}$$

and the superscript in $\hat X_i^{LS(n)}$ denotes the LS estimate of X_i (i = 1,2,...,n) based on the first n samples.

Proof The proof of Theorem 1 is given in Appendix 1.
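One step of the recursion (6) can be sketched as follows (a minimal sketch; the helper names are ours, and gains[i] caches the matrices (A_i^τ A_i)^{-1} A_i^τ B_i so that old data never has to be revisited):

```python
# Sketch of one update step of recursion (6). Assumes L_n and the per-sample
# gain matrices have been maintained from the previous steps, and that the
# matrices to be inverted are invertible, as in Theorem 1.
def update_step(beta_hat, X_hats, gains, L_n, A_new, B_new, Y_new):
    p = A_new.shape[0]
    M = np.linalg.inv(L_n + B_new.T @ B_new)                  # (L_n + B'B)^{-1}
    R = np.eye(p) - B_new @ M @ B_new.T                       # R_{n+1}
    Omega = A_new @ np.linalg.inv(A_new.T @ R @ A_new) @ A_new.T   # Omega_{n+1}
    innov = Y_new - B_new @ beta_hat                          # innovation
    beta_new = beta_hat + M @ B_new.T @ (np.linalg.inv(R) - Omega) @ R @ innov
    X_hats = [x + G @ (beta_hat - beta_new) for x, G in zip(X_hats, gains)]
    X_new = np.linalg.solve(A_new.T @ R @ A_new, A_new.T @ R @ innov)
    G_new = np.linalg.solve(A_new.T @ A_new, A_new.T @ B_new)  # new gain
    L_new = L_n + B_new.T @ B_new - B_new.T @ A_new @ G_new    # extend L_n
    return beta_new, X_hats + [X_new], gains + [G_new], L_new
```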

Obviously, the algorithm (6) is iterative and easy to use in practical engineering applications. It has several notable properties:

- β̂^{LS(n+1)} is a linear combination of the previous estimator β̂^{LS(n)} and the innovation {Y_{n+1} − B_{n+1} β̂^{LS(n)}} contributed by the new sampling datum;
- β̂^{LS(n+1)} can be computed directly from the estimator β̂^{LS(n)} without directly involving the old sampling data {Y_i, i = 1, ..., n} or the estimates {X̂_i^{LS(n)}};
- the estimator X̂_{n+1}^{LS(n+1)} is determined by the innovation {Y_{n+1} − B_{n+1} β̂^{LS(n)}}, which coincides with our understanding of model (2);
- the estimates X̂_i^{LS(n+1)} of X_i (i ≤ n) are successively and accurately adjusted by the estimation error of the constant parameter vector β.

In order to use this iterative algorithm to solve practical problems effectively, the initial estimates for the iterative algorithm (6) must be carefully selected. Generally, the initial estimates can be chosen to be the LS estimators processed in batch as follows:

$$\hat\Phi_{n_0}^{LS(n_0)} = (H_{n_0}^\tau H_{n_0})^{-1} H_{n_0}^\tau \mathbb{Y}_{n_0} \tag{7}$$

where n_0 ∈ N must satisfy the constraint n_0 > r/(p − q). If the disturbance {ε_n ∈ R^p, n ≤ n_0} is a stationary Gaussian white noise process with zero mean, then it can easily be shown that the ordinary LS estimators given in equation (4) are unbiased.
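The initialization (7) followed by the recursion (6) can be combined into one routine; a minimal sketch, reusing build_H and update_step from the sketches above (run_iterative is our hypothetical wrapper name, and n_0 must exceed r/(p − q)):

```python
# Sketch: batch initialization (7) on the first n0 samples, then recursion (6)
# for all later samples.
def run_iterative(A, B, Y, r, q, n0):
    H0 = build_H(B[:n0], A[:n0], r, q)
    phi = np.linalg.solve(H0.T @ H0, H0.T @ np.concatenate(Y[:n0]))  # eq. (7)
    beta_hat = phi[:r]
    X_hats = [phi[r + i*q : r + (i+1)*q] for i in range(n0)]
    gains = [np.linalg.solve(A[i].T @ A[i], A[i].T @ B[i]) for i in range(n0)]
    L_n = sum(B[i].T @ B[i] - B[i].T @ A[i] @ G for i, G in enumerate(gains))
    for i in range(n0, len(Y)):                   # append samples one by one
        beta_hat, X_hats, gains, L_n = update_step(beta_hat, X_hats, gains,
                                                   L_n, A[i], B[i], Y[i])
    return beta_hat, X_hats
```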

Theorem 2 Suppose the LS estimators (7) are chosen as the initial estimates of the coefficients of model (2). If the disturbance {ε_n ∈ R^p, n ∈ N} is a stationary Gaussian white noise process with zero mean, then the iterative estimators (6) are unbiased.

Proof In order to prove Theorem 2, we just need to show that E{β̂^{LS(n)}} = β and E{X̂_i^{LS(n)}} = X_i, where the operator E denotes the mathematical expectation of a stochastic variable. Write $\Xi_n = A_n (A_n^\tau R_n A_n)^{-1} A_n^\tau$ and

$$\tilde L_n = \sum_{i=1}^{n-1} B_i^\tau \left[ I - A_i (A_i^\tau A_i)^{-1} A_i^\tau \right] B_i + B_n^\tau B_n$$

Assuming E{β̂^{LS(n−1)}} = β as the induction hypothesis, we have

$$\begin{aligned} E\{\hat\beta^{LS(n)}\} &= E\{\hat\beta^{LS(n-1)}\} + \tilde L_n^{-1} B_n^\tau \left[ R_n^{-1} - \Xi_n \right] R_n \left( A_n X_n + B_n (\beta - E\{\hat\beta^{LS(n-1)}\}) \right) \\ &= \beta + \tilde L_n^{-1} B_n^\tau \left[ I - A_n (A_n^\tau R_n A_n)^{-1} A_n^\tau R_n \right] A_n X_n = \beta \end{aligned} \tag{8}$$

since $[I - A_n (A_n^\tau R_n A_n)^{-1} A_n^\tau R_n] A_n = A_n - A_n = 0$. Using equation (6), we get

$$E\{\hat X_i^{LS(n)}\} = E\{\hat X_i^{LS(n-1)}\} + (A_i^\tau A_i)^{-1} A_i^\tau B_i \left( E\{\hat\beta^{LS(n-1)}\} - E\{\hat\beta^{LS(n)}\} \right) = X_i + (A_i^\tau A_i)^{-1} A_i^\tau B_i (\beta - \beta) = X_i \quad (i = 1,2,\ldots,n-1) \tag{9}$$

and

$$E\{\hat X_n^{LS(n)}\} = (A_n^\tau R_n A_n)^{-1} A_n^\tau R_n \left( E\{Y_n\} - B_n E\{\hat\beta^{LS(n-1)}\} \right) = (A_n^\tau R_n A_n)^{-1} A_n^\tau R_n A_n X_n = X_n \tag{10}$$

Applying the principle of mathematical induction, the result follows.
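A quick numerical way to see Theorem 2 at work is to average the iterative estimate of β over independent noise realizations; a minimal Monte Carlo sketch, reusing the simulation variables and run_iterative from the sketches above:

```python
# Monte Carlo sketch of Theorem 2: the empirical bias of the iterative
# estimator of beta should be close to zero.
reps, bias = 200, np.zeros(r)
for _ in range(reps):
    Yr = [A[i] @ X[i] + B[i] @ beta + 0.1 * rng.normal(size=p)
          for i in range(n)]
    b, _ = run_iterative(A, B, Yr, r, q, n0=10)
    bias += (b - beta) / reps
print(np.round(bias, 3))    # entries should be near zero
```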

3. A Practical Application

Our new linear model (2) can be used in many different practical fields, e.g. in data fusion, in the modeling and monitoring of computer-controlled systems, in signal processing, and in spacecraft control engineering. In this section, we present an application of model (2) to the computation of a rocket trajectory.

Suppose that there are m transit instruments, suitably located at separate sites, which are used simultaneously to track a payload-carrying rocket M in space. Using these transits, we obtain a series of measurement data {(A_j(t_i), E_j(t_i)) | i = 1,2,...,n; j = 1,2,...,m}, where A_j(t_i) denotes the azimuth and E_j(t_i) the elevation of the rocket M at time t_i with respect to a reference frame fixed at the center of transit instrument j. To simplify the expressions below, we use abbreviated notations such as A_ij = A_j(t_i) and E_ij = E_j(t_i). The error decomposition models used in determining the location of the spacecraft M can be set up as follows (see Brown, 1964; Hu and Sun, 2001):

$$\begin{cases} A_{ij} = \operatorname{tg}^{-1} \dfrac{x - x_{0j}}{y - y_{0j}} + \alpha_{j1} + \alpha_{j3} \operatorname{tg} E_{ij} \sin A_{ij} + \alpha_{j4} \operatorname{tg} E_{ij} \cos A_{ij} + \alpha_{j5} \operatorname{tg} E_{ij} + \alpha_{j6} \sec E_{ij} + \varepsilon_{A_{ij}} \\[2mm] E_{ij} = \operatorname{tg}^{-1} \dfrac{z - z_{0j}}{\left[ (x - x_{0j})^2 + (y - y_{0j})^2 \right]^{1/2}} + \alpha_{j2} + \alpha_{j3} \cos A_{ij} - \alpha_{j4} \sin A_{ij} + \varepsilon_{E_{ij}} \end{cases} \tag{11}$$

where the coefficients (α_{j1}, α_{j2}) are the zero errors of transit instrument j in measuring the azimuth and elevation of the spacecraft, the coefficients (α_{j3}, ..., α_{j6}) are non-orthogonality coefficients representing measurement errors arising from departures from right angles between each pair of the three axes of the measurement equipment (mechanical axis, laser axis and electro-axis), and (ε_A, ε_E) are stochastic errors included in the measurement data.

Assuming that we have a series of imprecise location data P*_i = (x*_i, y*_i, z*_i) for the spacecraft M at the sampling times t_i (i = 1,2,...), what we want to do is to estimate all of the instrument error coefficients as well as the precise location of the spacecraft M. According to the geometrical relationship between the coordinates and the measurement data from the radars, two functions can be set up as follows:

$$f_j(x,y,z) = \operatorname{tg}^{-1} \frac{x - x_{0j}}{y - y_{0j}}, \qquad g_j(x,y,z) = \operatorname{tg}^{-1} \frac{z - z_{0j}}{\left[ (x - x_{0j})^2 + (y - y_{0j})^2 \right]^{1/2}}$$

with Jacobian matrices

$$J_j(P) = \frac{\partial (f_j, g_j)}{\partial (x, y, z)}$$

and a design matrix can be defined as follows:

$$\Theta_{ij} = \begin{pmatrix} 1 & 0 & \operatorname{tg} E_{ij} \sin A_{ij} & \operatorname{tg} E_{ij} \cos A_{ij} & \operatorname{tg} E_{ij} & \sec E_{ij} \\ 0 & 1 & \cos A_{ij} & -\sin A_{ij} & 0 & 0 \end{pmatrix}$$

Then we get the following linearized model:

$$\begin{pmatrix} \Delta\tilde A_{ij} \\ \Delta\tilde E_{ij} \end{pmatrix} = J_j(P)\big|_{P = P_i^*} \begin{pmatrix} \Delta x_i \\ \Delta y_i \\ \Delta z_i \end{pmatrix} + \Theta_{ij} \begin{pmatrix} \alpha_{j1} \\ \vdots \\ \alpha_{j6} \end{pmatrix} + \begin{pmatrix} \varepsilon_{A_{ij}} \\ \varepsilon_{E_{ij}} \end{pmatrix} \quad (j = 1,\ldots,m;\ i = 1,2,\ldots) \tag{12}$$

where ΔÃ_ij = A_ij − f_j(P*_i) and ΔẼ_ij = E_ij − g_j(P*_i). Integrating all of the m instruments, we get an integrated error decomposition model as follows:

$$\begin{pmatrix} \Delta\tilde A_{i1} \\ \Delta\tilde E_{i1} \\ \vdots \\ \Delta\tilde A_{im} \\ \Delta\tilde E_{im} \end{pmatrix} = \begin{pmatrix} J_1(P) \\ \vdots \\ J_m(P) \end{pmatrix}_{P = P_i^*} \begin{pmatrix} \Delta x_i \\ \Delta y_i \\ \Delta z_i \end{pmatrix} + B_i \begin{pmatrix} \alpha_{11} \\ \vdots \\ \alpha_{m6} \end{pmatrix} + \begin{pmatrix} \varepsilon_{A_{i1}} \\ \varepsilon_{E_{i1}} \\ \vdots \\ \varepsilon_{A_{im}} \\ \varepsilon_{E_{im}} \end{pmatrix} \quad (i = 1,2,\ldots) \tag{13}$$

where B_i = diag{Θ_i1, ..., Θ_im}. Obviously, model (13) is very similar to the linear model (2) with partially variant parameters. So we can use the iterative algorithm (6) to calibrate the error coefficients of the transit instruments and, at the same time, accurately determine the trajectory of the rocket in space.

In this case, four transit instruments track a rocket in space. Selecting the computation parameter n_0 = 100 (s), we use the formulae (6) to obtain the modification values. Table 1 gives the estimated values of the error coefficients at 110 seconds; Table 2 gives the modification values of the rocket trajectory after n_0 = 100 (s).
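To make the construction of the regression blocks concrete, here is a small sketch of f_j, g_j and the design matrix Θ_ij following the definitions above (the station coordinates and all helper names are illustrative assumptions; arctan2 handles the quadrant of the inverse tangent):

```python
# Sketch: per-instrument geometry functions f_j, g_j and design block Theta_ij.
def f_g(pos, station):
    x, y, z = pos
    x0, y0, z0 = station
    Az = np.arctan2(x - x0, y - y0)                    # azimuth function f_j
    El = np.arctan2(z - z0, np.hypot(x - x0, y - y0))  # elevation function g_j
    return Az, El

def theta_block(Aij, Eij):
    tE = np.tan(Eij)
    return np.array([
        [1.0, 0.0, tE*np.sin(Aij), tE*np.cos(Aij), tE,  1.0/np.cos(Eij)],
        [0.0, 1.0, np.cos(Aij),    -np.sin(Aij),   0.0, 0.0],
    ])
```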

Table 1 Estimation of Error Coefficients [mrad]

Transit    αj1       αj2       αj3       αj4       αj5       αj6
i=1        1.203     0.686     0.018    -0.006     0.006     0.003
i=2        0.009     0.015     0.001     1.449    -0.024    -0.019
i=3       -0.004    -0.514     0.041    -0.016     1.087    -1.837
i=4        0.000    -0.000     0.001    -0.058    -0.070    -0.000

Table 2 Modification Values of Trajectories [m] (at times t_i = 100 + i s)

i      Δxi          Δyi          Δzi
1     -0.66445      0.631514     0.16216E-2
2     -0.763551     0.687524    -0.654640E-2
3     -0.760541     0.677017    -0.127287E-1
4     -0.752472     0.673932    -0.171894E-1
5     -0.793005     0.699980    -0.238274E-1
6     -0.835997     0.730210    -0.297603E-1
7     -0.832480     0.731190    -0.366932E-1
8     -0.802443     0.710644    -0.378403E-1
9     -0.739471     0.661947    -0.366333E-1
10    -0.739483     0.660732    -0.387798E-1


These computations, together with the results given in Table 1 and Table 2, show that the iterative algorithms given in Section 2 not only decrease the time complexity of the computation but also efficiently improve the precision of the trajectory estimates for a rocket in space. What is more, this practical application shows that the new kind of linear model with variant coefficients is reasonable and useful, not only in theory but also in different engineering fields.

4. Summary and Conclusions

This paper not only presents a new kind of linear model but also introduces a series of convenient algorithms. The new model usefully generalizes the widely used ordinary linear regression model, and it can be applied in many different fields, e.g. in data fusion, in process monitoring, and in control engineering. The advantage of the new algorithms is evident: if we use the batch LS algorithm (5), we must invert a very high-dimensional matrix $(H_n^\tau H_n)^{-1} \in R^{(r+nq) \times (r+nq)}$, whose dimension grows with the number n of samples as the process moves forward in time. On the other hand, if we use the new iterative algorithm (6), we only need to deal with a series of low-dimensional matrix inversions, the largest of which has dimension max{p, q, r}. In fact, the iterative algorithm (6) involves only the three inverse matrices $(A_i^\tau A_i)^{-1} \in R^{q \times q}$, $R_{n+1}^{-1} \in R^{p \times p}$ and $(L_n + B_{n+1}^\tau B_{n+1})^{-1} \in R^{r \times r}$.

Acknowledgements

We gratefully acknowledge partial financial support from the National Nature Science Fund of China (NSFC 90305007), the SI Project (SI-210-05483) of the Swedish Institute, the Jiangsu Nature Science Fund (BK-06200) and the NSFC-RS Joint Project (NSFC-RS/0510881-207043).

References

Brown D.C. (1964): The Error Model Best Estimation Trajectory. AD 602799.
Dodge Y. and Kova J. (2000): Adaptive Regression. Berlin: Springer-Verlag.
Draper N.R. and Smith H. (1981): Applied Regression Analysis. John Wiley & Sons, Inc.
Eubank R., Chunfeng H., Maldonado Y., et al. (2004): Smoothing spline estimation in varying-coefficient models. Journal of the Royal Statistical Society B, Vol. 66, No. 3, pp. 653-667.
Fahrmeier L. and Tutz G. (2001): Multivariate Statistical Modeling Based on Generalized Linear Models. Springer Series in Statistics, Berlin: Springer-Verlag.
Frank E. and Harrell J. (2002): Regression Modeling Strategies with Applications to Linear Models, Logistic Regression, and Survival Analysis. Springer-Verlag New York Inc.
Graybill F.A. and Iyer H.K. (1994): Regression Analysis: Concepts and Applications. CA: Duxbury Press.
Hu Shaolin and Sun Guoji (2001): Process Monitoring Technique and Applications. Beijing: National Defense Industry Press, pp. 68-103.
Radoslaw K. and Krzysztof K. (1988): Recursive Improvement of Estimates in a Gauss-Markov Model with Linear Restrictions. The Canadian Journal of Statistics, Vol. 16, No. 3, pp. 301-305.
Rencher A. (2000): Linear Models in Statistics. John Wiley & Sons, Inc., pp. 121-256.

Appendix 1: Proof of Theorem 1

In order to prove Theorem 1, we make use of two lemmas, fundamental in linear algebra, which we state without proof.

Lemma 1 If a block matrix A and its block A_11, as written below, are invertible, then

$$\begin{pmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{pmatrix}^{-1} = \begin{pmatrix} A_{11}^{-1} & 0 \\ 0 & 0 \end{pmatrix} + \begin{pmatrix} A_{11}^{-1} A_{12} \\ -I \end{pmatrix} \left( A_{22} - A_{21} A_{11}^{-1} A_{12} \right)^{-1} \begin{pmatrix} A_{21} A_{11}^{-1} & -I \end{pmatrix} \tag{a-1}$$

Similarly, if the matrix A and its block A_22 are invertible, then

$$\begin{pmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{pmatrix}^{-1} = \begin{pmatrix} 0 & 0 \\ 0 & A_{22}^{-1} \end{pmatrix} + \begin{pmatrix} -I \\ A_{22}^{-1} A_{21} \end{pmatrix} \left( A_{11} - A_{12} A_{22}^{-1} A_{21} \right)^{-1} \begin{pmatrix} -I & A_{12} A_{22}^{-1} \end{pmatrix} \tag{a-2}$$

Lemma 2 If two matrices F and G are invertible and the inverse matrix $(F - H G^{-1} K)^{-1}$ exists, then

$$(F - H G^{-1} K)^{-1} = F^{-1} + F^{-1} H \left( G - K F^{-1} H \right)^{-1} K F^{-1} \tag{a-3}$$

The proofs of these two lemmas can be found in reference [5].
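Both identities are easy to spot-check numerically; a minimal sketch, with arbitrary assumed sizes:

```python
import numpy as np

# Numerical spot-check of Lemma 1 (identity (a-1)) and Lemma 2 (identity (a-3)).
rng = np.random.default_rng(1)
A11, A12 = rng.normal(size=(3, 3)), rng.normal(size=(3, 2))
A21, A22 = rng.normal(size=(2, 3)), rng.normal(size=(2, 2))
Ablk = np.block([[A11, A12], [A21, A22]])
S = A22 - A21 @ np.linalg.inv(A11) @ A12                 # Schur complement
U = np.vstack([np.linalg.inv(A11) @ A12, -np.eye(2)])
V = np.hstack([A21 @ np.linalg.inv(A11), -np.eye(2)])
rhs = np.block([[np.linalg.inv(A11), np.zeros((3, 2))],
                [np.zeros((2, 3)),   np.zeros((2, 2))]]) + U @ np.linalg.inv(S) @ V
print(np.allclose(np.linalg.inv(Ablk), rhs))             # identity (a-1)

F, G = rng.normal(size=(3, 3)), rng.normal(size=(2, 2))
H, K = rng.normal(size=(3, 2)), rng.normal(size=(2, 3))
lhs = np.linalg.inv(F - H @ np.linalg.inv(G) @ K)
rhs = (np.linalg.inv(F) + np.linalg.inv(F) @ H
       @ np.linalg.inv(G - K @ np.linalg.inv(F) @ H) @ K @ np.linalg.inv(F))
print(np.allclose(lhs, rhs))                             # identity (a-3)
```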

Proof of Theorem 1

With model (2) and n samples, equation (5) shows that the LS estimators are

$$\hat\Phi_n^{LS(n)} = (H_n^\tau H_n)^{-1} H_n^\tau \mathbb{Y}_n$$

If another sampling datum

$$Y_{n+1} = A_{n+1} X_{n+1} + B_{n+1} \beta + \varepsilon_{n+1}$$

is added to the sampling set, then the LS estimators of all the coefficients in model (2) must be modified according to the expression

$$\begin{pmatrix} \hat\Phi_n^{LS(n+1)} \\ \hat X_{n+1}^{LS(n+1)} \end{pmatrix} = \left( \Psi_{n+1}^\tau \Psi_{n+1} \right)^{-1} \Psi_{n+1}^\tau \begin{pmatrix} \mathbb{Y}_n \\ Y_{n+1} \end{pmatrix}, \qquad \Psi_{n+1} = \begin{pmatrix} H_n & 0 \\ C_{n+1} & A_{n+1} \end{pmatrix} \tag{a-4}$$

where $C_{n+1} = (B_{n+1}, 0) \in R^{p \times (r+nq)}$. Using the notations $D_{11} = H_n^\tau H_n + C_{n+1}^\tau C_{n+1}$, $D_{22} = A_{n+1}^\tau A_{n+1}$, $D_{12} = D_{21}^\tau = C_{n+1}^\tau A_{n+1}$ and $\Omega = D_{22} - D_{21} D_{11}^{-1} D_{12}$, the following formula can be derived directly from Lemma 1:

$$\begin{pmatrix} \hat\Phi_n^{LS(n+1)} \\ \hat X_{n+1}^{LS(n+1)} \end{pmatrix} = \left\{ \begin{pmatrix} D_{11}^{-1} & 0 \\ 0 & 0 \end{pmatrix} + \begin{pmatrix} D_{11}^{-1} D_{12} \\ -I \end{pmatrix} \Omega^{-1} \begin{pmatrix} D_{21} D_{11}^{-1} & -I \end{pmatrix} \right\} \begin{pmatrix} H_n^\tau \mathbb{Y}_n + C_{n+1}^\tau Y_{n+1} \\ A_{n+1}^\tau Y_{n+1} \end{pmatrix} = E_n H_n^\tau \mathbb{Y}_n + F_n Y_{n+1} \tag{a-5}$$

where

$$E_n = \begin{pmatrix} D_{11}^{-1} + D_{11}^{-1} D_{12} \Omega^{-1} D_{21} D_{11}^{-1} \\ -\Omega^{-1} D_{21} D_{11}^{-1} \end{pmatrix}, \qquad F_n = \begin{pmatrix} D_{11}^{-1} \left[ C_{n+1}^\tau + D_{12} \Omega^{-1} D_{21} D_{11}^{-1} C_{n+1}^\tau - D_{12} \Omega^{-1} A_{n+1}^\tau \right] \\ -\Omega^{-1} \left[ D_{21} D_{11}^{-1} C_{n+1}^\tau - A_{n+1}^\tau \right] \end{pmatrix}$$

[Step 1] We first analyze the expression E_n. From the expression for the block matrix D_11 we have

$$D_{11}^{-1} = \begin{pmatrix} \sum_{i=1}^{n+1} B_i^\tau B_i & B_1^\tau A_1 & \cdots & B_n^\tau A_n \\ A_1^\tau B_1 & A_1^\tau A_1 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ A_n^\tau B_n & 0 & \cdots & A_n^\tau A_n \end{pmatrix}^{-1}$$

and, by Lemma 1, the first r columns of the matrix $D_{11}^{-1}$ can be expressed as follows:

$$D_{11}^{-1} = \begin{pmatrix} \left( \sum_{i=1}^{n+1} B_i^\tau B_i - \sum_{i=1}^{n} B_i^\tau U_i A_i^\tau B_i \right)^{-1} & * \\ -T_n \left( \sum_{i=1}^{n+1} B_i^\tau B_i - \sum_{i=1}^{n} B_i^\tau U_i A_i^\tau B_i \right)^{-1} & * \end{pmatrix} = \begin{pmatrix} \tilde L_{n+1}^{-1} & * \\ -T_n \tilde L_{n+1}^{-1} & * \end{pmatrix} \tag{a-6}$$

where $U_i = A_i (A_i^\tau A_i)^{-1}$, $T_n = (B_1^\tau U_1, \ldots, B_n^\tau U_n)^\tau$,

$$\tilde L_{n+1} = \sum_{i=1}^{n} B_i^\tau [I - U_i A_i^\tau] B_i + B_{n+1}^\tau B_{n+1} = L_n + B_{n+1}^\tau B_{n+1}$$

and the asterisk "*" denotes an omitted matrix block which is complicated but has no effect on the following deduction. After analyzing the formulae for the matrix blocks D_22 and $D_{12} = D_{21}^\tau$, and applying Lemma 2, we have the equation

$$D_{11}^{-1} + D_{11}^{-1} D_{12} \Omega^{-1} D_{21} D_{11}^{-1} = \left\{ I + D_{11}^{-1} C_{n+1}^\tau V_{n+1} C_{n+1} \right\} D_{11}^{-1} \tag{a-7}$$

where $V_{n+1} = A_{n+1} \left( A_{n+1}^\tau A_{n+1} - A_{n+1}^\tau C_{n+1} D_{11}^{-1} C_{n+1}^\tau A_{n+1} \right)^{-1} A_{n+1}^\tau$, and the equation

$$D_{11}^{-1} C_{n+1}^\tau V_{n+1} C_{n+1} = D_{11}^{-1} \begin{pmatrix} B_{n+1}^\tau A_{n+1} \tilde W_{n+1}^{-1} A_{n+1}^\tau B_{n+1} & 0 \\ 0 & 0 \end{pmatrix} = \begin{pmatrix} \tilde L_{n+1}^{-1} B_{n+1}^\tau A_{n+1} \tilde W_{n+1}^{-1} A_{n+1}^\tau B_{n+1} & 0 \\ -T_n \tilde L_{n+1}^{-1} B_{n+1}^\tau A_{n+1} \tilde W_{n+1}^{-1} A_{n+1}^\tau B_{n+1} & 0 \end{pmatrix} \tag{a-8}$$

holds, where $\tilde W_{n+1} = A_{n+1}^\tau (I - B_{n+1} \tilde L_{n+1}^{-1} B_{n+1}^\tau) A_{n+1}$. Obviously, by Lemma 2, the matrix $D_{11}^{-1}$ can also be expressed as follows:

$$D_{11}^{-1} = (H_n^\tau H_n)^{-1} - (H_n^\tau H_n)^{-1} C_{n+1}^\tau \left[ I + C_{n+1} (H_n^\tau H_n)^{-1} C_{n+1}^\tau \right]^{-1} C_{n+1} (H_n^\tau H_n)^{-1} \tag{a-9}$$

Using the notation $L_n = \sum_{i=1}^{n} B_i^\tau [I - A_i (A_i^\tau A_i)^{-1} A_i^\tau] B_i$, we have

$$(H_n^\tau H_n)^{-1} = \begin{pmatrix} \sum_{i=1}^{n} B_i^\tau B_i & B_1^\tau A_1 & \cdots & B_n^\tau A_n \\ A_1^\tau B_1 & A_1^\tau A_1 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ A_n^\tau B_n & 0 & \cdots & A_n^\tau A_n \end{pmatrix}^{-1} = \begin{pmatrix} L_n^{-1} & -L_n^{-1} T_n^\tau \\ -T_n L_n^{-1} & * \end{pmatrix} \tag{a-10}$$

so we get

$$D_{11}^{-1} = \left\{ I - \begin{pmatrix} L_n^{-1} \tilde Z_{n+1} & 0 \\ -T_n L_n^{-1} \tilde Z_{n+1} & 0 \end{pmatrix} \right\} (H_n^\tau H_n)^{-1} \tag{a-11}$$

where $\tilde Z_{n+1} = B_{n+1}^\tau (I + B_{n+1} L_n^{-1} B_{n+1}^\tau)^{-1} B_{n+1}$. On the other hand, the matrix $\Omega^{-1} D_{21} = (D_{22} - D_{21} D_{11}^{-1} D_{12})^{-1} D_{21}$ can be simplified as follows:

$$\Omega^{-1} D_{21} = \left\{ A_{n+1}^\tau (I - C_{n+1} D_{11}^{-1} C_{n+1}^\tau) A_{n+1} \right\}^{-1} A_{n+1}^\tau C_{n+1} = \left\{ A_{n+1}^\tau (I - B_{n+1} \tilde L_{n+1}^{-1} B_{n+1}^\tau) A_{n+1} \right\}^{-1} A_{n+1}^\tau (B_{n+1}, 0, \ldots, 0) \tag{a-12}$$

Letting $Q_{n+1} = \left\{ A_{n+1}^\tau (I - B_{n+1} \tilde L_{n+1}^{-1} B_{n+1}^\tau) A_{n+1} \right\}^{-1} A_{n+1}^\tau B_{n+1}$ and $P_{n+1} = B_{n+1}^\tau A_{n+1} Q_{n+1}$ (so that $B_{n+1}^\tau A_{n+1} \tilde W_{n+1}^{-1} A_{n+1}^\tau B_{n+1} = P_{n+1}$), the following formula can be obtained from (a-7), (a-8) and (a-11):

$$D_{11}^{-1} + D_{11}^{-1} D_{12} \Omega^{-1} D_{21} D_{11}^{-1} = \left\{ I + \begin{pmatrix} \tilde L_{n+1}^{-1} P_{n+1} & 0 \\ -T_n \tilde L_{n+1}^{-1} P_{n+1} & 0 \end{pmatrix} \right\} \cdot \left\{ I - \begin{pmatrix} L_n^{-1} \tilde Z_{n+1} & 0 \\ -T_n L_n^{-1} \tilde Z_{n+1} & 0 \end{pmatrix} \right\} (H_n^\tau H_n)^{-1} \tag{a-13}$$

Furthermore,

$$\Omega^{-1} D_{21} D_{11}^{-1} = (Q_{n+1}, 0_{q \times nq}) \left\{ I - \begin{pmatrix} L_n^{-1} \tilde Z_{n+1} & 0 \\ -T_n L_n^{-1} \tilde Z_{n+1} & 0 \end{pmatrix} \right\} (H_n^\tau H_n)^{-1} = \left( Q_{n+1} [I - L_n^{-1} \tilde Z_{n+1}] \;\; 0 \right) (H_n^\tau H_n)^{-1} \tag{a-14}$$

Combining the equations (a-5), (a-13) and (a-14), we get an expression for the matrix E_n as follows:

$$E_n = \begin{pmatrix} I + \tilde L_{n+1}^{-1} P_{n+1} (I - \tilde E_n) - \tilde E_n & 0 \\ -T_n [\tilde L_{n+1}^{-1} P_{n+1} (I - \tilde E_n) - \tilde E_n] & I \\ -Q_{n+1} (I - \tilde E_n) & 0 \end{pmatrix} (H_n^\tau H_n)^{-1} \tag{a-15}$$

where $\tilde E_n = L_n^{-1} \tilde Z_{n+1}$. Using the formula

$$\tilde L_{n+1}^{-1} = (L_n + B_{n+1}^\tau B_{n+1})^{-1} = L_n^{-1} - L_n^{-1} \tilde Z_{n+1} L_n^{-1}$$

we get

$$I - \tilde E_n = \tilde L_{n+1}^{-1} L_n \tag{a-16}$$

So the equation (a-15) can be expressed as

$$E_n = \begin{pmatrix} \tilde L p_{n+1} + \tilde L_{n+1}^{-1} L_n & 0 \\ -T_n [\tilde L p_{n+1} + \tilde L_{n+1}^{-1} L_n - I] & I \\ -Q_{n+1} \tilde L_{n+1}^{-1} L_n & 0 \end{pmatrix} (H_n^\tau H_n)^{-1} \tag{a-17}$$

where $\tilde L p_{n+1} = \tilde L_{n+1}^{-1} P_{n+1} \tilde L_{n+1}^{-1} L_n$.

[Step 2] We next analyze the expression for F_n. From the expression (a-6), we have

$$D_{12} \Omega^{-1} D_{21} D_{11}^{-1} C_{n+1}^\tau = C_{n+1}^\tau A_{n+1} \Omega^{-1} A_{n+1}^\tau C_{n+1} D_{11}^{-1} C_{n+1}^\tau = \begin{pmatrix} B_{n+1}^\tau A_{n+1} \Omega^{-1} A_{n+1}^\tau B_{n+1} \tilde L_{n+1}^{-1} B_{n+1}^\tau \\ 0 \end{pmatrix} \tag{a-18}$$

So we have the following four equations:

$$D_{11}^{-1} \left[ D_{12} \Omega^{-1} D_{21} D_{11}^{-1} C_{n+1}^\tau \right] = \begin{pmatrix} \tilde L_{n+1}^{-1} & -\tilde L_{n+1}^{-1} T_n^\tau \\ -T_n \tilde L_{n+1}^{-1} & * \end{pmatrix} \begin{pmatrix} M_{n+1} \\ 0 \end{pmatrix} = \begin{pmatrix} \tilde L_{n+1}^{-1} M_{n+1} \\ -T_n \tilde L_{n+1}^{-1} M_{n+1} \end{pmatrix} \tag{a-19}$$

$$D_{11}^{-1} C_{n+1}^\tau = \begin{pmatrix} \tilde L_{n+1}^{-1} B_{n+1}^\tau \\ -T_n \tilde L_{n+1}^{-1} B_{n+1}^\tau \end{pmatrix} \tag{a-20}$$

$$D_{11}^{-1} D_{12} \Omega^{-1} A_{n+1}^\tau = D_{11}^{-1} C_{n+1}^\tau A_{n+1} \Omega^{-1} A_{n+1}^\tau = \begin{pmatrix} \tilde L_{n+1}^{-1} B_{n+1}^\tau A_{n+1} \Omega^{-1} A_{n+1}^\tau \\ -T_n \tilde L_{n+1}^{-1} B_{n+1}^\tau A_{n+1} \Omega^{-1} A_{n+1}^\tau \end{pmatrix} \tag{a-21}$$

$$\Omega^{-1} D_{21} D_{11}^{-1} C_{n+1}^\tau = \Omega^{-1} A_{n+1}^\tau B_{n+1} \tilde L_{n+1}^{-1} B_{n+1}^\tau \tag{a-22}$$

where $M_{n+1} = B_{n+1}^\tau A_{n+1} \Omega^{-1} A_{n+1}^\tau B_{n+1} \tilde L_{n+1}^{-1} B_{n+1}^\tau$. Inserting equations (a-19), ..., (a-22) into the expression for F_n defined in equation (a-5), we immediately have

$$F_n = \begin{pmatrix} \tilde L_{n+1}^{-1} (B_{n+1}^\tau + M_{n+1} - B_{n+1}^\tau A_{n+1} \Omega^{-1} A_{n+1}^\tau) \\ -T_n \tilde L_{n+1}^{-1} (B_{n+1}^\tau + M_{n+1} - B_{n+1}^\tau A_{n+1} \Omega^{-1} A_{n+1}^\tau) \\ -\Omega^{-1} (A_{n+1}^\tau B_{n+1} \tilde L_{n+1}^{-1} B_{n+1}^\tau - A_{n+1}^\tau) \end{pmatrix} \tag{a-23}$$

And, after considering the definition of Ω and equation (a-16), we have

$$\Omega = A_{n+1}^\tau A_{n+1} - A_{n+1}^\tau C_{n+1} D_{11}^{-1} C_{n+1}^\tau A_{n+1} = A_{n+1}^\tau (I - B_{n+1} \tilde L_{n+1}^{-1} B_{n+1}^\tau) A_{n+1}$$

so that

$$\Omega^{-1} = \left[ A_{n+1}^\tau R_{n+1} A_{n+1} \right]^{-1} \tag{a-24}$$

where $R_{n+1} = I - B_{n+1} \tilde L_{n+1}^{-1} B_{n+1}^\tau$, which coincides with the matrix R_{n+1} of Theorem 1 since $\tilde L_{n+1} = L_n + B_{n+1}^\tau B_{n+1}$. Inserting equations (a-18) and (a-24) into equation (a-23), we have

$$F_n = \begin{pmatrix} \tilde L_{n+1}^{-1} B_{n+1}^\tau \left[ I - A_{n+1} (A_{n+1}^\tau R_{n+1} A_{n+1})^{-1} A_{n+1}^\tau R_{n+1} \right] \\ -T_n \tilde L_{n+1}^{-1} B_{n+1}^\tau \left[ I - A_{n+1} (A_{n+1}^\tau R_{n+1} A_{n+1})^{-1} A_{n+1}^\tau R_{n+1} \right] \\ (A_{n+1}^\tau R_{n+1} A_{n+1})^{-1} A_{n+1}^\tau R_{n+1} \end{pmatrix} \tag{a-25}$$

[Step 3] From Step 1 and Step 2, substituting equations (a-17) and (a-25) into equation (a-5), and noting that $H_n^\tau \mathbb{Y}_n = (H_n^\tau H_n) \hat\Phi_n^{LS(n)}$, we immediately get the following expression:

$$\begin{pmatrix} \hat\Phi_n^{LS(n+1)} \\ \hat X_{n+1}^{LS(n+1)} \end{pmatrix} = Ma_{n+1} \hat\Phi_n^{LS(n)} + Mb_{n+1} Y_{n+1} \tag{a-26}$$

where

$$Ma_{n+1} = \begin{pmatrix} I + \tilde L_{n+1}^{-1} P_{n+1} \tilde L_{n+1}^{-1} L_n - \tilde L_{n+1}^{-1} B_{n+1}^\tau B_{n+1} & 0 \\ -T_n (\tilde L_{n+1}^{-1} P_{n+1} \tilde L_{n+1}^{-1} L_n - \tilde L_{n+1}^{-1} B_{n+1}^\tau B_{n+1}) & I \\ -Q_{n+1} \tilde L_{n+1}^{-1} L_n & 0 \end{pmatrix}$$

and $Mb_{n+1} = F_n$ as given in (a-25). If we let $\Xi_{n+1} = A_{n+1} (A_{n+1}^\tau R_{n+1} A_{n+1})^{-1} A_{n+1}^\tau$ (which is exactly $\Omega_{n+1}$ in Theorem 1, and satisfies $P_{n+1} = B_{n+1}^\tau \Xi_{n+1} B_{n+1}$), then we have

$$P_{n+1} \tilde L_{n+1}^{-1} L_n \hat\beta^{LS(n)} - B_{n+1}^\tau A_{n+1} (A_{n+1}^\tau R_{n+1} A_{n+1})^{-1} A_{n+1}^\tau R_{n+1} Y_{n+1} = B_{n+1}^\tau \Xi_{n+1} \left[ B_{n+1} \tilde L_{n+1}^{-1} L_n \hat\beta^{LS(n)} - R_{n+1} Y_{n+1} \right]$$

and

$$B_{n+1} \tilde L_{n+1}^{-1} L_n \hat\beta^{LS(n)} - R_{n+1} Y_{n+1} = B_{n+1} (I - \tilde L_{n+1}^{-1} B_{n+1}^\tau B_{n+1}) \hat\beta^{LS(n)} - (I - B_{n+1} \tilde L_{n+1}^{-1} B_{n+1}^\tau) Y_{n+1} = -R_{n+1} (Y_{n+1} - B_{n+1} \hat\beta^{LS(n)})$$

So, decomposing the matrix equation (a-26) into appropriate blocks, the first block row gives

$$\begin{aligned} \hat\beta^{LS(n+1)} &= \hat\beta^{LS(n)} + \tilde L_{n+1}^{-1} \left[ P_{n+1} \tilde L_{n+1}^{-1} L_n \hat\beta^{LS(n)} - B_{n+1}^\tau \Xi_{n+1} R_{n+1} Y_{n+1} \right] + \tilde L_{n+1}^{-1} B_{n+1}^\tau (Y_{n+1} - B_{n+1} \hat\beta^{LS(n)}) \\ &= \hat\beta^{LS(n)} + \left[ \tilde L_{n+1}^{-1} B_{n+1}^\tau - \tilde L_{n+1}^{-1} B_{n+1}^\tau \Xi_{n+1} R_{n+1} \right] (Y_{n+1} - B_{n+1} \hat\beta^{LS(n)}) \\ &= \hat\beta^{LS(n)} + \tilde L_{n+1}^{-1} B_{n+1}^\tau \left[ R_{n+1}^{-1} - \Xi_{n+1} \right] R_{n+1} (Y_{n+1} - B_{n+1} \hat\beta^{LS(n)}) \end{aligned} \tag{a-27}$$

and the middle block row gives

$$\begin{pmatrix} \hat X_1^{LS(n+1)} \\ \vdots \\ \hat X_n^{LS(n+1)} \end{pmatrix} = \begin{pmatrix} \hat X_1^{LS(n)} \\ \vdots \\ \hat X_n^{LS(n)} \end{pmatrix} - \begin{pmatrix} (A_1^\tau A_1)^{-1} A_1^\tau B_1 \\ \vdots \\ (A_n^\tau A_n)^{-1} A_n^\tau B_n \end{pmatrix} \tilde L_{n+1}^{-1} B_{n+1}^\tau \left( R_{n+1}^{-1} - \Xi_{n+1} \right) R_{n+1} (Y_{n+1} - B_{n+1} \hat\beta^{LS(n)}) \tag{a-28}$$

From the formulae (a-27) and (a-28), for i = 1,2,...,n, we have

$$\hat X_i^{LS(n+1)} = \hat X_i^{LS(n)} + (A_i^\tau A_i)^{-1} A_i^\tau B_i (\hat\beta^{LS(n)} - \hat\beta^{LS(n+1)}) \tag{a-29}$$

Finally, since $B_{n+1} \tilde L_{n+1}^{-1} L_n = R_{n+1} B_{n+1}$, the last block row gives

$$\begin{aligned} \hat X_{n+1}^{LS(n+1)} &= -Q_{n+1} \tilde L_{n+1}^{-1} L_n \hat\beta^{LS(n)} + (A_{n+1}^\tau R_{n+1} A_{n+1})^{-1} A_{n+1}^\tau R_{n+1} Y_{n+1} \\ &= (A_{n+1}^\tau R_{n+1} A_{n+1})^{-1} A_{n+1}^\tau \left( R_{n+1} Y_{n+1} - R_{n+1} B_{n+1} \hat\beta^{LS(n)} \right) \\ &= (A_{n+1}^\tau R_{n+1} A_{n+1})^{-1} A_{n+1}^\tau R_{n+1} \left( Y_{n+1} - B_{n+1} \hat\beta^{LS(n)} \right) \end{aligned} \tag{a-30}$$

Combining the equations (a-27), ..., (a-30), the result of Theorem 1 is obtained.
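As an independent numerical check of Theorem 1's algebra, the recursive estimates can be compared against the batch solution (5); a minimal sketch, reusing the simulation data and the helpers build_H and run_iterative from the sketches in Section 2:

```python
# Sketch: on the full sample, the recursion (6) and the batch estimator (5)
# should agree up to floating-point error.
beta_rec, X_rec = run_iterative(A, B, Y, r, q, n0=10)
H = build_H(B, A, r, q)
phi_batch = np.linalg.solve(H.T @ H, H.T @ np.concatenate(Y))
print(np.allclose(phi_batch[:r], beta_rec))              # constant part beta
print(np.allclose(phi_batch[r:], np.concatenate(X_rec))) # variant parts X_i
```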
