On Consistency of Subspace Methods for System Identification ⋆

Magnus Jansson and Bo Wahlberg
S3, Automatic Control, Royal Institute of Technology (KTH), SE-100 44 Stockholm, Sweden
Conditions for consistency for a certain class of subspace system identification methods are given. It is shown that persistence of excitation of the input signal is, in general, not sufficient. Stronger sufficient conditions are presented.

Abstract

Subspace methods for identification of linear time-invariant dynamical systems typically consist of two main steps. First, a so-called subspace estimate is constructed. This first step usually consists of estimating the range space of the extended observability matrix. Secondly, an estimate of system parameters is obtained, based on the subspace estimate. In this paper, the consistency of a large class of methods for estimating the extended observability matrix is analyzed. Persistence of excitation conditions on the input signal are given which guarantee consistent estimates for systems with only measurement noise. For systems with process noise, it is shown that a persistence of excitation condition on the input is not sufficient. More precisely, an example for which the subspace methods fail to give a consistent estimate of the transfer function is given. This failure occurs even if the input is persistently exciting of any order. It is also shown that this problem can be eliminated if stronger conditions on the input signal are imposed.

Key words: Consistency, subspace methods, identification, performance analysis, linear systems.
⋆ Parts of this paper have been presented at the IFAC'96 World Congress, San Francisco, USA, and at SYSID'97, Fukuoka, Japan. Corresponding author: Magnus Jansson, tel. +46 8 7907425, fax +46 8 7907329, e-mail: [email protected].
Preprint submitted to Elsevier Preprint
29 May 1998
1 Introduction

The large interest in subspace methods for system identification is motivated by the need for useful engineering tools to model linear multivariable dynamical systems using experimental data [18,11,23,25]. These methods, sometimes referred to as 4SID (subspace state-space system identification) methods, are easy to apply to identification of multivariable systems. Reliable numerical algorithms form the basis of these methods. The statistical properties, however, are not yet fully understood. Initial results in this direction are reported in [26,14,8,27,2]. Conditions for consistency of the 4SID methods have often involved assumptions on the rank of certain matrices used in the schemes. It would be more useful to have explicit conditions on the input signals and on the system. The consistency of combined deterministic and stochastic subspace identification methods has been analyzed in [14]. In that paper, consistency is proven under two assumptions. One is the assumption that the input is an ARMA process. Secondly, it is assumed that a dimension parameter tends to infinity at a certain rate as the number of samples tends to infinity. This assumption is needed to consistently estimate the stochastic sub-system and the system order. However, subspace methods are typically applied with fixed dimension parameters. A consistency analysis under these conditions is given in [9,10]. The analysis is further extended in the current paper.

The analysis focuses on the crucial first step of the subspace identification methods. That is, the focus is on estimating the extended observability matrix, or the "subspace estimation" step. The methods analyzed herein are the so-called Basic-4SID and IV-4SID. The Basic-4SID method [4,5,12,21] has been analyzed in [6,26,12,24]. Herein, we extend the analysis and give persistence of excitation conditions on the input signal which guarantee a consistent estimate of the extended observability matrix.
The IV-4SID class of methods includes N4SID [18], MOESP [23] and CVA [11]. An open question regarding the IV-4SID methods has been whether or not a persistence of excitation condition on the input alone is sufficient for consistency. The current contribution employs a counter-example to show that this is not the case. For systems with only measurement noise, however, it is possible to show that a persistently exciting input of a given order ensures consistency of the IV-4SID subspace estimate. For systems with process noise, IV-4SID methods may suffer from consistency problems. This is explicitly shown in the aforementioned counter-example. A simple proof shows that a white noise or a low order ARMA input signal guarantees consistency in this case.

The remainder of the paper is organized as follows. In Section 2, the system model is defined and the general assumptions are stated. Some preliminary results are also collected in this section. In Section 3, the underlying ideas of the Basic-4SID and IV-4SID methods are presented. In Section 4, the consistency of the Basic-4SID method is analyzed. In Section 5, the critical relation for consistency of IV-4SID methods is presented. Conditions for this relation to hold are given. In Section 6, the counter-example is given, describing a system for which the critical relation does not hold. In Section 7, some simulations are presented to illustrate the theoretical results. Finally, conclusions are provided in Section 8.
2 Preliminaries

Assume that the true system is linear, time-invariant and described by the state-space equations
x(t + 1) = A x(t) + B u(t) + w(t)          (1a)
y(t) = C x(t) + D u(t) + v(t).          (1b)
Here, x(t) ∈ R^n is the state vector, u(t) ∈ R^m consists of the observed input signals, and y(t) ∈ R^l contains the observed output signals. The system is corrupted by the process noise w(t) ∈ R^n and the measurement noise v(t) ∈ R^l.
In the remainder of this paper we will consider the system (1) in its innovations representation
x(t + 1) = A x(t) + B u(t) + K e(t)          (2a)
y(t) = C x(t) + D u(t) + e(t)          (2b)
where e(t) ∈ R^l is the innovation process. Let us introduce the following general assumptions.
A1: The innovation process e(t) is a stationary, zero-mean, white noise process with second order moments

E{e(t) e^T(s)} = r_e δ_{t,s}

where δ_{t,s} is the Kronecker delta.

A2: The input u(t) is a quasi-stationary signal [13]. The correlation function is defined by

r_u(τ) = Ē{u(t + τ) u^T(t)}

where the operator Ē is defined as

Ē{·} = lim_{N→∞} (1/N) Σ_{t=1}^{N} E{·}.

The input u(t) and the innovation process e(t) are assumed to be uncorrelated: Ē{u(t) e^T(s)} = 0 for all t and s.

A3: The system (2) is asymptotically stable, i.e., the eigenvalues of A lie strictly inside the unit circle. The eigenvalues of the matrix A − KC are inside the unit circle.

A4: The description (2) is minimal in the sense that the pair (A, C) is observable and the pair (A, [B K]) is reachable.

The concept of persistently exciting signals is defined in the following way:
Definition 1 A (quasi-stationary) signal u(t) is said to be persistently exciting of order n, denoted by PE(n), if the following matrix is positive definite:

[ r_u(0)    r_u(1)    ...  r_u(n−1)
  r_u(−1)   r_u(0)    ...  r_u(n−2)
  ...                 ...  ...
  r_u(1−n)  r_u(2−n)  ...  r_u(0)   ].
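In practice the correlation sequence is unknown, and the test matrix of Definition 1 must be formed from sample averages. The following sketch (our own illustration, not part of the paper; Python, scalar input, with illustrative signal choices) checks that a single sinusoid is persistently exciting of order 2 but not of order 3:

```python
import numpy as np

def pe_matrix(u, order):
    """Sample estimate of the PE test matrix of Definition 1 (scalar u)."""
    N = len(u) - order
    r = [u[k:k + N] @ u[:N] / N for k in range(order)]   # r_u(k), k = 0..order-1
    # For a scalar real signal, r_u(-k) = r_u(k).
    return np.array([[r[abs(i - j)] for j in range(order)] for i in range(order)])

t = np.arange(200_000)
sine = np.sin(1.0 * t)   # a single sinusoid: PE(2) but not PE(3)
min_eig_2 = np.linalg.eigvalsh(pe_matrix(sine, 2)).min()
min_eig_3 = np.linalg.eigvalsh(pe_matrix(sine, 3)).min()
```

The 2 × 2 test matrix is well conditioned, while the 3 × 3 one is (numerically) singular, in agreement with the definition.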
For easy reference, the following two well known lemmas are given. They will be useful in the consistency analysis.
Lemma 2 Consider the (block) matrix

S = [ A  B
      C  D ]          (3)

where A is n × n and D is a non-singular m × m matrix. Then

rank{S} = m + rank{A − B D^{−1} C}.
PROOF. The result follows directly from the identity

[ I  −B D^{−1}   [ A  B    [ I         0    [ Δ  0
  0   I        ]   C  D ]   −D^{−1}C   I ] =  0  D ]          (4)

where Δ = A − B D^{−1} C is the Schur complement of D in S.
Using (4), the next lemma is readily proven.
Lemma 3 Let S, as defined in (3), be symmetric (C = B^T), and let D be positive definite. Then S is positive (semi) definite if and only if (A − B D^{−1} B^T) is positive (semi) definite.
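The rank identity of Lemma 2 is easy to check numerically. The sketch below (our own illustration; the matrices are randomly generated, with A chosen so that the Schur complement has rank one) verifies it on a 7 × 7 example:

```python
import numpy as np

rng = np.random.default_rng(1)
n, m = 4, 3
B = rng.standard_normal((n, m))
C = rng.standard_normal((m, n))
D = rng.standard_normal((m, m)) + 5 * np.eye(m)   # comfortably non-singular
# Choose A so that the Schur complement A - B D^{-1} C has rank exactly 1.
A = B @ np.linalg.solve(D, C) + np.outer(rng.standard_normal(n), rng.standard_normal(n))

S = np.block([[A, B], [C, D]])
schur = A - B @ np.linalg.solve(D, C)             # rank-1 by construction
rank_S = np.linalg.matrix_rank(S)
rank_schur = np.linalg.matrix_rank(schur)
```

With a full-rank Schur complement the identity is trivially satisfied; the rank-deficient construction makes the test non-trivial: rank{S} = m + 1 here.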
3 Background
In this section, the basic equations used in subspace state-space system identification (4SID) methods are reviewed. Define the lα × 1 vector of stacked outputs as

y_α(t) = [y^T(t)  y^T(t+1)  ...  y^T(t+α−1)]^T

where α is a user-specified parameter which is required to be greater than the observability index (or, for simplicity, the system order n). In a similar manner, define vectors made of stacked inputs and innovations as u_α(t) and e_α(t), respectively. By iteration of the state equations in (2), it is straightforward to verify the following equation for the stacked quantities:

y_α(t) = Γ_α x(t) + Φ_α u_α(t) + n_α(t)          (5)
where n_α(t) = Ψ_α e_α(t), and where the following block matrices have been introduced:

Γ_α = [ C
        CA
        ...
        CA^{α−1} ]

Φ_α = [ D           0           ...  0
        CB          D           ...  0
        ...                     ...  ...
        CA^{α−2}B   CA^{α−3}B   ...  D ]

Ψ_α = [ I           0           ...  0
        CK          I           ...  0
        ...                     ...  ...
        CA^{α−2}K   CA^{α−3}K   ...  I ].
The key idea of subspace identification is to first estimate the n-dimensional range space of the (extended) observability matrix Γ_α. The system parameters are then estimated in different ways. For consistency of 4SID methods it is crucial that the estimate of Γ_α is consistent. There have been several suggestions in the literature for estimating Γ_α. Most of them, however, are closely related to one another [20,25]. To describe the methods considered, some additional notation is needed. Assume there are N + α + β − 1 data samples available and introduce the block Hankel matrices

Y = [y_α(1+β)  ...  y_α(N+β)]          (6)
Ȳ = [y_β(1)  ...  y_β(N)].          (7)
The parameter β is to be chosen by the user. Intuitively, β can be interpreted as the dimension of the observer used to estimate the current state [28]. The partitioning of data into Ȳ and Y is sometimes referred to as "past" and "future". Define U and Ū as block Hankel matrices of inputs similar to (6) and (7), respectively. Comparing (6) and (5), it is clear that Y is given by the relation

Y = Γ_α X_f + Φ_α U + N          (8)

where

X_f = [x(1+β)  x(2+β)  ...  x(N+β)]

and N is constructed from n_α(t) similar to (6). Two methods of estimating the range space of Γ_α from data are presented next. The methods are based on relation (8).

3.1 Basic-4SID
The method described in this section is sometimes referred to as the Basic-4SID method [4,5,12,21]. When considering this method, it is not necessary to partition the data into past and future. Hence, it can be assumed that β = 0. The idea is to study the residuals in the regression of y_α(t) on u_α(t) in (5). Introduce the orthogonal projection matrix onto the null space of U as

Π⊥_{U^T} = I − U^T (U U^T)^{−1} U.

If the input u(t) is PE(α), then the inverse of U U^T exists for sufficiently large N. The residual vectors in the regression of y_α(t) on u_α(t) are now given by the columns of the following expression:

(1/√N) Y Π⊥_{U^T} = (1/√N) Γ_α X_f Π⊥_{U^T} + (1/√N) N Π⊥_{U^T}          (9)

where the normalizing factor 1/√N has been introduced for later convenience. The Basic-4SID estimate of Γ_α is then given by

Γ̂_α = Q̂_s          (10)

where Q̂_s is obtained from the singular value decomposition (SVD)

[Q̂_s  Q̂_n] [ Ŝ_s  0     [ V̂_s^T
             0    Ŝ_n ]   V̂_n^T ]  =  (1/√N) Y Π⊥_{U^T}.
Here, Ŝ_s is a diagonal matrix containing the n largest singular values and Q̂_s contains the corresponding left singular vectors.

3.2 IV-4SID
The second type of estimator that will be analyzed is called IV-4SID [18,11,23]. The reason for this name is that this approach is most easily explained in terms of instrumental variables (IVs) (see e.g. [25]). The advantage of IV-4SID (compared to Basic-4SID) is that it can handle more general noise models. The main idea here is to use an instrumental variable which not only removes the effect of u_α(t) in (5) (as in the Basic-4SID), but also reduces the effect of n_α(t). The first step is to remove the future inputs. This is the same as in Basic-4SID. The second step is to remove the noise term in (9). This is accomplished by multiplying from the right by the "instrumental variable matrix"

P = [ Ȳ
      Ū ].          (11)

Post-multiplying both sides of (9) with the matrix in (11), and normalizing by 1/√N, yields

(1/N) Y Π⊥_{U^T} P^T = (1/N) Γ_α X_f Π⊥_{U^T} P^T + (1/N) N Π⊥_{U^T} P^T
Table 1
Weighting matrices corresponding to specific methods. Notice that some of the weightings may not be as they appear in the referred papers. These weightings, however, give estimates Γ̂_α identical to those obtained using the original choice of weighting.

Method     W_r                                W_c
PO-MOESP   I                                  ((1/N) P Π⊥_{U^T} P^T)^{−1/2}
N4SID      I                                  ((1/N) P Π⊥_{U^T} P^T)^{−1} ((1/N) P P^T)^{1/2}
IVM        ((1/N) Y Π⊥_{U^T} Y^T)^{−1/2}      ((1/N) P P^T)^{−1/2}
CVA        ((1/N) Y Π⊥_{U^T} Y^T)^{−1/2}      ((1/N) P Π⊥_{U^T} P^T)^{−1/2}
where, due to the assumptions, the second term on the right tends to zero with probability one (w.p.1) as N tends to infinity (since the input and the noise process are uncorrelated, and since there is a "time shift" between N and the noise in Ȳ). A quite general estimate of Γ_α is then given by

Γ̂_α = Ŵ_r^{−1} Q̂_s          (12)

where Q̂_s is the lα × n matrix made of the n left singular vectors of

(1/N) Ŵ_r Y Π⊥_{U^T} P^T Ŵ_c          (13)

corresponding to the n largest singular values. The (data dependent) weighting matrices Ŵ_r and Ŵ_c can be chosen in different ways to yield a class of estimators. This class includes most of the known methods appearing in the literature; see Table 1, cf. [20,25]. The methods PO-MOESP, N4SID, IVM and CVA appear in [23,18,25,11], respectively. For some results regarding the effect of the choice of the weighting matrices, see [7,8,19].
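The subspace estimation step above can be sketched in a few lines. The code below is our own illustration (not from the paper): it implements the unweighted variant Ŵ_r = Ŵ_c = I of (12)-(13) for a simulated single-input single-output system; the system matrices, the horizons α = β = 3 and the sample size are illustrative assumptions. The projection Π⊥_{U^T} is applied implicitly, so that no N × N matrix is ever formed. For a white-noise input the principal angle between span(Γ̂_α) and span(Γ_α) should be small for large N (cf. Theorem 10 below):

```python
import numpy as np

def hankel(z, rows, start, ncols):
    """Block Hankel matrix: column t holds [z(start+t), ..., z(start+t+rows-1)] (scalar z)."""
    return np.array([z[start + r : start + r + ncols] for r in range(rows)])

def iv4sid(y, u, n, alpha, beta):
    """Unweighted IV-4SID estimate of the column span of the extended observability matrix."""
    N = len(y) - alpha - beta + 1
    Y = hankel(y, alpha, beta, N)           # "future" outputs
    U = hankel(u, alpha, beta, N)           # "future" inputs
    P = np.vstack([hankel(y, beta, 0, N),   # instruments: "past" outputs ...
                   hankel(u, beta, 0, N)])  # ... and "past" inputs
    # Y @ Pi @ P.T with Pi = I - U^T (U U^T)^{-1} U, without forming Pi explicitly
    G = (Y @ P.T - (Y @ U.T) @ np.linalg.solve(U @ U.T, U @ P.T)) / N
    Q, _, _ = np.linalg.svd(G)
    return Q[:, :n]

# Simulate x(t+1) = A x + B u + K e, y = C x + D u + e (illustrative 2nd-order system)
rng = np.random.default_rng(0)
A = np.array([[0.6, 0.2], [0.0, 0.4]])
B = np.array([1.0, 0.5]); K = np.array([0.4, 0.2])
C = np.array([1.0, 0.0]); D = 1.0
T = 200_000
u = rng.standard_normal(T); e = rng.standard_normal(T)
x = np.zeros(2); y = np.empty(T)
for t in range(T):
    y[t] = C @ x + D * u[t] + e[t]
    x = A @ x + B * u[t] + K * e[t]

alpha = beta = 3
G_hat = iv4sid(y, u, n=2, alpha=alpha, beta=beta)
Gamma = np.vstack([C, C @ A, C @ A @ A])    # true extended observability matrix
Qg, _ = np.linalg.qr(Gamma)
angle = np.arccos(np.clip(np.linalg.svd(G_hat.T @ Qg, compute_uv=False).min(), 0.0, 1.0))
```

The angle is the largest principal angle between the two subspaces; it decays roughly as 1/√N for this (consistent) setup.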
4 Analysis of Basic-4SID

In this section, conditions are established under which the Basic-4SID estimate Γ̂_α in (10) is consistent. Here, consistency means that the column span of Γ̂_α tends to the column span of Γ_α as N tends to infinity (w.p.1). (In other words, there exists a non-singular n × n matrix T such that lim_{N→∞} Γ̂_α = Γ_α T.)
This method has been analyzed previously (see [6,26,12,24]). The problem is reconsidered here, however, for two reasons. First, the proof to be presented appears to be more explicit than previous ones (known to the authors). We give less restrictive conditions on the input signal than previously known results, and the order of persistence of excitation needed is given. Second, these results are useful in the analysis of the more interesting IV-based methods given in the next section.

Recall Equation (5) and note that the noise term n_α(t) is uncorrelated with u_α(t) and x(t). It is therefore relatively straightforward to show that the sample covariance matrix of the residuals in (9),

R̂_ε = (1/N) Y Π⊥_{U^T} Y^T,

tends to

R_ε = Γ_α (r_x − r_{xu} R_u^{−1} r_{xu}^T) Γ_α^T + R_n          (14)

w.p.1, as N → ∞. Here, the various correlation matrices are defined as:

r_x = Ē{x(t) x^T(t)},   r_{xu} = Ē{x(t) u_α^T(t)},   R_u = Ē{u_α(t) u_α^T(t)},   R_n = E{n_α(t) n_α^T(t)}.

Recall that u(t) is assumed to be PE(α). This implies that R_u is positive definite. From (14) it can be concluded that the estimate of Γ_α in (10) is consistent if, and practically only if,¹

R_n = σ² I          (15)
r_x − r_{xu} R_u^{−1} r_{xu}^T > 0          (16)

for some scalar σ². If (15) and (16) hold, then the limit of Γ̂_α (as defined in (10)) equals Γ_α T for some non-singular n × n matrix T. The condition (15) is essentially only true for output error systems perturbed by a measurement noise of equal power in each output channel. That is, if K = 0 and r_e = σ² I. In the following it is shown under what conditions (on the input signals) the condition (16) is fulfilled. In view of Lemma 3, the condition (16) is equivalent to

Ē{ [x(t); u_α(t)] [x(t); u_α(t)]^T } = [ r_x  r_{xu} ; r_{xu}^T  R_u ] > 0.          (17)

¹ There may exist special cases when Basic-4SID is consistent even when R_n ≠ σ² I. However, these cases depend on the system parameters and are not of interest here.
Next, the state is decomposed into two terms, x(t) = x^d(t) + x^s(t), where the "deterministic" term x^d(t) is due to the observed inputs and the "stochastic" term x^s(t) is caused by the innovations process e(t). Due to assumption A2, x^s(t) is uncorrelated with x^d(t) and u(t). Furthermore, introduce a transfer function (operator) description of x^d(t) as

x^d(t) = (F_u(q^{−1}) / a(q^{−1})) u(t)

where q^{−1} is the backward shift operator (q^{−1} u(t) = u(t−1)). The polynomials F_u(q^{−1}) and a(q^{−1}) are defined by the relations

F_u(q^{−1}) = Adj{qI − A} B / q^n = F_{u1} q^{−1} + F_{u2} q^{−2} + ... + F_{un} q^{−n}          (18)
a(q^{−1}) = det{qI − A} / q^n = a_0 + a_1 q^{−1} + ... + a_n q^{−n}   (a_0 = 1)          (19)

where Adj{A} denotes the adjugate matrix of A and det{·} is the determinant. The polynomial coefficients a_i (i = 0, ..., n) are scalars while all F_{ui} (i = 1, ..., n) are n × m matrices. Analogously to F_u(q^{−1}), define the matrix polynomial

F_e(q^{−1}) = Adj{qI − A} K / q^n = F_{e1} q^{−1} + ... + F_{en} q^{−n}

where F_{ei} (i = 1, ..., n) are n × l matrices. Thus, x^s(t) is given by the relation

x^s(t) = (F_e(q^{−1}) / a(q^{−1})) e(t).
Also define the Sylvester-like matrix

S = [ F_{un}   ...  F_{u1}   0        ...  0        |  F_{en}  ...  F_{e1}
      a_n I_m  ...  a_1 I_m  a_0 I_m  0    ...  0   |  0       ...  0
      ...           ...               ...           |  ...
      0  ...   a_n I_m  ...  ...      a_0 I_m       |  0       ...  0 ]          (20)

of dimension (n + αm) × ((α + n)m + nl). Using the notation introduced above, it can be verified that

[ x(t) ; u_α(t) ] = S (1/a(q^{−1})) [ u(t−n) ; ... ; u(t+α−1) ; e(t−n) ; ... ; e(t−1) ].          (21)
It is shown in Appendix A that S has full row rank under assumption A4. This implies that the condition (17) is satisfied if the following matrix is positive definite:

Ē{ ((1/a(q^{−1})) [u(t−n); ...; u(t+α−1); e(t−n); ...; e(t−1)]) ((1/a(q^{−1})) [u(t−n); ...; u(t+α−1); e(t−n); ...; e(t−1)])^T }.          (22)

Since u(t) and e(t) are uncorrelated, the covariance matrix in (22) will be block diagonal. The upper diagonal block can be shown to be positive definite if u(t) is PE(α + n); see Lemma A.3.7 in [17], modified to quasi-stationary signals. The lower diagonal block is positive definite assuming r_e > 0. If r_e = 0, then x(t) does not depend on the process {e(t)} and, hence, its contribution in Equations (21) and (22) can be removed. In conclusion, the following theorem has been proven.
Theorem 4 Under the general assumptions A1-A4, the Basic-4SID subspace estimate (10) is consistent if the following conditions hold:

(1) The covariance matrix of the stacked noise is proportional to the identity matrix. That is, R_n = σ² I.
(2) The input signal u(t) is persistently exciting of order α + n.
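A hedged numerical illustration of the theorem (our own sketch; the output-error system, the horizon α = 3 and the sample size are illustrative choices): with K = 0, unit-variance white measurement noise (so that R_n = σ² I) and a white, hence persistently exciting, input, the top-n eigenvectors of the residual covariance should span the column space of Γ_α:

```python
import numpy as np

def hankel(z, rows, ncols):
    return np.array([z[r : r + ncols] for r in range(rows)])

rng = np.random.default_rng(3)
# Output-error system (K = 0), white measurement noise v(t) with variance 1.
A = np.array([[0.7, 0.3], [0.0, 0.5]])
B = np.array([1.0, 1.0])
C = np.array([1.0, 0.0]); D = 0.0
T = 100_000
u = rng.standard_normal(T); v = rng.standard_normal(T)
x = np.zeros(2); y = np.empty(T)
for t in range(T):
    y[t] = C @ x + D * u[t] + v[t]
    x = A @ x + B * u[t]

alpha, n = 3, 2
N = T - alpha + 1
Y = hankel(y, alpha, N); U = hankel(u, alpha, N)
# Sample covariance of the regression residuals, (1/N) Y Pi Y^T (Pi applied implicitly)
R = (Y @ Y.T - (Y @ U.T) @ np.linalg.solve(U @ U.T, U @ Y.T)) / N
_, V = np.linalg.eigh(R)
G_hat = V[:, -n:]                        # eigenvectors of the n largest eigenvalues
Gamma = np.vstack([C, C @ A, C @ A @ A])  # true extended observability matrix
Qg, _ = np.linalg.qr(Gamma)
angle = np.arccos(np.clip(np.linalg.svd(G_hat.T @ Qg, compute_uv=False).min(), 0.0, 1.0))
```

The eigenvectors of (1/N) Y Π⊥_{U^T} Y^T coincide with the left singular vectors of (1/√N) Y Π⊥_{U^T} in (10), since Π⊥_{U^T} is an orthogonal projection.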
Remark 5 As previously mentioned, condition (1) of the theorem is essentially only true if K = 0 and if r_e = σ² I. Furthermore, K = 0 implies that the system must be reachable from u(t).

5 Analysis of IV-4SID

In this section, conditions for the IV-4SID estimate Γ̂_α defined in (12) to be consistent are established. As N → ∞, the limit of the quantity in (13) is given w.p.1 by

W_r Γ_α (r_{xp} − r_{xu} R_u^{−1} R_{up}) W_c          (23)

where W_r and W_c are the limits of Ŵ_r and Ŵ_c, respectively. The correlation matrices appearing in (23) are defined as follows:

r_{xp} = Ē{x(t+β) p^T(t)},   r_{xu} = Ē{x(t) u_α^T(t)},   R_u = Ē{u_α(t) u_α^T(t)},   R_{up} = Ē{u_α(t+β) p^T(t)}

where p(t), t = 1, ..., N, are defined as the columns of P in (11). If the weighting matrices W_r and W_c are non-singular,² the estimate (12) is consistent if and only if the following matrix has full rank equal to n:

r_{xp} − r_{xu} R_u^{−1} R_{up}.          (24)
This is because the limiting principal left singular vectors of (13) will be equal to W_r Γ_α T for some non-singular n × n matrix T. Hence, the limit of Γ̂_α in (12) is Γ_α T. In view of Lemma 2, the condition that (24) should have full rank can be reformulated as

rank Ē{ [x(t+β); u_α(t+β)] [y_β(t); u_β(t); u_α(t+β)]^T } = n + αm.          (25)

This is the critical relation for consistency of IV-4SID methods.

² It is not necessary to assume that the weighting matrices are non-singular for proving consistency of IV-4SID. For example, it is sufficient that rank((r_{xp} − r_{xu} R_u^{−1} R_{up}) W_c) = rank(r_{xp} − r_{xu} R_u^{−1} R_{up}) = n. However, we believe that the results become clearer if we assume that W_r and W_c are non-singular and focus on the rank of (24).
Table 2
Sufficient conditions for the weighting matrices W_r and W_c appearing in Table 1 to be non-singular.

Method     Order of PE    Condition on noise
PO-MOESP   α + β          E{n_α(t) n_α^T(t)} > 0
N4SID      α + β          E{n_α(t) n_α^T(t)} > 0
IVM        max(α, β)      E{n_α(t) n_α^T(t)} > 0
CVA        α + β          E{n_α(t) n_α^T(t)} > 0
Let us begin by deriving conditions which ensure that the weighting matrices in Table 1 are non-singular. Consider for example the limit of the column weighting W_c for the PO-MOESP method. We need to establish that P Π⊥_{U^T} P^T / N is positive definite for large N. Using Lemma 3, that condition can equivalently be written as

Ē{ [y_β(t); u_β(t); u_α(t+β)] [y_β(t); u_β(t); u_α(t+β)]^T } > 0.          (26)

This is recognized as a requirement similar to one appearing in least-squares linear regression estimation [16, Complement C5.1]. The condition (26) holds if u(t) is PE(α + β) and if E{n_β(t) n_β^T(t)} > 0. Similarly, conditions for all cases in Table 1 can be obtained. The result is summarized in Table 2.

Having established that the weighting matrices are non-singular, condition (25) may now be examined. The correlation matrix in (25) can be written as the sum of two terms:

Ē{ [x^d(t+β); u_α(t+β)] [y^d_β(t); u_β(t); u_α(t+β)]^T } + Ē{ [x^s(t+β); 0] [y^s_β(t); 0; 0]^T }          (27)

where the superscripts (·)^d and (·)^s denote the deterministic part and the stochastic part, respectively. Below we show that the first term in (27) has full row rank under a PE condition on u(t). This implies that the critical relation (25) is fulfilled for systems with no process noise (since x^s(t) ≡ 0 in that case). For systems with process noise, one may in exceptional cases lose rank when adding the two terms in (27). One example of such a situation is given in the next section. Generally, loss of rank is unlikely to occur (at least if βl ≥ n). If βl ≥ n, then the matrix in (27) has many more columns than rows. Hence it is more likely that the matrix has full rank. This last point can be made more precise (see Theorem 11 and [14]).

In the following it is shown that the first term in (27) has full row rank under certain conditions. It can be verified that the correlation matrix is equivalent to

[ A^β  C_β(A,B)  0 ; 0  0  I_{αm} ] · Ē{ [x^d(t); u_β(t); u_α(t+β)] [x^d(t); u_β(t); u_α(t+β)]^T } · [ Γ_β^T  0  0 ; Φ_β^T  I_{βm}  0 ; 0  0  I_{αm} ]          (28)

where C_β(A, B) denotes the reversed reachability matrix. That is,

C_β(A, B) = [A^{β−1} B  A^{β−2} B  ...  B].          (29)

Clearly, the first factor has full row rank (n + αm) if the system (2) is reachable from u(t) and if β is greater than or equal to the controllability index. Notice that the last factor has full row rank, provided β is greater than or equal to the observability index. A sufficient condition is then obtained if the covariance
matrix in the middle of (28) is positive definite. We have

Ē{ [x^d(t); u_β(t); u_α(t+β)] [x^d(t); u_β(t); u_α(t+β)]^T }
  = S̄ Ē{ ((1/a(q^{−1})) u_{α+β+n}(t−n)) ((1/a(q^{−1})) u_{α+β+n}(t−n))^T } S̄^T          (30)

where

S̄ = [ F_{un}   ...  F_{u1}   0        ...  0
      a_n I_m  ...  a_1 I_m  a_0 I_m  0    ...  0
      ...           ...               ...
      0  ...   a_n I_m   ...  ...     a_0 I_m ]          (31)

is of dimension (n + βm + αm) × ((α + β + n)m). If the system is reachable from u(t), then S̄ has full row rank according to Appendix A. The matrix in (30) is then positive definite if u(t) is PE(α + β + n). The result is summarized in the following theorem.

Theorem 6 The IV-4SID subspace estimate Γ̂_α given by (12) is consistent if the general assumptions A1-A4 hold in addition to the following conditions:

(1) The weighting matrices W_r and W_c are non-singular.
(2) (A, B) is reachable.
(3) β ≥ max(observability index, controllability index).
(4) K = 0.
(5) The input u(t) is persistently exciting of order α + β + n.
Remark 7 The condition K = 0 ensures that the second term in (27) is zero.

Remark 8 The conditions for the non-singularity of the weighting matrices in Table 2 are fulfilled if r_e > 0 in addition to the PE condition on u(t).

We conclude this section by giving three alternative consistency theorems for IV-4SID. The first relaxes the order of PE on u(t) for single input systems, while the second strengthens the conditions on u(t) in order to prove consistency for systems with process noise. Finally, the third theorem considers single input ARMA (autoregressive moving average) signals.
5.1 Single input systems
For single input systems, we may prove that the first term in (27) has full row rank in an alternative way to that used in Theorem 6. This leads to a less restrictive condition on the input excitation.

Theorem 9 For single input systems, the IV-4SID subspace estimate Γ̂_α given by (12) is consistent if the general assumptions A1-A4 apply in addition to the following conditions:

(1) The weighting matrices W_r and W_c are non-singular.
(2) (A, B) is reachable.
(3) β ≥ the observability index.
(4) r_w = 0, i.e., there is no process noise.
(5) The input u(t) is persistently exciting of order α + n.
PROOF. Consider a single input system and define the vector polynomial in q^{−1}

F_y(q^{−1}) = C Adj{qI − A} B / q^n + D = F_{y0} + F_{y1} q^{−1} + F_{y2} q^{−2} + ... + F_{yn} q^{−n}

where F_{yi} (i = 0, ..., n) are l × 1 vectors. This implies that

y^d(t) = ( Σ_{k=0}^{n} F_{yk} q^{−k} ) / ( Σ_{k=0}^{n} a_k q^{−k} ) u(t)          (32)
where {a_i} is defined in (19). Next, define the Sylvester matrix

S_y = [ F_{yn}  ...  F_{y0}  0    ...    0
        ...          ...          ...
        0  ...  F_{yn}  ...  F_{y0}          (βl rows)
        a_n  ...  a_0  0     ...    0
        ...          ...          ...
        0  ...  a_n  ...  a_0  0  ...  0     (β rows)
        0  ...  0  a_n  ...  a_0  ...  0
        ...          ...          ...
        0  ...  0  0  ...  a_n  ...  a_0     (α rows) ]          (33)

of dimension (βl + β + α) × (n + β + α).
It can be shown that the rank of S_y is n + α + β if the system is observable and reachable from u(t), and if β is greater than or equal to the observability index [1,3]. A straightforward proof of this fact is given in Appendix B, using the notation introduced in this paper. The first term in (27) can now be written as

Ē{ [x^d(t+β); u_α(t+β)] [y^d_β(t); u_β(t); u_α(t+β)]^T } = S̄ R_U S_y^T          (34)

where

R_U = Ē{ ((1/a(q^{−1})) u_{n+α}(t+β−n)) ((1/a(q^{−1})) u_{α+β+n}(t−n))^T }.

The covariance matrix R_U has full row rank if u is PE(α + n). It is then clear that (34) has full rank, since S̄ (here (n + α) × (n + α)) is non-singular and S_y has full column rank.
Therefore, for single input systems we can relax the assumption on the input signal. The proof of Theorem 9 does not carry over to the multi-input case, because S_y does not have full column rank if m > 1 (see Appendix B).

5.2 White noise input
Next, we give the following consistency result concerning the case when the input is a white noise process.
Theorem 10 Let the input u(t) to the system in (2) be a zero-mean white sequence in the sense that r_u(τ) = r_u(0) δ_{τ,0} and r_u(0) > 0. Assume that A1-A4 hold, the weighting matrices W_r and W_c are positive definite, and

rank [C_β(A, B)  C_β(A, G)] = n.          (35)

Here, C_β(A, B) is defined in (29),

C_β(A, G) = [A^{β−1} G  A^{β−2} G  ...  G]

and the matrix G is defined as

G = Ē{x(t+1) y^T(t)}.

Given the above assumptions, the IV-4SID subspace estimate Γ̂_α defined in (12) is consistent.
PROOF. Consider the correlation matrix in (25):

Ē{ [x(t+β); u_α(t+β)] [y_β(t); u_β(t); u_α(t+β)]^T } = [ Ē{x(t+β) y_β^T(t)}  Ē{x(t+β) u_β^T(t)}  0 ; 0  0  R̄_u ]          (36)

where

R̄_u = Ē{u_α(t) u_α^T(t)} = I_α ⊗ r_u(0).

The symbol ⊗ denotes the Kronecker product. In (36), we used the fact that Ē{x(t) u^T(t+k)} = 0 for k ≥ 0 since u(t) is a white sequence. Notice that R̄_u > 0 since r_u(0) > 0. Under the given assumptions it is readily proven that

Ē{x(t+β) y_β^T(t)} = C_β(A, G)   and   Ē{x(t+β) u_β^T(t)} = C_β(A, B) (I_β ⊗ r_u(0)).

We conclude that the matrix in (36) has full row rank under the reachability assumption given in (35). This concludes the proof.

Notice that Theorem 10 guarantees consistency even for systems with process noise. Also observe that we were here able to relax the reachability condition on (A, B) compared to the previous theorems. Some other results for the case of a white noise input signal can be found in [22].

5.3 Scalar ARMA input signal
Finally, we give a consistency result for the case when u(t) is a scalar ARMA signal.
Theorem 11 Assume that the scalar (m = 1) input u(t) to the system in (2) is generated by the ARMA model

F(q^{−1}) u(t) = G(q^{−1}) ε(t)

where ε(t) is a white noise process. Here, F(z) and G(z) are relatively prime polynomials of degree n_F and n_G, respectively, and they have all zeros outside the unit circle. Furthermore, assume that A1-A4 hold, and

(1) the weighting matrices W_r and W_c are non-singular;
(2) (A, B) is reachable;
(3) β ≥ n;
(4) n_G ≤ β − n;
(5) n_F ≤ α + β − n.

Given the above assumptions, the IV-4SID subspace estimate Γ̂_α in (12) is consistent.
PROOF. Study the last two block columns of (25). Clearly, if (recall that m = 1)

rank Ē{ [x(t+β); u_α(t+β)] [u_β(t); u_α(t+β)]^T } = n + α          (37)

holds, then the critical relation (25) is satisfied. Using the notation introduced in (31), we can write the correlation matrix in (37) as

S̄ Ē{ ((1/a(q^{−1})) u_{n+α}(t+β−n)) u_{β+α}^T(t) }          (38)

where S̄ is (n + α) × (n + α) and non-singular since (A, B) is reachable (see Appendix A). Lemma A3.8 in [17] shows that the correlation matrix appearing in (38) has full row rank if

α + β ≥ max(α + n + n_G, n + n_F)

which can easily be transformed to the conditions of the theorem.
Notice that the theorem holds for arbitrarily colored noise, as long as it is uncorrelated with the input. With some more effort, the theorem can be extended to the multi-input case by modifying Lemma A3.8 in [17]. For example, the theorem holds if the different inputs are independent ARMA signals. The result of the theorem is very interesting since it gives a quantification of the statement that "for large enough β, the matrix in (27) has full rank". Also observe that an AR input of arbitrarily high order is allowed if β is chosen large enough.
6 A Counter-example

The purpose of this section is to present an example of a system for which the IV-4SID methods are not consistent. We have shown that a necessary condition for consistency is the validity of (25). In the discussion following (27), it was also pointed out that consistency problems may occur for systems with process noise. In this section, we give an explicit example for which the rank drops when the two terms in (27) are added. Consider the cross correlation matrix appearing in (25) which, with obvious notation, reads

[ R_{x^d y^d} + R_{x^s y^s}   R_{x^d ū}   R_{x^d u} ; R_{u y^d}   R_{u ū}   R_{u u} ].
The construction of a system for which this matrix loses rank is divided into two major steps:

(1) Find a vector v such that

v^T [ R_{x^d ū}   R_{x^d u} ; R_{u ū}   R_{u u} ] = 0.

(2) Design R_{x^s y^s} such that

v^T [ R_{x^d y^d} + R_{x^s y^s} ; R_{u y^d} ] = 0.          (39)
The first step turns out to be similar to a construction used in the analysis of instrumental variable methods (see [15,17] and the references therein). Once a solution to the first step is found, the second step is a matter of investigating whether or not a stochastic subsystem exists, generating a valid R_{x^s y^s}. Let us study the first step in more detail. For notational convenience, define the following covariance matrix:

P_{γ1 γ2}(t1, t2) = Ē{ ((1/a(q^{−1})) u_{γ1}(t − t1)) u_{γ2}^T(t − t2) }.

The correlation matrix of interest in the first step can be written as

[ R_{x^d ū}   R_{x^d u} ; R_{u ū}   R_{u u} ] = S̄ P_{(α+n)(α+β)}(n − β, 0)
where S is defined in (31). Here, however, it is of dimension (n + αm) × ((α + n)m). The rank properties of P(0, 0) have been studied in [17]. P is full rank either if a(q⁻¹) is positive real and u(t) is persistently exciting, or if u(t) is an ARMA process of sufficiently low order (e.g., white noise; cf. Theorem 11). Here, we consider an a(q⁻¹) that is not positive real, to make P rank deficient. This implies that n ≥ 2. Let us fix the following parameters: n = 2, m = 1, l = 1, β = 2 and α = 3. If (A, B) is reachable, then S is non-singular (see Appendix A). The first step can then be solved if there exists a vector g such that gᵀP₅₅(0, 0) = 0. A solution to this can be found by using the same setup as that in the counter-example for an instrumental variable method given in [17]. Choose a pole polynomial and an input process according to:
\[
a(q^{-1}) = (1 - \gamma q^{-1})^2 \tag{40}
\]
\[
u(t) = (1 - \gamma q^{-1})^2 (1 + \gamma q^{-1})^2 \varepsilon(t) \tag{41}
\]
where ε(t) is a white noise process of unit variance.³ Notice that γ should have a modulus less than one for stability and that the input is persistently exciting of any order. Now, a straightforward calculation leads to

\[
P_{55}(0,0) =
\begin{bmatrix}
1 - 2\gamma^4 & -4\gamma^3 & \gamma^6 - 2\gamma^2 & 2\gamma^5 & \gamma^4 \\
2\gamma & 1 - 2\gamma^4 & -4\gamma^3 & \gamma^6 - 2\gamma^2 & 2\gamma^5 \\
\gamma^2 & 2\gamma & 1 - 2\gamma^4 & -4\gamma^3 & \gamma^6 - 2\gamma^2 \\
0 & \gamma^2 & 2\gamma & 1 - 2\gamma^4 & -4\gamma^3 \\
0 & 0 & \gamma^2 & 2\gamma & 1 - 2\gamma^4
\end{bmatrix}.
\]

This matrix is singular if γ is a solution to

\[
\det(P_{55}(0,0)) = 1 + 4\gamma^4 + 10\gamma^8 - 8\gamma^{12} - 15\gamma^{16} - 12\gamma^{20} = 0. \tag{42}
\]

³ Note that condition (4) of Theorem 11 is violated since β = n and n_G = 4.
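As a numerical cross-check of the construction above (our own verification, not part of the original derivation), P₅₅(0,0) can be rebuilt from the moving-average coefficients of u(t) and of u(t)/a(q⁻¹), and the singular value of γ located by bisection on the determinant:

```python
import numpy as np

def P55(g):
    """Build P_55(0,0) for pole/input parameter g.

    With eps(t) unit-variance white noise, u(t)/a(q^-1) = (1 + g q^-1)^2 eps(t)
    has MA coefficients f, and u(t) = (1 - g^2 q^-2)^2 eps(t) has MA
    coefficients c.  Entry (i, j) is the cross-correlation r(i - j)
    between the two filtered sequences (a Toeplitz matrix)."""
    f = np.array([1.0, 2 * g, g**2])                 # (1 + g q^-1)^2
    c = np.array([1.0, 0.0, -2 * g**2, 0.0, g**4])   # (1 - g^2 q^-2)^2
    def r(tau):
        return sum(f[tau + j] * c[j]
                   for j in range(len(c)) if 0 <= tau + j < len(f))
    return np.array([[r(i - j) for j in range(5)] for i in range(5)])

# det(P55) changes sign between 0.90 and 0.93; bisect for the singular point
lo, hi = 0.90, 0.93
for _ in range(60):
    mid = 0.5 * (lo + hi)
    if np.linalg.det(P55(mid)) > 0:
        lo = mid
    else:
        hi = mid
root = 0.5 * (lo + hi)
print(round(root, 5))  # close to 0.9184, in line with the text
```

The located root agrees with the value γ ≈ 0.91837 quoted above, and the determinant expanded symbolically matches (42).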
The polynomial has two real roots, γ ≈ ±0.91837. Choose, for example, the positive root for γ, and g as the left eigenvector corresponding to the zero eigenvalue of P. This provides a solution to the first step (namely, vᵀ = gᵀS⁻¹). The solution depends on the matrix B. Until now, we have fixed the poles of the system and a specific input process.
Remark 12 Observe that a subspace method that only uses the inputs U as instrumental variables (cf. (11)) is not consistent for systems satisfying the first step. In other words, the past input MOESP method (see [23]) is not consistent if the poles and the input process are related as in (40)-(41) and if γ is a solution to (42).
The free parameters in the second step are B, C, D and those specifying the stochastic subsystem, with constraints on minimality of the system description. Somewhat arbitrarily,⁴ we make the choices B = [1  −2]ᵀ, C = [2  −1] and D = 0. Consider the following description of the system in state-space form:

\[
x(t+1) = \begin{bmatrix} 2\gamma & -\gamma^2 \\ 1 & 0 \end{bmatrix} x(t)
+ \begin{bmatrix} 1 \\ -2 \end{bmatrix} u(t)
+ \begin{bmatrix} k_1 \\ k_2 \end{bmatrix} e(t) \tag{43a}
\]
\[
y(t) = \begin{bmatrix} 2 & -1 \end{bmatrix} x(t) + e(t) \tag{43b}
\]

⁴ For some choices of B, C and D we were not able to find a solution to the second step. For other choices of C, making the system "less observable", we found solutions with lower noise variance.
where the variance of the noise process e(t) is r_e. With the choices made, the system is observable and reachable from u(t). For a given γ, (39) now consists of two non-linear equations in k₁, k₂ and r_e. The equations are, however, linear in k₁r_e, k₂r_e, k₁²r_e, k₂²r_e and k₁k₂r_e. If we change variables according to

\[
k_{\mathrm{rel}} = \frac{k_1}{k_2}, \qquad
\lambda_1 = k_2^2 r_e, \qquad
\lambda_2 = k_2 r_e,
\]

then the two non-linear equations become linear in λ₁ and λ₂. Thus, given γ and k_rel, we can easily solve for λ₁ and λ₂. Observe that λ₁ must be positive to be a valid solution. A plot of λ₁ as a function of γ and k_rel is shown in Fig. 1. The solution λ₁ is positive when k_rel is between 0.7955 and (4γ³ − 3γ² + 1)/(2γ² − 2γ + 2) ≈ 0.8475 (λ₁ is discontinuous at this point). Let us, for example, choose the solution corresponding to k_rel = 0.825. The parameters become: γ ≈ 0.9184, r_e ≈ 217.1, k₁ ≈ −0.1542 and k₂ ≈ −0.1869. Thus, if the input (41) is applied to the system (43), and if α = 3 and β = 2, then the matrix in (25) is rank deficient. We conclude this section with some short comments about the counter-example. Indeed, the example presented in this section is very special. However, the existence of this kind of example indicates that the subspace methods may have very poor performance in certain situations. As should be clear from the derivations in this paper, the consistency of 4SID methods is related to the consistency of some instrumental variable schemes (see [15,17] and the references therein). When using 4SID methods, it is a good idea to estimate a set of
Fig. 1. The graph shows λ₁ as a function of γ and k_rel.
models for different choices of α and β. These models should then be validated separately to see which gives the best result. This reduces the performance degradation due to a bad choice of α and β. In the counter-example presented above, such a procedure would probably have led to another choice of the dimension parameters (given that the system order is known). While this may lead to consistent estimates, the accuracy may still be poor (as demonstrated in the next section).
7 Numerical Examples

In this section, the previously obtained results are illustrated by means of numerical examples. First, the counter-example given in the previous section is simulated. Thus, identification data are generated according to (41) and (43). The extended observability matrix is estimated by (12) using the weighting matrices for the CVA method (see Table 1). Similar results are obtained with other choices of the weighting matrices. The system matrix is then estimated
Table 3
The figures shown in the table are the absolute values of the bias and, within brackets, the standard deviations.

         α = 3, β = 2        α = 6, β = 5
pole 1   5.9572 (69.8826)    0.4704 (0.4806)
pole 2   2.6049 (30.9250)    0.1770 (0.3526)
as

\[
\hat{A} = \hat{\Gamma}_1^{\dagger} \hat{\Gamma}_2,
\]

where (·)† denotes the Moore-Penrose pseudo-inverse, Γ̂₁ denotes the matrix made of the first α − 1 rows, and Γ̂₂ contains the last α − 1 rows of Γ̂. To measure the estimation accuracy, we consider the estimated system poles (i.e., the eigenvalues of Â). Recall that the system (43) has two poles at γ ≈ 0.9184. Table 3 shows the absolute value of the bias and the standard deviation of the pole estimates. The sample statistics are based on 1000 independent trials. Each trial consists of 1000 input-output data samples. The results for two different choices of α and β are given in the table. The first corresponds to the choice made in the counter-example. It is clearly seen that the estimates are not useful. For the second choice (α = 6, β = 5), it can be shown that the algorithm is consistent. This is indicated by the much more reasonable results in Table 3. The accuracy, however, is still very poor since the matrix in (25) is nearly rank deficient. Next, we study the influence of the stochastic sub-system on the estimation accuracy. Recall that in Section 6 the stochastics of the system were designed to cause a cancellation in (27). The zeros of the stochastic sub-system are given by the eigenvalues of A − KC, which become, approximately, 0.9791 ± i0.1045. In the next simulation, the locations of these zeros are changed such that they have the same distance to the unit circle, but the angle relative to the real axis is increased five times. This is accomplished by changing K in (43) to [−0.2100  −0.5590]ᵀ. The zeros then become 0.8489 ± i0.4990. The simulation results are presented in Table 4. It can be seen that the estimation accuracy is more reasonable in this case.
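The shift-invariance step Â = Γ̂₁†Γ̂₂ can be sketched in isolation. In the sketch below, Γ is formed exactly from the known (A, C) of (43) rather than estimated from data, so it only illustrates the algebraic step, not the full identification algorithm:

```python
import numpy as np

# system matrices of the counter-example, with gamma = 0.9184 and C = [2, -1]
gamma = 0.9184
A = np.array([[2 * gamma, -gamma**2], [1.0, 0.0]])
C = np.array([[2.0, -1.0]])

# extended observability matrix with alpha = 6 block rows
alpha = 6
Gamma = np.vstack([C @ np.linalg.matrix_power(A, k) for k in range(alpha)])

# shift invariance: Gamma[1:] = Gamma[:-1] @ A, so A_hat = pinv(Gamma_1) @ Gamma_2
Gamma1, Gamma2 = Gamma[:-1], Gamma[1:]
A_hat = np.linalg.pinv(Gamma1) @ Gamma2

print(np.linalg.eigvals(A_hat))  # both eigenvalues close to gamma = 0.9184
```

With an exact Γ the double pole at γ is recovered (up to floating-point effects); the simulation results in Table 3 show what happens when Γ̂ itself is a poor estimate.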
Table 4
Same as Table 3, but for the system with K = [−0.2100  −0.5590]ᵀ.

         α = 3, β = 2      α = 6, β = 5
pole 1   0.1161 (0.2405)   0.0316 (0.0452)
pole 2   0.2266 (0.5043)   0.0282 (0.0290)

Table 5
Same as Table 3, but with a white noise input signal.

         α = 3, β = 2      α = 6, β = 5
pole 1   0.0464 (0.1146)   0.0184 (0.0271)
pole 2   0.0572 (0.0523)   0.0182 (0.0171)
Finally, we change the input to be realizations of a zero-mean white noise process. The power of the white noise input is chosen to be equal to the power of the colored process (41). Except for the input, the parameters are the same as in the counter-example. Recall Theorem 10, which shows that the estimates are consistent in this case. Table 5 presents the results of the simulations. A comparison with the results of Table 3 clearly shows a great difference.
8 Conclusions

Conditions that ensure consistency of the subspace estimate used in subspace system identification methods have been given. For systems without process noise, a persistence of excitation condition on the input signal was obtained. With process noise, a persistence of excitation condition alone is not sufficient. In fact, an example was given of a system for which subspace methods are not consistent. The system is reachable from the input and the input is persistently exciting of any order. The subspace methods fail, however, to provide a consistent model of the system transfer function. It was also shown that a white noise input or an ARMA input signal of sufficiently low order guarantees consistency under weak conditions. This shows that the performance of subspace methods may be sensitive to the input excitation.
Acknowledgement

The authors are very grateful to Professor Petre Stoica for fruitful discussions, suggestions and comments.
References

[1] B. D. O. Anderson and E. I. Jury. "Generalized Bezoutian and Sylvester Matrices in Multivariable Linear Control". IEEE Transactions on Automatic Control, AC-21:551-556, Aug. 1976.
[2] D. Bauer, M. Deistler, and W. Scherrer. "The Analysis of the Asymptotic Variance of Subspace Algorithms". In Proc. 11th IFAC Symposium on System Identification, pages 1087-1091, Fukuoka, Japan, 1997.
[3] R. R. Bitmead, S. Y. Kung, B. D. O. Anderson, and T. Kailath. "Greatest Common Divisors via Generalized Sylvester and Bezout Matrices". IEEE Transactions on Automatic Control, 23(6):1043-1047, Dec. 1978.
[4] B. De Moor and J. Vandewalle. "A Geometrical Strategy for the Identification of State Space Models of Linear Multivariable Systems with Singular Value Decomposition". In Proc. of the 3rd International Symposium on Applications of Multivariable System Techniques, pages 59-69, Plymouth, UK, Apr. 1987.
[5] B. De Moor, J. Vandewalle, L. Vandenberghe, and P. Van Mieghem. "A Geometrical Strategy for the Identification of State Space Models of Linear Multivariable Systems with Singular Value Decomposition". In Proc. IFAC 88, pages 700-704, Beijing, China, 1988.
[6] B. Gopinath. "On the Identification of Linear Time-Invariant Systems from Input-Output Data". The Bell System Technical Journal, 48(5):1101-1113, 1969.
[7] M. Jansson and B. Wahlberg. "On Weighting in State-Space Subspace System Identification". In Proc. European Control Conference, ECC'95, pages 435-440, Roma, Italy, 1995.
[8] M. Jansson and B. Wahlberg. "A Linear Regression Approach to State-Space Subspace System Identification". Signal Processing, 52(2):103-129, July 1996.
[9] M. Jansson and B. Wahlberg. "On Consistency of Subspace System Identification Methods". In Preprints of 13th IFAC World Congress, pages 181-186, San Francisco, California, 1996.
[10] M. Jansson and B. Wahlberg. "Counterexample to General Consistency of Subspace System Identification Methods". In 11th IFAC Symposium on System Identification, pages 1677-1682, Fukuoka, Japan, 1997.
[11] W. E. Larimore. "Canonical Variate Analysis in Identification, Filtering and Adaptive Control". In Proc. 29th CDC, pages 596-604, Honolulu, Hawaii, December 1990.
[12] K. Liu. "Identification of Multi-Input and Multi-Output Systems by Observability Range Space Extraction". In Proc. CDC, pages 915-920, Tucson, AZ, 1992.
[13] L. Ljung. System Identification: Theory for the User. Prentice-Hall, Englewood Cliffs, NJ, 1987.
[14] K. Peternell, W. Scherrer, and M. Deistler. "Statistical Analysis of Novel Subspace Identification Methods". Signal Processing, Special Issue on Subspace Methods, Part II: System Identification, 52(2):161-177, July 1996.
[15] T. Söderström and P. Stoica. "Comparison of Some Instrumental Variable Methods: Consistency and Accuracy Aspects". Automatica, 17(1):101-115, 1981.
[16] T. Söderström and P. Stoica. System Identification. Prentice-Hall International, Hemel Hempstead, UK, 1989.
[17] T. Söderström and P. G. Stoica. Instrumental Variable Methods for System Identification. Springer-Verlag, Berlin, 1983.
[18] P. Van Overschee and B. De Moor. "N4SID: Subspace Algorithms for the Identification of Combined Deterministic-Stochastic Systems". Automatica, Special Issue on Statistical Signal Processing and Control, 30(1):75-93, Jan. 1994.
[19] P. Van Overschee and B. De Moor. "Choice of State-Space Basis in Combined Deterministic-Stochastic Subspace Identification". Automatica, Special Issue on Trends in System Identification, 31(12):1877-1883, Dec. 1995.
[20] P. Van Overschee and B. De Moor. "A Unifying Theorem for Three Subspace System Identification Algorithms". Automatica, Special Issue on Trends in System Identification, 31(12):1853-1864, Dec. 1995.
[21] M. Verhaegen. "A Novel Non-iterative MIMO State Space Model Identification Technique". In Proc. 9th IFAC/IFORS Symp. on Identification and System Parameter Estimation, pages 1453-1458, Budapest, 1991.
[22] M. Verhaegen. "Subspace Model Identification. Part III: Analysis of the Ordinary Output-Error State Space Model Identification Algorithm". Int. J. Control, 58:555-586, 1993.
[23] M. Verhaegen. "Identification of the Deterministic Part of MIMO State Space Models given in Innovations Form from Input-Output Data". Automatica, Special Issue on Statistical Signal Processing and Control, 30(1):61-74, Jan. 1994.
[24] M. Verhaegen and P. Dewilde. "Subspace Model Identification. Part I: The Output-Error State-Space Model Identification Class of Algorithms". Int. J. Control, 56:1187-1210, 1992.
[25] M. Viberg. "Subspace-based Methods for the Identification of Linear Time-Invariant Systems". Automatica, Special Issue on Trends in System Identification, 31(12):1835-1851, Dec. 1995.
[26] M. Viberg, B. Ottersten, B. Wahlberg, and L. Ljung. "A Statistical Perspective on State-Space Modeling Using Subspace Methods". In Proc. 30th IEEE Conf. on Decision and Control, pages 1337-1342, Brighton, England, Dec. 1991.
[27] M. Viberg, B. Wahlberg, and B. Ottersten. "Analysis of State Space System Identification Methods Based on Instrumental Variables and Subspace Fitting". Automatica, 33(9):1603-1616, 1997.
[28] B. Wahlberg and M. Jansson. "4SID Linear Regression". In Proc. IEEE 33rd Conf. on Decision and Control, pages 2858-2863, Orlando, USA, December 1994.
A Rank of Coefficient Matrix

Lemma 13 Let A be an n × n matrix with all eigenvalues inside the unit circle and let B be an n × m matrix. Define the matrix polynomial F_u(q⁻¹) as in (18). The coefficient matrix is then

\[
\begin{bmatrix} F_{u\,n} & \cdots & F_{u\,1} \end{bmatrix},
\]

and it has the same rank as the reachability matrix of (A, B).

PROOF. The result follows from a simple construction. Let a(q⁻¹) be as in (19); then

\[
F_u(q^{-1}) = \frac{\mathrm{Adj}\{qI - A\}\, B}{q^n}
= a(q^{-1}) \, [qI - A]^{-1} B
= (a_0 + a_1 q^{-1} + \cdots + a_n q^{-n}) \, q^{-1} (I + q^{-1}A + q^{-2}A^2 + \cdots) B.
\]

Comparison of coefficients leads to

\[
\begin{bmatrix} F_{u\,n} & \cdots & F_{u\,1} \end{bmatrix}
= \begin{bmatrix} A^{n-1}B & A^{n-2}B & \cdots & B \end{bmatrix}
\begin{bmatrix}
a_0 I_m & 0 & \cdots & 0 \\
a_1 I_m & a_0 I_m & & 0 \\
\vdots & & \ddots & \vdots \\
a_{n-1} I_m & a_{n-2} I_m & \cdots & a_0 I_m
\end{bmatrix},
\]

where I_m is the m × m identity matrix. The last factor is clearly non-singular (since a₀ = 1), so the lemma is proven.

The lemma shows that the first block row of S as defined in (20) has full row rank if (A, [B K]) is reachable. This implies that S has full row rank (since a₀ = 1).
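The factorization in the proof is easy to verify numerically for a randomly generated stable pair (A, B); the dimensions and seed below are arbitrary choices:

```python
import numpy as np

rng = np.random.default_rng(1)
n, m = 3, 2
A = rng.standard_normal((n, n))
A *= 0.9 / max(abs(np.linalg.eigvals(A)))  # scale spectral radius below 1
B = rng.standard_normal((n, m))

a = np.poly(A)  # characteristic polynomial [a0, a1, ..., an], a0 = 1

# coefficients F_{u,k} of F_u(q^-1) = a(q^-1) [qI - A]^{-1} B, for k = 1..n:
# F_{u,k} = sum_{i=0}^{k-1} a_i A^{k-1-i} B
Fu = {k: sum(a[i] * np.linalg.matrix_power(A, k - 1 - i) @ B for i in range(k))
      for k in range(1, n + 1)}
coeff = np.hstack([Fu[k] for k in range(n, 0, -1)])  # [F_un ... F_u1]

# reachability-style factor and lower block-Toeplitz factor of a_i I_m
reach = np.hstack([np.linalg.matrix_power(A, n - 1 - j) @ B for j in range(n)])
T = np.zeros((n * m, n * m))
for i in range(n):
    for j in range(i + 1):
        T[i*m:(i+1)*m, j*m:(j+1)*m] = a[i - j] * np.eye(m)

print(np.allclose(coeff, reach @ T))  # the factorization holds
print(np.linalg.matrix_rank(coeff) == np.linalg.matrix_rank(reach))
```

Since T is block lower triangular with a₀I_m on the diagonal and a₀ = 1, it is non-singular, and the two ranks agree exactly as the lemma states.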
B Rank of Sylvester Matrix

Consider the Sylvester matrix S_y in (33), generalized to the multi-input case. This means that every a_i is replaced by a_i I_m and that every F_{y i} is now a matrix of dimension l × m. The dimension of S_y becomes ((α + β)m + αl) × ((α + β + n)m). Observe that the relation (32) holds. Define

\[
z = \begin{bmatrix} \bar{y}_d(t) \\ u_\beta(t) \\ \bar{u}_\alpha(t+\beta) \end{bmatrix}
\]

and express z in the following two equivalent ways:

\[
z = S_y \, \frac{1}{a(q^{-1})}\, u_{\alpha+\beta+n}(t - n)
\]

and

\[
z = \begin{bmatrix}
\Gamma & * & * \\
0 & I_{\beta m} & 0 \\
0 & 0 & I_{\alpha m}
\end{bmatrix}
\begin{bmatrix} x_d(t) \\ u_\beta(t) \\ \bar{u}_\alpha(t+\beta) \end{bmatrix}
=: \Psi \begin{bmatrix} x_d(t) \\ u_\beta(t) \\ \bar{u}_\alpha(t+\beta) \end{bmatrix}.
\]

Next, study

\[
E\{zz^T\} = S_y \, E\left\{ \frac{1}{a(q^{-1})}\, u_{\alpha+\beta+n}(t-n)\, \frac{1}{a(q^{-1})}\, u_{\alpha+\beta+n}^T(t-n) \right\} S_y^T \tag{B.1}
\]
\[
= \Psi \, E\left\{
\begin{bmatrix} x_d(t) \\ u_\beta(t) \\ \bar{u}_\alpha(t+\beta) \end{bmatrix}
\begin{bmatrix} x_d(t) \\ u_\beta(t) \\ \bar{u}_\alpha(t+\beta) \end{bmatrix}^T
\right\} \Psi^T. \tag{B.2}
\]

From the proof of Theorem 6 we know that the covariance matrix in the middle of (B.2) is positive definite if u(t) is at least PE(α + β + n) and if (A, B) is reachable. If rank{Γ} = n, then Ψ is full column rank and it follows that

\[
\mathrm{rank}\, E\{zz^T\} = n + (\alpha + \beta)m. \tag{B.3}
\]

Furthermore, if u(t) is PE(α + β + n), then

\[
E\left\{ \frac{1}{a(q^{-1})}\, u_{\alpha+\beta+n}(t-n)\, \frac{1}{a(q^{-1})}\, u_{\alpha+\beta+n}^T(t-n) \right\}
\]

is positive definite [17]. Notice that, throughout the paper, we assume that the system is stable. This assumption implies that a(z) has all roots outside the unit circle. Comparing (B.3) and (B.1), it can be concluded that

\[
\mathrm{rank}\{S_y\} = n + (\alpha + \beta)m.
\]

This result holds if (A, B) is reachable, if (A, C) is observable, and if α is greater than or equal to the observability index. The observability and reachability assumptions are equivalent to requiring that the polynomials F_y(z) and a(z) do not share any common zeros.
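The closing coprimeness statement can be illustrated with the classical scalar Sylvester resultant matrix, a simpler object than the S_y of (33) used here only as an analogy: it is non-singular exactly when the two polynomials share no common zero.

```python
import numpy as np

def sylvester(p, q):
    """Classical Sylvester resultant matrix of polynomials
    p(z) = p[0] z^dp + ... + p[dp] and q(z) = q[0] z^dq + ... + q[dq]."""
    dp, dq = len(p) - 1, len(q) - 1
    S = np.zeros((dp + dq, dp + dq))
    for i in range(dq):            # dq shifted copies of p's coefficients
        S[i, i:i + dp + 1] = p
    for i in range(dp):            # dp shifted copies of q's coefficients
        S[dq + i, i:i + dq + 1] = q
    return S

# coprime pair: (z - 0.5)(z - 0.2) versus (z - 0.9)
p1, q1 = np.poly([0.5, 0.2]), np.poly([0.9])
# common zero at z = 0.5: (z - 0.5)(z - 0.2) versus (z - 0.5)
p2, q2 = np.poly([0.5, 0.2]), np.poly([0.5])

print(abs(np.linalg.det(sylvester(p1, q1))) > 1e-9)  # True: full rank
print(abs(np.linalg.det(sylvester(p2, q2))) < 1e-9)  # True: rank deficient
```

The same mechanism is behind the rank statement above: a common zero of F_y(z) and a(z) (equivalently, loss of reachability or observability) makes the generalized Sylvester matrix drop rank.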