Worst-case System Identification using Binary Sensors: Input Design and Time Complexity
Marco Casini, Andrea Garulli, Antonio Vicino
Dipartimento di Ingegneria dell'Informazione, Università di Siena, Via Roma 56, 53100 Siena, Italy
{casini,garulli,vicino}@ing.unisi.it
Abstract. Binary-valued sensors are widespread in monitoring and control systems, due to both their simplicity of use and low cost. This paper addresses system identification using binary-valued sensors in a worst-case set-membership setting. The main contribution is the solution of the optimal input design problem for identification of scalar gains. Two different cost functions are considered for input design: the maximum parametric identification error and the relative uncertainty reduction with respect to the minimum achievable error. It is shown that in the latter case, the solution enjoys the property of being independent of the length of the identification experiment and as such it can be implemented as an optimal recursive procedure over a time interval of arbitrary length. A further contribution is a new upper bound on time complexity for identification of FIR models, which improves over existing bounds in the literature. The resulting input sequence can be combined with the optimal input design strategy for scalar gains, in order to devise suboptimal input signals for identification of FIR models of arbitrary order.
I. I NTRODUCTION Recent years have witnessed a considerable amount of work on identification of systems with binary valued or quantized measurements. Binary data is generated by binary sensors which are widespread devices characterized by a threshold according to which the output is digitized. Similarly, multi-threshold sensors or banks of cascaded binary sensors generate quantized measurements, with a quantization error depending on the number of thresholds or binary sensors.
This identification approach draws motivation from several engineering fields. Communication systems offer contexts, such as ATM networks, where traffic information (e.g., bit rate, queue length) is measured through binary sensors characterized by appropriate thresholds. Typical sensors used in monitoring and control systems of industrial production plants are binary devices, popular examples being chemical process sensors in the gas and oil industry, or sensors monitoring liquid or pressure levels. Binary-valued or quantized measurements can be easily found in a number of automotive applications, including switching sensors for exhaust gas oxygen, shift-by-wire and ignition systems, photoelectric sensors for position detection, Hall-effect sensors for speed and acceleration measurement, ABS (anti-lock braking systems), and others. The widespread diffusion of binary sensors is mainly due to their relatively low cost and to the fact that, in spite of the simplicity of the control laws adopted, monitoring and control of industrial plants often calls for the measurement of a large number of variables. Indeed, binary sensor networks are becoming a key issue in decentralized detection and surveillance systems [1], [2], as well as in production plants and, in general, in complex technological systems. In their pioneering work, Wang, Zhang and Yin [3] introduced and motivated this problem very clearly, providing a framework to deal with it. Of course, while the literature on regular measurements provides exact or approximate solutions to almost all of the basic issues, like estimation quality evaluation, model complexity and unmodeled dynamics, optimal input design, time complexity and estimate convergence (see, e.g., [4]), the scenario is quite different when dealing with quantized measurements. The main difficulty in this case is to tackle the discontinuous nonlinearity present in the sensor, which drastically reduces the information conveyed by the measurements.
While in [3] several important results have been obtained for FIR models either in a stochastic setting or in a worst-case setting, in [5] further results on output error models have been pursued in the stochastic framework. Consistency properties of weighted least-squares algorithms for parameter estimation based on binary data have been studied in [6]. The stochastic approach introduced in [3] has been applied to cope with several problems involving binary valued observations, including identification of Wiener systems [7], identification and tracking of linear systems with time-varying parameters [8] and design of state observers [9]. Further contributions on system identification with quantized information can be found in [10]–[12]. On the other hand, when the identification problem is cast in a worst-case deterministic setting,
key issues such as optimal input design and time complexity estimation are still open problems. Preliminary results in this respect have been presented in [13] for binary sensors, and in [14], [15] for quantized measurements. An attempt to introduce a joint framework taking into account the features of both the stochastic and the deterministic setting has been recently made in [16]. The aim of the present paper is to develop a complete analysis of FIR models in a worst-case setting, based on the set-membership paradigm of uncertainty representation [17]–[19]. As is well known, time complexity represents the minimum number of measurements to be performed to reduce the estimation uncertainty below a given threshold [20], [21]. In this sense, an upper bound on the time complexity of a fixed order FIR model is derived in this paper, which improves over the bound obtained in [3]. The upper bound is obtained by choosing a specific class of inputs, whose main feature is that of exciting each single sample of the impulse response independently. A second contribution of the paper regards optimal input design. Specifically, for the case of identification of a scalar gain, an input is devised which minimizes the worst-case identification error. Moreover, an exact formula for the time complexity is derived. These results can be fruitfully used together with the first contribution of the paper to construct efficient suboptimal procedures for identification of FIR models of arbitrary order. Since the optimal input sequence for identification of a scalar gain derived by minimizing the worst-case error depends on the length of the sequence itself, the resulting excitation experiment and related estimation algorithm turn out to be a batch procedure, where the input sequence length is a design parameter, which cannot be tuned during the execution of the experiment.
Actually, in several contexts, one wants to keep the option of increasing the input length, without running a new identification experiment from the beginning. In these cases, it is fairly natural to look for an optimal recursive algorithm, where the parameter estimate updating can be performed for an arbitrary number of steps, preserving optimality of the solution achieved. In this perspective, a further contribution of the paper consists in introducing a new worst-case cost function, such that the resulting optimal input selection strategy is recursive. Roughly speaking, this cost function is a normalized error representing the worst-case relative reduction of the gain uncertainty estimate with respect to the asymptotic error, i.e., the minimum achievable uncertainty as the input length tends to infinity. Beyond the recursive nature of the optimal input, numerical experiments show that this algorithm is quite effective for FIR model identification, providing accuracy results which outperform the input design algorithm based on the actual identification error, in several
test cases.
The paper is organized as follows. Section II introduces notation and problem formulation. In Section III a general upper bound on the time complexity is derived. Section IV addresses the problem of optimal input design with respect to the actual worst-case identification error, while Section V provides the alternative optimal input design strategy based on the worst-case normalized identification error. A numerical example highlighting the comparison between the two input design approaches is reported in Section VI, while concluding remarks and future perspectives are reported in Section VII.

II. PROBLEM FORMULATION
Let $\mathbb{R}^N$ denote the $N$-dimensional Euclidean space. A sequence of real numbers $\{x(t),\ t = 1, \ldots, N\}$ will be identified with a vector $x \in \mathbb{R}^N$, and $\|x\|_p$ will denote the standard $\ell_p$ norm. Let $B_p(c, r) = \{x \in \mathbb{R}^N : \|x - c\|_p \le r\}$ be the ball of radius $r \ge 0$ and center $c \in \mathbb{R}^N$ in the $\ell_p$ norm. Let us denote by $\log(x)$ the base-2 logarithm of $x$, and by $\lceil x \rceil$ the minimum integer greater than or equal to $x$. Let us consider an $n$-th order FIR SISO linear time-invariant model

$$y(t) = \sum_{i=1}^{n} \theta_i\, u(t-i+1) + d(t) \qquad (1)$$
where $u(t)$ is the input signal, bounded in the max norm, $\|u\|_\infty \le U$, and $d(t)$ represents the output disturbance. The model parameters $\{\theta_i,\ i = 1, \ldots, n\}$ represent the truncated system impulse response. The disturbance $d(t)$ is assumed to be bounded by a known quantity, i.e., $|d(t)| \le \delta$, $t = 1, 2, \ldots$. The true system generating the data is assumed to be exponentially stable. Notice that since exponential stability of the system and boundedness of the input $u(t)$ are assumed, unmodeled dynamics (i.e., the system impulse response tail $\{\theta_i,\ i = n+1, \ldots\}$) can be accounted for by suitably tuning the noise bound $\delta$. Observations at the system output are taken by a binary sensor with a known threshold $C$, such that
$$s(t) = \begin{cases} 1 & \text{if } y(t) \le C \\ 0 & \text{if } y(t) > C. \end{cases} \qquad (2)$$
Let $\theta^T = [\theta_1, \theta_2, \ldots, \theta_n] \in \mathbb{R}^n$ denote the FIR parameter vector and $\varphi^T(t) = [u(t), \ldots, u(t-n+1)]$ the regressor vector. Then, (1) can be expressed as

$$y(t) = \varphi^T(t)\, \theta + d(t). \qquad (3)$$
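As an illustration of the measurement model (1)-(3), the following sketch simulates a binary sensor observing a FIR system. All numerical values (the parameters $\theta_i$, the threshold $C$, the noise bound $\delta$ and the input) are illustrative assumptions, not taken from the paper.

```python
import random

# Illustrative FIR parameters, threshold and noise bound (assumptions).
theta = [0.8, 0.5, 0.3]       # theta_1, ..., theta_n
C, delta = 1.0, 0.05
random.seed(0)

def binary_output(u, theta, C, delta):
    """Binary measurements s(t) per (1)-(3), with |d(t)| <= delta."""
    n = len(theta)
    s = []
    for t in range(len(u)):
        # regressor phi(t) = [u(t), u(t-1), ..., u(t-n+1)], zero before start
        phi = [u[t - i] if t - i >= 0 else 0.0 for i in range(n)]
        d = random.uniform(-delta, delta)        # bounded disturbance
        y = sum(th * p for th, p in zip(theta, phi)) + d
        s.append(1 if y <= C else 0)
    return s

u = [1.0, 0.0, 0.0, 2.0, 0.0, 0.0]
s = binary_output(u, theta, C, delta)
print(s)   # s[0] = 1 since y(1) ~ 0.8 <= C; s[3] = 0 since y(4) ~ 1.6 > C
```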
Let $F_0 = B_p(c, \varepsilon_0)$ represent the prior information available on the FIR parameter vector. Let us now denote by $u, s \in \mathbb{R}^N$ the input signal $\{u(t),\ t = 1, \ldots, N\}$ and the sequence of binary measurements $\{s(t),\ t = 1, \ldots, N\}$, respectively. For a given input-output realization $\{u, s\}$ of length $N$, the feasible parameter set is defined as:

$$F_N = \{\theta \in F_0 :\ \varphi^T(t)\,\theta \le C + \delta \text{ if } s(t) = 1;\ \varphi^T(t)\,\theta > C - \delta \text{ if } s(t) = 0;\ t = 1, \ldots, N\}. \qquad (4)$$

The worst-case local identification error is defined as

$$e_p(N, u, s) = \inf_{c \in \mathbb{R}^n}\ \sup_{\theta \in F_N} \|\theta - c\|_p. \qquad (5)$$
For a fixed input sequence $u$, let us define the global worst-case error with respect to the disturbance realization as

$$e_p(N, u) = \sup_{s \in S_0}\ e_p(N, u, s), \qquad (6)$$
where $S_0 = \{s : F_N \neq \emptyset\}$. Since we are interested in optimal input design and time complexity of the identification problem, the aim is to compute the minimum worst-case identification error [20]

$$e_p(N) = \inf_{u:\, \|u\|_\infty \le U}\ e_p(N, u). \qquad (7)$$
For a given level of accuracy $\varepsilon < \varepsilon_0$, we define the time complexity as the minimum time length of the experiment such that the optimal worst-case error reaches the accuracy $\varepsilon$, i.e.

$$N(\varepsilon) = \min\,\{N :\ e_p(N) \le \varepsilon\}. \qquad (8)$$
In the next section, an upper bound for the time complexity $N(\varepsilon)$ and suboptimal design techniques for choosing the input $u$ are presented.

III. AN UPPER BOUND TO TIME COMPLEXITY
Let us define

$$\gamma \triangleq \min_{i=1,\ldots,n}\ \{|\theta_i| :\ \theta \in F_0\}. \qquad (9)$$

The following assumptions are enforced throughout the paper.
Assumption 1:
i) $\gamma > C/U$.
ii) The signs of the FIR parameters $\theta_i$ are known.
The first assumption is necessary to guarantee that the initial uncertainty on each parameter can be reduced by using the information from the binary sensor (see [3, Prop. 1]). The signs of the parameters $\theta_i$ can be easily detected by performing a suitable preliminary identification experiment, as discussed in Subsection VI-A. For $F_0 = B_p(c, \varepsilon_0)$ and $\delta = 0$ (noise-free observations), the following lemma holds (see [3, Lemma 2]).
Lemma 1: Let $n = 1$ and $\gamma > C/U$. There exists an input sequence $u$ such that a sequence of observations $s$ of length $N$ allows one to reduce the worst-case identification error from $\varepsilon_0$ to $2^{-N}\varepsilon_0$.
We can now state the following theorem.
Theorem 1: Let $F_0 = B_p(c, \varepsilon_0)$ and $\delta = 0$. Then, the time complexity to reduce the worst-case error from $\varepsilon_0$ to $\varepsilon$ is bounded as follows:
$$N(\varepsilon) \le \frac{n(n+1)}{2} \left\lceil \frac{1}{p}\log n + \log \frac{\varepsilon_0}{\varepsilon} \right\rceil. \qquad (10)$$
Proof: The proof is based on the following idea. Consider a FIR model. We want to excite each FIR parameter individually, with a predefined value of the input. Assume that we want to excite $\theta_i$ with a given value of the input $u_i$. In order to do this, it can be shown that one has to feed the system with an input $u(1) = u_i$, $u(t) = 0$, $t = 2, \ldots, i$. Since this reasoning holds for $i = 1, \ldots, n$, the number of input samples needed to excite the $n$ FIR parameters is at least $n(n+1)/2$. Now, we prove constructively that a suitable input sequence $u(t)$, $t = 1, \ldots, k\,n(n+1)/2$, can be generated which allows one to excite the $n$ FIR parameters $k$ times individually. To this purpose, let us assume for the moment that $n$ is even and let us denote by $u_i^j$ the input sample exciting the $i$-th coefficient for the $j$-th time ($i = 1, \ldots, n$, $j = 1, \ldots, k$). Let us consider the following input: $u(1) = u_1^1$, $u(2) = u_n^1$, $u(3) = \ldots = u(n+1) = 0$.
The corresponding output is:

$$y(1) = \theta_1 u_1^1$$
$$y(2) = \theta_1 u_n^1 + \theta_2 u_1^1$$
$$\vdots$$
$$y(n) = \theta_{n-1} u_n^1 + \theta_n u_1^1$$
$$y(n+1) = \theta_n u_n^1.$$
Thus, at times $1$ and $n+1$ the coefficients $\theta_1$ and $\theta_n$ have been excited individually. It is straightforward to note that it is possible to iterate the previous procedure $k$ times, so that, after $k(n+1)$ samples, the coefficients $\theta_1$ and $\theta_n$ have been correctly excited $k$ times. Now, with a similar procedure, it is shown how it is possible to excite the coefficients $\theta_2$ and $\theta_{n-1}$. In this case, the input block of length $n+1$ to be repeated $k$ times is (let us reset the time to 0 for simplicity of exposition): $u(1) = u_2^1$, $u(2) = 0$, $u(3) = u_{n-1}^1$, $u(4) = \ldots = u(n+1) = 0$. The corresponding output is:

$$y(1) = \theta_1 u_2^1$$
$$y(2) = \theta_2 u_2^1$$
$$y(3) = \theta_1 u_{n-1}^1 + \theta_3 u_2^1$$
$$\vdots$$
$$y(n) = \theta_{n-2} u_{n-1}^1 + \theta_n u_2^1$$
$$y(n+1) = \theta_{n-1} u_{n-1}^1.$$
So, at times $2$ and $n+1$ the coefficients $\theta_2$ and $\theta_{n-1}$ have been correctly excited. As before, it is now possible to iterate the procedure $k$ times. Note that the tail of the first input block will not interfere with the second block at the times of interest, i.e., $2$ and $n+1$. In fact, let us apply the second input block $u(n+2) = u_2^2$, $u(n+3) = 0$, $u(n+4) = u_{n-1}^2$, $u(n+5) = \ldots = u(2n+2) = 0$.
It follows that the corresponding output is

$$y(n+2) = \theta_1 u_2^2 + \theta_n u_{n-1}^1$$
$$y(n+3) = \theta_2 u_2^2$$
$$\vdots$$
$$y(2n+2) = \theta_{n-1} u_{n-1}^2,$$

and at times $n+3$ and $2n+2$ the coefficients $\theta_2$ and $\theta_{n-1}$ have been excited individually for the second time. The procedure can be repeated for all the $n/2$ pairs of coefficients $\theta_i$ and $\theta_{n-i+1}$, leading to the excitation of all the FIR coefficients. The total input length is therefore

$$N = k(n+1)\frac{n}{2}. \qquad (11)$$
If $n$ is odd, it is possible to repeat the previous reasoning, except for the last input block, which has to excite only the parameter $\theta_{(n+1)/2}$ and will contain $(n+1)/2$ samples. Again, it is easy to check that equation (11) provides the total number of samples needed. Now, let us consider the uncertainty on each parameter. We know by Lemma 1 that the uncertainty interval on each parameter $\theta_i$ can be reduced by a factor $2^{-k}$ after $k$ observations obtained through a suitable choice of the input samples $u_i^j$, $j = 1, \ldots, k$. Denote by $u^*(t)$ the global input obtained by iterating $k$ blocks constructed as shown above. The worst-case error $e_p(N, u^*, s)$ for an observation sequence $s$ can be bounded as follows:

$$e_p(N, u^*, s) \le n^{1/p}\, e_\infty(N, u^*, s) \le n^{1/p}\, 2^{-N/(n(n+1)/2)}\, e_p(0, u, s) = n^{1/p}\, 2^{-N/(n(n+1)/2)}\, \varepsilon_0,$$

where we used the facts that $e_\infty(0, u, s) \le e_p(0, u, s)$ and $\|x\|_p \le n^{1/p}\|x\|_\infty$. Now, imposing that

$$n^{1/p}\, 2^{-N/(n(n+1)/2)}\, \varepsilon_0 \le \varepsilon$$

leads to the inequality (10).
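The threshold-halving mechanism of Lemma 1, on which the proof above relies, can be sketched numerically for a scalar gain in the noise-free case. The gain, threshold and prior interval below are illustrative assumptions.

```python
# Noise-free bisection for n = 1 (Lemma 1): choosing u = C/mid places the
# sensor threshold at the midpoint of the current feasible interval, so
# each binary measurement halves the uncertainty: radius_N = 2**-N * eps0.
C = 10.0
a_true = 7.3            # unknown gain (illustrative), a_true > C/U
lo, hi = 2.0, 100.0     # prior feasible interval F0
eps0 = (hi - lo) / 2
N = 20

for _ in range(N):
    mid = (lo + hi) / 2
    u = C / mid                        # steer the threshold to mid
    s = 1 if a_true * u <= C else 0    # binary measurement (2), d(t) = 0
    if s == 1:
        hi = mid                       # a_true <= mid: keep lower half
    else:
        lo = mid                       # a_true > mid: keep upper half

radius = (hi - lo) / 2
print(radius, eps0 * 2 ** -N)          # exact halving at every step
```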
Remark 1: Note that the upper bound provided in Theorem 1 improves over that given in [3, Thm. 10], where it was shown that $N(\varepsilon) \le (n^2 - n + 1)\left\lceil \frac{1}{p}\log n + \log\frac{\varepsilon_0}{\varepsilon} \right\rceil$. In fact, $n(n+1)/2 < n^2 - n + 1$, $\forall\, n > 2$. By following the same reasoning as in [3, Thm. 11], it can be shown that the same improvement applies as well to the time complexity bound for the case of noisy observations (i.e., $\delta > 0$).
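The improvement stated in Remark 1 is easy to check numerically. In the sketch below, the choices $p = 2$ and $\varepsilon_0/\varepsilon = 2^{10}$ are illustrative assumptions.

```python
from math import ceil, log2

def bound_new(n, p, ratio):
    # Theorem 1, eq. (10): n(n+1)/2 * ceil((1/p) log n + log(eps0/eps))
    return n * (n + 1) // 2 * ceil(log2(n) / p + log2(ratio))

def bound_old(n, p, ratio):
    # [3, Thm. 10]: (n^2 - n + 1) * ceil((1/p) log n + log(eps0/eps))
    return (n ** 2 - n + 1) * ceil(log2(n) / p + log2(ratio))

for n in (3, 5, 10):
    print(n, bound_new(n, 2, 2 ** 10), bound_old(n, 2, 2 ** 10))
# bound_new is strictly smaller than bound_old for every n > 2
```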
The input sequence introduced in the proof of Theorem 1 suggests a strategy for tackling the input design problem (7). The idea is to select the input samples which individually excite the parameter $\theta_i$ in order to achieve the maximum uncertainty reduction for that parameter. Optimal input design for each scalar parameter $\theta_i$ will result in an effective suboptimal solution for FIR models of arbitrary order. For this reason, the exact solution of the optimal input design problem for scalar gains is provided in the next section.

IV. INPUT DESIGN FOR IDENTIFICATION OF GAINS

In this section, we address the optimal input design problem for the special case of FIR systems of order $n = 1$. The following setting will be considered:

$$y(t) = a\,u(t) + d(t), \qquad s(t) = \begin{cases} 1 & \text{if } y(t) \le C \\ 0 & \text{if } y(t) > C, \end{cases}$$

with $|u(t)| \le U$, $|d(t)| \le \delta$ and $a \in F_0 = [\underline{a}_0, \overline{a}_0]$.
Let us denote by $F_t = [\underline{a}_t, \overline{a}_t]$ the feasible parameter set at time $t$. The aim is to choose the input signal $u(t)$ at time $t$ as a function of the available information up to time $t-1$, i.e.

$$u(t) = \eta(F_{t-1};\, t), \qquad t = 1, 2, \ldots, N.$$

In order to stress that the set-theoretic information is exploited in the input choice, the optimal input design problem (7) is rewritten as

$$u^* = \arg \inf_{\substack{u:\ \|u\|_\infty \le U \\ u(t) = \eta(F_{t-1};\, t)}} e_p(N, u). \qquad (12)$$
According to Assumption 1, one has $\underline{a}_0 > C/U$. Moreover, let $C > \delta > 0$ and define $\beta \triangleq \delta/C$. The following proposition clarifies under which conditions a reduction of the feasible parameter set is possible [3, Thm. 14].
Proposition 1: The following statements hold:
a) At a given time $t$, uncertainty reduction is possible if and only if $\frac{\overline{a}_{t-1}}{\underline{a}_{t-1}} > \frac{1+\beta}{1-\beta}$.
b) If $\frac{\overline{a}_0}{\underline{a}_0} > \frac{1+\beta}{1-\beta}$, then there exists an input sequence such that $\lim_{N\to\infty} \frac{\overline{a}_N}{\underline{a}_N} = \frac{1+\beta}{1-\beta}$.
c) If $\frac{\overline{a}_{t-1}}{\underline{a}_{t-1}} > \frac{1+\beta}{1-\beta}$, then there does not exist any input $u(t)$ such that $\frac{\overline{a}_t}{\underline{a}_t} \le \frac{1+\beta}{1-\beta}$.
By Proposition 1, it follows that the condition

$$\frac{\overline{a}_0}{\underline{a}_0} > \frac{1+\beta}{1-\beta} \qquad (13)$$
is necessary in order to reduce the radius of the initial feasible parameter set, and it is also sufficient to guarantee that uncertainty reduction is possible at any time $t > 0$. Hereafter, assumption (13) will be enforced. The following result provides the solution to problem (12).
Theorem 2: Let (13) hold and let $N$ be the length of the input signal to be applied. Then, the optimal input solving problem (12) is given by

$$u^*(t) = \frac{C}{\tilde{a}_{t|N}}, \qquad t = 1, 2, \ldots, N \qquad (14)$$

where

$$\tilde{a}_{t|N} = \frac{\overline{a}_{t-1}(1+\beta)^{(2^{N-t}-1)} + \underline{a}_{t-1}(1-\beta)^{(2^{N-t}-1)}}{(1+\beta)^{(2^{N-t})} + (1-\beta)^{(2^{N-t})}}$$

with estimate updating laws:

$$\underline{a}_t = \underline{a}_{t-1},\quad \overline{a}_t = \tilde{a}_{t|N}(1+\beta) \qquad \text{if } s(t) = 1$$
$$\underline{a}_t = \tilde{a}_{t|N}(1-\beta),\quad \overline{a}_t = \overline{a}_{t-1} \qquad \text{if } s(t) = 0. \qquad (15)$$

Moreover, the radius of the feasible parameter set $F_N$ is

$$\varepsilon_N = \beta\, \frac{\overline{a}_0(1+\beta)^{(2^N-1)} - \underline{a}_0(1-\beta)^{(2^N-1)}}{(1+\beta)^{(2^N)} - (1-\beta)^{(2^N)}}. \qquad (16)$$
Proof: Let us define

$$P_N \triangleq \prod_{j=0}^{N-1} \left[ (1+\beta)^{(2^j)} + (1-\beta)^{(2^j)} \right].$$

First, let us prove that

$$P_N = \frac{(1+\beta)^{(2^N)} - (1-\beta)^{(2^N)}}{2\beta}. \qquad (17)$$

For $N = 1$, (17) holds trivially. Moreover, if (17) holds for a generic $N$, then it holds also for $N+1$, because

$$P_{N+1} = P_N \cdot \left[ (1+\beta)^{(2^N)} + (1-\beta)^{(2^N)} \right] = \frac{\left[(1+\beta)^{(2^N)} - (1-\beta)^{(2^N)}\right] \cdot \left[(1+\beta)^{(2^N)} + (1-\beta)^{(2^N)}\right]}{2\beta} = \frac{(1+\beta)^{(2^{N+1})} - (1-\beta)^{(2^{N+1})}}{2\beta}.$$
Let us now demonstrate (15) and (16) by induction. First, let us show that $\tilde{a}_{1|1} = \frac{\underline{a}_0 + \overline{a}_0}{2}$. In fact, by applying $u(1) = C/\tilde{a}_{1|1}$, one obtains two possible feasible sets at time 1, namely $F_1^{(0)} = [\tilde{a}_{1|1}(1-\beta),\, \overline{a}_0]$ or $F_1^{(1)} = [\underline{a}_0,\, \tilde{a}_{1|1}(1+\beta)]$, depending on the value of $s(1)$ being respectively 0 or 1. Notice that (13) guarantees the existence of an $\tilde{a}_{1|1}$ such that both $F_1^{(0)}$ and $F_1^{(1)}$ are strictly contained in $F_0$. The maximum worst-case uncertainty reduction is thus obtained by choosing $\tilde{a}_{1|1}$ so that the size of both intervals is the same, thus leading to $\tilde{a}_{1|1} = \frac{\underline{a}_0 + \overline{a}_0}{2}$. By substituting this value of $\tilde{a}_{1|1}$ in $F_1^{(0)}$ or $F_1^{(1)}$, one gets $\varepsilon_1 = \frac{\overline{a}_0(1+\beta) - \underline{a}_0(1-\beta)}{4}$, which corresponds to (16). It is worth remarking that the resulting input $u(1) = \frac{2C}{\underline{a}_0 + \overline{a}_0}$ satisfies $|u(1)| < U$ thanks to Assumption 1.
Now, let us assume that (15) and (16) hold for a given $N$. We want to show that they hold also for an input of length $N+1$. By (17), (16) can be rewritten as

$$\varepsilon_N = \frac{\overline{a}_0(1+\beta)^{(2^N-1)} - \underline{a}_0(1-\beta)^{(2^N-1)}}{2P_N}. \qquad (18)$$

Observe that the last $N$ samples of the optimal input of length $N+1$ can be chosen by applying to the feasible set at time 1, i.e. $[\underline{a}_1, \overline{a}_1]$, the same input selection strategy adopted for computing the optimal input of length $N$ for the initial feasible set $[\underline{a}_0, \overline{a}_0]$. In other words, the second optimal input sample for an input of length $N+1$ depends on $[\underline{a}_1, \overline{a}_1]$ in the same way as the first optimal sample of an input of length $N$ depends on $[\underline{a}_0, \overline{a}_0]$, and so on. Indeed, if in (15) the time indices of the feasible set bounds $\underline{a}$ and $\overline{a}$ are suitably rearranged, one has $\tilde{a}_{t+1|N+1} = \tilde{a}_{t|N}$ for all $t = 1, \ldots, N$. Moreover, since (16) holds for an input of length $N$, the radius of the final feasible set $F_{N+1}$ for the optimal input sequence of length $N+1$ can be expressed, according to (18), as

$$\varepsilon_{N+1} = \frac{\overline{a}_1(1+\beta)^{(2^N-1)} - \underline{a}_1(1-\beta)^{(2^N-1)}}{2P_N}. \qquad (19)$$

We have now to select the first input sample $u^*(1) = C/\tilde{a}_{1|N+1}$. The choice of $\tilde{a}_{1|N+1}$ leads to two possible feasible sets at time 1: $F_1^{(0)} = [\tilde{a}_{1|N+1}(1-\beta),\, \overline{a}_0]$ if $s(1) = 0$, or $F_1^{(1)} = [\underline{a}_0,\, \tilde{a}_{1|N+1}(1+\beta)]$ if $s(1) = 1$. From (19), the two corresponding final radii turn out to be

$$\varepsilon_{N+1}^{(0)} = \frac{\overline{a}_0(1+\beta)^{(2^N-1)} - \tilde{a}_{1|N+1}(1-\beta)^{(2^N)}}{2P_N}$$

and

$$\varepsilon_{N+1}^{(1)} = \frac{\tilde{a}_{1|N+1}(1+\beta)^{(2^N)} - \underline{a}_0(1-\beta)^{(2^N-1)}}{2P_N}.$$

The optimal choice of $\tilde{a}_{1|N+1}$ is such that $\varepsilon_{N+1}^{(0)} = \varepsilon_{N+1}^{(1)}$, i.e.,

$$\overline{a}_0(1+\beta)^{(2^N-1)} - \tilde{a}_{1|N+1}(1-\beta)^{(2^N)} = \tilde{a}_{1|N+1}(1+\beta)^{(2^N)} - \underline{a}_0(1-\beta)^{(2^N-1)},$$
which leads to

$$\tilde{a}_{1|N+1} = \frac{\overline{a}_0(1+\beta)^{(2^N-1)} + \underline{a}_0(1-\beta)^{(2^N-1)}}{(1+\beta)^{(2^N)} + (1-\beta)^{(2^N)}},$$

that is in accordance with (15). Finally, to compute $\varepsilon_{N+1}$ one has to substitute $\tilde{a}_{1|N+1}$ in $\varepsilon_{N+1}^{(0)}$ (or in $\varepsilon_{N+1}^{(1)}$), thus obtaining

$$\varepsilon_{N+1} = \frac{1}{2P_N}\left\{ \overline{a}_0(1+\beta)^{(2^N-1)} - (1-\beta)^{(2^N)}\, \frac{\overline{a}_0(1+\beta)^{(2^N-1)} + \underline{a}_0(1-\beta)^{(2^N-1)}}{(1+\beta)^{(2^N)} + (1-\beta)^{(2^N)}} \right\}$$
$$= \frac{\overline{a}_0(1+\beta)^{(2^{N+1}-1)} - \underline{a}_0(1-\beta)^{(2^{N+1}-1)}}{2P_N\left[(1+\beta)^{(2^N)} + (1-\beta)^{(2^N)}\right]} = \frac{\overline{a}_0(1+\beta)^{(2^{N+1}-1)} - \underline{a}_0(1-\beta)^{(2^{N+1}-1)}}{2P_{N+1}}$$
$$= \beta\, \frac{\overline{a}_0(1+\beta)^{(2^{N+1}-1)} - \underline{a}_0(1-\beta)^{(2^{N+1}-1)}}{(1+\beta)^{(2^{N+1})} - (1-\beta)^{(2^{N+1})}} \qquad (20)$$

which concludes the proof.
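The updating laws (15) and the worst-case radius (16) can be checked by simulation. The sketch below runs the optimal input of Theorem 2 against one admissible disturbance realisation ($d(t) = 0$); the final radius can never exceed the worst-case value (16). The values of $\beta$, $C$, the prior interval and the true gain are illustrative assumptions.

```python
# Optimal input of Theorem 2 for a scalar gain, simulated with d(t) = 0.
C, beta = 10.0, 0.1
lo, hi = 1.0, 100.0      # F0 = [a_lower, a_upper] (illustrative)
a_true, N = 40.0, 6      # illustrative true gain and horizon

def a_tilde(lo, hi, k):
    """Optimal threshold placement (14)-(15), with k = N - t steps to go."""
    p = (1 + beta) ** (2 ** k - 1)
    q = (1 - beta) ** (2 ** k - 1)
    return (hi * p + lo * q) / ((1 + beta) ** (2 ** k) + (1 - beta) ** (2 ** k))

for t in range(1, N + 1):
    at = a_tilde(lo, hi, N - t)
    s = 1 if a_true * (C / at) <= C else 0   # measurement (2) with d = 0
    if s == 1:
        hi = at * (1 + beta)                 # updating law (15), s(t) = 1
    else:
        lo = at * (1 - beta)                 # updating law (15), s(t) = 0

# worst-case radius (16) for the initial interval [1, 100]
eps_N = beta * (100.0 * (1 + beta) ** (2 ** N - 1) - 1.0 * (1 - beta) ** (2 ** N - 1)) \
        / ((1 + beta) ** (2 ** N) - (1 - beta) ** (2 ** N))
print((hi - lo) / 2, eps_N)   # realised radius vs. worst-case radius
```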
Remark 2: According to (16), the minimum achievable error in a worst-case sense is given by

$$\varepsilon_\infty \triangleq \lim_{N\to\infty} \varepsilon_N = \frac{\beta\, \overline{a}_0}{1+\beta}. \qquad (21)$$

In order to compute the time complexity as defined in (8), let us introduce the following measure of relative uncertainty reduction at time $N$:

$$Q_N \triangleq \frac{\varepsilon_N - \varepsilon_\infty}{\varepsilon_0 - \varepsilon_\infty}. \qquad (22)$$

The next theorem provides the exact time complexity for the identification of gains.
Theorem 3: Let $\varepsilon_0 \ge \varepsilon > \varepsilon_\infty$. Then, the time complexity for reducing the uncertainty from $\varepsilon_0$ to $\varepsilon$ is

$$N(\varepsilon) = \left\lceil \log \frac{\log\left( \frac{\varepsilon_0 - \varepsilon_\infty}{\varepsilon - \varepsilon_\infty}\, \frac{2\beta}{1-\beta} + 1 \right)}{\log \frac{1+\beta}{1-\beta}} \right\rceil. \qquad (23)$$

Proof: Since $\varepsilon_0 = \frac{\overline{a}_0 - \underline{a}_0}{2}$, from (22) one has

$$Q_N = \frac{2(1+\beta)\varepsilon_N - 2\beta\overline{a}_0}{\overline{a}_0(1-\beta) - \underline{a}_0(1+\beta)} = \frac{2\beta}{\overline{a}_0(1-\beta) - \underline{a}_0(1+\beta)} \cdot \frac{\overline{a}_0(1-\beta)^{(2^N)} - \underline{a}_0(1+\beta)(1-\beta)^{(2^N-1)}}{(1+\beta)^{(2^N)} - (1-\beta)^{(2^N)}}$$
$$= \frac{2\beta\,(1-\beta)^{(2^N-1)}}{(1+\beta)^{(2^N)} - (1-\beta)^{(2^N)}} = \frac{2\beta}{1-\beta} \cdot \frac{1}{\left(\frac{1+\beta}{1-\beta}\right)^{(2^N)} - 1}. \qquad (24)$$
Then, by (24),

$$\left(\frac{1+\beta}{1-\beta}\right)^{(2^N)} = \frac{2\beta}{(1-\beta)\,Q_N} + 1,$$

and therefore

$$N = \log \frac{\log\left( \frac{2\beta}{(1-\beta)\,Q_N} + 1 \right)}{\log \frac{1+\beta}{1-\beta}}.$$

The result follows from (22) with $\varepsilon_N = \varepsilon$.
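The quantities (21)-(24) are straightforward to evaluate numerically. In the sketch below, $\beta = 0.1$ and $F_0 = [1, 100]$ are illustrative values, and the accuracy targets passed to the time-complexity formula (23) are likewise arbitrary.

```python
from math import ceil, log2

beta = 0.1
lo0, hi0 = 1.0, 100.0                    # F0 (illustrative)
eps0 = (hi0 - lo0) / 2
eps_inf = beta * hi0 / (1 + beta)        # minimum achievable error (21)

def Q(N):
    # relative uncertainty reduction (24): depends on beta only
    return (2 * beta / (1 - beta)) / (((1 + beta) / (1 - beta)) ** (2 ** N) - 1)

def N_eps(eps):
    # exact time complexity (23)
    inner = log2((eps0 - eps_inf) / (eps - eps_inf) * 2 * beta / (1 - beta) + 1)
    return ceil(log2(inner / log2((1 + beta) / (1 - beta))))

print(eps_inf, Q(4), N_eps(11.36), N_eps(16.5))
```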
Remark 3: The results in Theorems 1, 2 and 3 can be exploited to devise a suboptimal input design strategy for a FIR model of order $n > 1$. Indeed, the proof of Theorem 1 provides an input sequence that is able to individually excite each FIR parameter $k$ times. Theorem 3 establishes the value of $k$ necessary to guarantee a prescribed uncertainty reduction for a single parameter. Finally, Theorem 2 says that the input values must be chosen according to (15)-(16). The resulting input sequence is suboptimal, as the input design is addressed separately for each FIR parameter. In this respect, the expression of $\varepsilon_N$ in (16) allows one to obtain an upper bound on the minimum achievable identification error. The characterization of the optimal input sequence minimizing the identification error with respect to the entire parameter vector $\theta$ remains an open problem.
It is worth discussing the main differences between the above results and those reported in [3] for the same setting. There, the proposed optimal input is the signal

$$u(t) = \frac{2C}{\underline{a}_{t-1} + \overline{a}_{t-1}}. \qquad (25)$$

Such a signal actually minimizes the size of the worst-case $F_t$ according to the available information up to time $t-1$, but it does not provide the optimal input sequence of length $N$ minimizing the size of $F_N$ (and hence $e_p(N, u)$) in the worst-case sense (12). In other words, if one applies the input signal (25), it is always possible to find a gain $a \in F_0$ and a disturbance sequence $d$ such that the radius of the resulting $F_N$ is larger than $\varepsilon_N$ in (16). Concerning the time complexity, in [3] it is shown that in the noise-free case $N(\varepsilon) = \lceil \log(\varepsilon_0/\varepsilon) \rceil$, while when the disturbance $d$ is present the following upper bound on $\varepsilon_N$ is given:

$$\varepsilon_N \le \frac{1}{2}\,\alpha_2^N\,(\overline{a}_0 - \underline{a}_0) + \frac{\beta\,a\,(1 - \alpha_2^N)}{\alpha_1} \qquad (26)$$

where $\alpha_1 = \frac{1}{2}(1-\beta)$ and $\alpha_2 = \frac{1}{2}(1+\beta)$, which allows one to bound the time complexity as follows:

$$N(\varepsilon) \le \left\lceil \frac{\log\left( \frac{\varepsilon - \beta a/\alpha_1}{\varepsilon_0 - \beta a/\alpha_1} \right)}{\log \alpha_2} \right\rceil. \qquad (27)$$
Notice that such upper bounds depend on the true value of $a$. In order to get a priori upper bounds, $a$ can be replaced by $\overline{a}_0$ in the right-hand side of (26)-(27). Theorem 3 shows that in the noisy setting, the exact time complexity depends not only on $\varepsilon_0$ and $\varepsilon$, but also on $\varepsilon_\infty$ (or equivalently on $\overline{a}_0$). This means that the true time complexity depends not only on the size of the initial feasible set, but also on its location in $\mathbb{R}$. Moreover, the a priori upper bound resulting from (26) generally turns out to be much more conservative than (16), as shown in the following example.
Example 1: Let us consider a FIR model of order 1, and let $F_0 = [1, 100]$, $U \ge 10$, $\delta = 1$ and $C = 10$, i.e., $\beta = 0.1$. In Figure 1, the worst-case radius of the feasible set for different input lengths $N$ is reported for the optimal input $u^*$ given by Theorem 2. The a priori bound obtained by replacing $a$ with $\overline{a}_0$ in (26) is also reported in Figure 1. As expected, the final uncertainty radius guaranteed by the optimal input is always smaller. For the considered example, the improvement provided by $u^*$ with respect to the considered bound can be up to 54%.
Fig. 1. Example 1: $\varepsilon_N$ computed in (16) (solid) and worst-case error bound (26) provided in [3] (dashed), versus the number of samples $N$.
It is interesting to remark that $Q_N$ in (22) is a normalized form of $\varepsilon_N$ taking values between 0 and 1, i.e., $Q_0 = 1$ and $Q_\infty = 0$. Since, according to (24), $Q_N$ depends only on $\beta$, one can conclude that the relative reduction of uncertainty depends only on $\beta$. This means that the number of input samples needed to obtain a prescribed relative uncertainty reduction in a worst-case sense can be derived a priori from the knowledge of $\beta$ alone. Figure 2 shows the plot of $Q_N$ for $0 \le N \le 10$, when $\beta = 0.1$. Notice that after 4 input samples the value of $Q_N$ approaches 0, i.e., no additional input is useful to practically reduce the uncertainty in the worst-case sense. Indeed, from (24) it results $Q_4 \simeq 9.3 \cdot 10^{-3}$.
Fig. 2. Plot of $Q_N$ for $\beta = 0.1$, versus the number of samples $N$.
V. INPUT DESIGN FOR RELATIVE UNCERTAINTY REDUCTION

So far, the considered input design problem has addressed the minimization of the actual estimate uncertainty, i.e. the size of the feasible set. The exact solution for identification of gains, provided by Theorem 2, has shown that the optimal input sequence depends on the length $N$ of the sequence itself, which is clearly an undesirable feature. In fact, one may want to adapt the length of the identification experiment on-line without losing optimality. Moreover, it would be more sensible to measure the performance of the identification procedure with respect to the minimum error achievable in the considered setting, i.e. to optimize the relative uncertainty reduction instead of the absolute one, with the aim of designing a recursive optimal input design strategy. This would also give a rationale for devising a stopping criterion for the identification experiment.
Motivated by the above reasons, in this section a new worst-case error cost function is introduced and the input signal minimizing such a functional is derived for the identification of a scalar gain. The same setting and assumptions as in Section IV are adopted. Let $a \in F$ be the true parameter and let $F = [\underline{a}, \overline{a}]$ at a given time instant. Let us denote by $D_{min}(\underline{a}, \overline{a}, a)$ the minimum size of the worst-case feasible set obtainable by applying an infinite sequence of inputs. In other words, $D_{min}$ represents the irreducible size of the worst-case feasible set. The following result holds.
Lemma 2: Let $F = [\underline{a}, \overline{a}]$, and let $a \in F$ be the true parameter. One has:

$$D_{min}(\underline{a}, \overline{a}, a) = \begin{cases} \dfrac{2\beta}{1-\beta}\,a & \text{if } \underline{a} \le a \le \dfrac{1-\beta}{1+\beta}\,\overline{a} \\[2mm] \dfrac{2\beta}{1+\beta}\,\overline{a} & \text{if } \dfrac{1-\beta}{1+\beta}\,\overline{a} < a \le \overline{a}. \end{cases}$$

The normalized worst-case error is then defined as $J(\underline{a}, \overline{a}, a) \triangleq \frac{\overline{a} - \underline{a}}{D_{min}(\underline{a}, \overline{a}, a)}$, and one has

$$\sup_{a \in F} J(\underline{a}, \overline{a}, a) = \frac{1-\beta}{2\beta}\left(\frac{\overline{a}}{\underline{a}} - 1\right) \triangleq J_{max}(\underline{a}, \overline{a}). \qquad (31)$$

Since by hypothesis $\frac{\overline{a}}{\underline{a}} > \frac{1+\beta}{1-\beta}$, one has $J_{max}(\underline{a}, \overline{a}) > 1$, as expected. Moreover, by Proposition 1, there exists an infinite input sequence such that $\lim_{t\to\infty} \frac{\overline{a}_t}{\underline{a}_t} = \frac{1+\beta}{1-\beta}$, and hence $\lim_{t\to\infty} J_{max}(\underline{a}_t, \overline{a}_t) = 1$.
Remark 4: To clarify the meaning of $J_{max}(\underline{a}, \overline{a})$, notice that it allows one to quantify the worst-case relative reduction of the feasible set with respect to the minimum achievable uncertainty. For example, $J_{max} = 1.2$ means that the size of the feasible set is 20% greater than that of the minimum feasible set achievable by any input sequence of infinite length, in a worst-case sense.
Hence, $J_{max}$ can be used to devise a rationale for stopping the identification experiment. For instance, one can decide to stop the experiment whenever $J_{max} < 1.01$, because it will not be possible to further reduce the feasible set by more than 1%. In the following, the optimal input sequence minimizing $J_{max}$ is derived. In particular, the aim is to solve the input design problem

$$u^* = \arg \inf_{\substack{u:\ \|u\|_\infty \le U \\ u(t) = \eta(F_{t-1};\, t)}} J_{max}(\underline{a}_N, \overline{a}_N). \qquad (32)$$
Theorem 4: Let $N$ be the length of the input signal to be applied. Then, the optimal input sequence solving problem (32) is given by

$$u^*(t) = \frac{C}{\tilde{a}_t}, \qquad t = 1, 2, \ldots, N \qquad (33)$$

where

$$\tilde{a}_t = \sqrt{\frac{\underline{a}_{t-1}\,\overline{a}_{t-1}}{1 - \beta^2}} \qquad (34)$$

with estimate updating laws:

$$\underline{a}_t = \underline{a}_{t-1},\quad \overline{a}_t = \tilde{a}_t(1+\beta) = \sqrt{\underline{a}_{t-1}\overline{a}_{t-1}\,\frac{1+\beta}{1-\beta}} \qquad \text{if } s(t) = 1$$
$$\underline{a}_t = \tilde{a}_t(1-\beta) = \sqrt{\underline{a}_{t-1}\overline{a}_{t-1}\,\frac{1-\beta}{1+\beta}},\quad \overline{a}_t = \overline{a}_{t-1} \qquad \text{if } s(t) = 0. \qquad (35)$$

Moreover, the value of $J_{max}$ after $N$ input samples is

$$J_{max|N} = \frac{1-\beta}{2\beta}\left[\frac{1+\beta}{1-\beta}\sqrt[2^N]{\frac{\overline{a}_0}{\underline{a}_0}\,\frac{1-\beta}{1+\beta}} - 1\right]. \qquad (36)$$
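One can check from (35) that the uncertainty ratio $\overline{a}_t/\underline{a}_t$ evolves in the same way regardless of the observed $s(t)$, which is consistent with the closed form (36) being independent of the observation sequence. The following sketch simulates the recursion for one admissible realisation ($d(t) = 0$) and compares the resulting $J_{max}$ with (36); all numerical values are illustrative assumptions.

```python
from math import sqrt

C, beta = 10.0, 0.1
lo, hi = 1.0, 100.0          # F0 (illustrative)
a_true = 40.0                # illustrative true gain
rho0 = hi / lo
N = 8

for t in range(N):
    at = sqrt(lo * hi / (1 - beta ** 2))      # optimal input point (34)
    s = 1 if a_true * (C / at) <= C else 0    # measurement (2) with d = 0
    if s == 1:
        hi = at * (1 + beta)                  # updating law (35), s(t) = 1
    else:
        lo = at * (1 - beta)                  # updating law (35), s(t) = 0

r = (1 + beta) / (1 - beta)
Jmax_actual = (1 - beta) / (2 * beta) * (hi / lo - 1)          # (31)
Jmax_pred = (1 - beta) / (2 * beta) * (r * (rho0 / r) ** 2.0 ** -N - 1)  # (36)
print(Jmax_actual, Jmax_pred)   # agree, whatever the observation sequence
```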
Proof: Let us prove (34) and (36) by induction, in a similar way as in the proof of Theorem 2. Let $N = 1$. By (31), one has $J_{max|0} = \frac{1-\beta}{2\beta}\left(\frac{\overline{a}_0}{\underline{a}_0} - 1\right)$. By applying $u(1) = C/\tilde{a}_{1|1}$, one obtains two possible feasible sets at time 1, namely $F_1^{(0)} = [\tilde{a}_{1|1}(1-\beta),\, \overline{a}_0]$ or $F_1^{(1)} = [\underline{a}_0,\, \tilde{a}_{1|1}(1+\beta)]$, depending on the value of $s(1)$ being respectively 0 or 1. The optimal value of $\tilde{a}_{1|1}$ is such that the value of $J_{max|1}$ for both intervals is the same, i.e., $J_{max|1}^{(0)} = J_{max|1}^{(1)}$, thus obtaining

$$\frac{1-\beta}{2\beta}\left[\frac{\overline{a}_0}{\tilde{a}_{1|1}(1-\beta)} - 1\right] = \frac{1-\beta}{2\beta}\left[\frac{\tilde{a}_{1|1}(1+\beta)}{\underline{a}_0} - 1\right],$$

which leads to $\tilde{a}_{1|1}^2 = \frac{\underline{a}_0\,\overline{a}_0}{1-\beta^2}$. Since $0 \le \underline{a}_0 \le \tilde{a}_{1|1} \le \overline{a}_0$ and $\beta < 1$, only the positive solution is feasible, and so $\tilde{a}_{1|1} = \sqrt{\frac{\underline{a}_0\,\overline{a}_0}{1-\beta^2}}$. By substitution, one obtains $J_{max|1}$ as in (36).
Now, let us assume that (34) and (36) hold for a given $N$. We want to show that they hold also for $N+1$. By following the same reasoning as in the proof of Theorem 2, we can state that

$$J_{max|N+1} = \frac{1-\beta}{2\beta}\left[\frac{1+\beta}{1-\beta}\sqrt[2^N]{\frac{\overline{a}_1}{\underline{a}_1}\,\frac{1-\beta}{1+\beta}} - 1\right]. \qquad (37)$$

We have now to select the first input sample $u^*(1) = C/\tilde{a}_{1|N+1}$. The choice of $\tilde{a}_{1|N+1}$ leads to two possible feasible sets at time 1: $F_1^{(0)} = [\tilde{a}_{1|N+1}(1-\beta),\, \overline{a}_0]$ if $s(1) = 0$, or $F_1^{(1)} = [\underline{a}_0,\, \tilde{a}_{1|N+1}(1+\beta)]$ if $s(1) = 1$. From (37), the two corresponding final values of $J_{max}$ turn out to be

$$J_{max|N+1}^{(0)} = \frac{1-\beta}{2\beta}\left[\frac{1+\beta}{1-\beta}\sqrt[2^N]{\frac{\overline{a}_0}{\tilde{a}_{1|N+1}}\,\frac{1}{1+\beta}} - 1\right]$$

and

$$J_{max|N+1}^{(1)} = \frac{1-\beta}{2\beta}\left[\frac{1+\beta}{1-\beta}\sqrt[2^N]{\frac{\tilde{a}_{1|N+1}(1-\beta)}{\underline{a}_0}} - 1\right].$$

The optimal choice of $\tilde{a}_{1|N+1}$ is such that $J_{max|N+1}^{(0)} = J_{max|N+1}^{(1)}$, i.e.,

$$\frac{\overline{a}_0}{\tilde{a}_{1|N+1}}\,\frac{1}{1+\beta} = \frac{\tilde{a}_{1|N+1}(1-\beta)}{\underline{a}_0},$$

which leads to

$$\tilde{a}_{1|N+1} = \sqrt{\frac{\underline{a}_0\,\overline{a}_0}{1-\beta^2}}.$$

Since $\tilde{a}_{1|i} = \tilde{a}_{1|j}$ for all $i, j = 1, \ldots, N+1$, one can define $\tilde{a}_1 = \tilde{a}_{1|i}$. Finally, to compute $J_{max|N+1}$ one has to substitute $\tilde{a}_1$ in $J_{max|N+1}^{(0)}$ (or in $J_{max|N+1}^{(1)}$), thus obtaining, after some algebra,

$$J_{max|N+1} = \frac{1-\beta}{2\beta}\left[\frac{1+\beta}{1-\beta}\sqrt[2^{N+1}]{\frac{\overline{a}_0}{\underline{a}_0}\,\frac{1-\beta}{1+\beta}} - 1\right]$$
which concludes the proof.
Remark 5: A key feature of Theorem 4 is that the optimal input sequence solving problem (32) does not depend on the length N of the identification experiment. This means that the length of the experiment can be adapted on-line, without losing optimality with respect to the relative uncertainty reduction Jmax.
The next theorem states the time complexity for problem (32).
Theorem 5: Let F_0 = [\underline{a}_0, \overline{a}_0], 1 < \sigma < J_{max}(\underline{a}_0, \overline{a}_0) and \frac{\overline{a}_0}{\underline{a}_0} > \frac{1+\beta}{1-\beta}. Then, the time complexity for obtaining J_{max} \le \sigma is

N(\sigma) = \left\lceil \log_2 \frac{\log\left(\frac{\overline{a}_0}{\underline{a}_0}\,\frac{1-\beta}{1+\beta}\right)}{\log\frac{2\beta\sigma+1-\beta}{1+\beta}} \right\rceil.    (38)
Proof: Let us assume

J_{max|N} = \frac{1-\beta}{2\beta}\left[\frac{1+\beta}{1-\beta}\sqrt[2^N]{\frac{\overline{a}_0}{\underline{a}_0}\,\frac{1-\beta}{1+\beta}} - 1\right] \le \sigma.    (39)

Then, the time complexity is given by the minimum N satisfying

\sqrt[2^N]{\frac{\overline{a}_0}{\underline{a}_0}\,\frac{1-\beta}{1+\beta}} \le \frac{2\beta\sigma+1-\beta}{1+\beta}

and hence (38) holds.
In the following, we will denote by "Optimal Error Procedure" (OEP) and "Optimal Relative Error Procedure" (OREP) the input design procedures described in Theorems 2 and 4, respectively. With the aim of providing a quantitative comparison of the two approaches, the performance of the OREP will be evaluated in terms of the radius of the resulting feasible set. Let us now state a lemma which is instrumental to this purpose.
Lemma 3: Let N be the length of the input signal. Then, the maximum achievable value of \underline{a}_N when applying the OREP is

\check{a}_N \triangleq \sup \underline{a}_N = \overline{a}_0\,\frac{1-\beta}{1+\beta}\,\sqrt[2^N]{\frac{\underline{a}_0}{\overline{a}_0}\,\frac{1+\beta}{1-\beta}}.    (40)
Proof: Since by Proposition 1 one has \frac{\overline{a}_t}{\underline{a}_t} > \frac{1+\beta}{1-\beta} \;\forall t, by (35) it follows that the maximum value for \underline{a}_t is always obtained when s(t) = 0, being \sqrt{\underline{a}_{t-1}\,\overline{a}_{t-1}\,\frac{1-\beta}{1+\beta}} > \underline{a}_{t-1}. This condition holds, for instance, when a = \overline{a}_0. By defining \mu \triangleq \overline{a}_0\,\frac{1-\beta}{1+\beta}, and observing that if s(t) = 0 \;\forall t, then \overline{a}_t = \overline{a}_0 \;\forall t, one has

\check{a}_1 = \sqrt{\underline{a}_0\,\mu} = \mu\,\sqrt{\underline{a}_0/\mu}
\check{a}_2 = \sqrt{\check{a}_1\,\mu} = \mu\,\sqrt[4]{\underline{a}_0/\mu}
\vdots
\check{a}_N = \sqrt{\check{a}_{N-1}\,\mu} = \mu\,\sqrt[2^N]{\underline{a}_0/\mu}.
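Lemma 3 can be sanity-checked numerically: forcing s(t) = 0 at every step of the updating laws (35) must reproduce the closed form (40). A small sketch, with illustrative names:

```python
import math

def sup_lower_bound(a_lo, a_hi, beta, N):
    """Largest achievable lower bound after N steps (all s(t) = 0)."""
    for _ in range(N):
        # s(t) = 0 keeps the upper bound and raises the lower bound, eq. (35)
        a_lo = math.sqrt(a_lo * a_hi * (1.0 - beta) / (1.0 + beta))
    return a_lo

def a_check(a_lo0, a_hi0, beta, N):
    """Closed form (40) for the same quantity."""
    mu = a_hi0 * (1.0 - beta) / (1.0 + beta)
    return mu * (a_lo0 / mu) ** (2.0 ** -N)
```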
Theorem 6: Let N be the length of the OREP input signal. Then, the worst-case radius of the feasible parameter set F_N is

\epsilon_N = \frac{\overline{a}_0}{2}\left[1 - \frac{1-\beta}{1+\beta}\,\sqrt[2^N]{\frac{\underline{a}_0}{\overline{a}_0}\,\frac{1+\beta}{1-\beta}}\right].    (41)
Proof: By (31), one has

J_{max|N} = \frac{\overline{a}_N - \underline{a}_N}{\frac{2\beta}{1-\beta}\,\underline{a}_N}

which can be rewritten as

\frac{\overline{a}_N - \underline{a}_N}{2} = J_{max|N} \cdot \frac{\beta}{1-\beta}\,\underline{a}_N.    (42)

Since the left-hand side of (42) denotes the radius of F_N, and J_{max|N} is given in (36), by Lemma 3 one has

\epsilon_N = \sup J_{max|N} \cdot \frac{\beta}{1-\beta}\,\underline{a}_N = J_{max|N} \cdot \frac{\beta}{1-\beta}\,\check{a}_N.

The result follows by substitution of J_{max|N} and \check{a}_N from (36) and (40), respectively.
Since εN reported in (16) denotes the radius of F_N when the OEP is applied, by construction one has εN ≤ ǫN. Figure 4 shows the comparison between the two radii εN and ǫN, for the setting β = 0.1, \underline{a}_0 = 1 and \overline{a}_0 = 100. Note that for N ≥ 12 the two radii are very close. In fact, from (36) one has J_{max|12} = 1.0059.
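The figures quoted above can be recovered directly from (36) and (38). The following snippet (a sketch; the helper names are mine) reproduces J_{max|12} ≈ 1.0059 and the experiment length needed to reach the σ = 1.01 stopping threshold:

```python
import math

def jmax_closed(a_lo0, a_hi0, beta, N):
    # eq. (36)
    x = (a_hi0 / a_lo0) * (1.0 - beta) / (1.0 + beta)
    return (1.0 - beta) / (2.0 * beta) * (
        (1.0 + beta) / (1.0 - beta) * x ** (2.0 ** -N) - 1.0)

def time_complexity(a_lo0, a_hi0, beta, sigma):
    # eq. (38): minimum N such that Jmax|N <= sigma
    x = (a_hi0 / a_lo0) * (1.0 - beta) / (1.0 + beta)
    return math.ceil(math.log2(
        math.log(x) / math.log((2.0 * beta * sigma + 1.0 - beta) / (1.0 + beta))))
```

With β = 0.1, \underline{a}_0 = 1 and \overline{a}_0 = 100, jmax_closed gives 1.0059 at N = 12 and time_complexity returns 12 for σ = 1.01, consistent with the stopping rule discussed earlier.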
Fig. 4. Plot of εN (solid) and ǫN (dashed) for N = 1, . . . , 15.
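As a cross-check of Theorem 6, running the updating laws (35) with s(t) = 0 at every step (the worst case identified in Lemma 3) and halving the final interval width must reproduce ǫN of (41). A sketch, with illustrative names:

```python
import math

def orep_worst_radius(a_lo, a_hi, beta, N):
    """Radius of F_N when s(t) = 0 for all t (worst case of Lemma 3)."""
    for _ in range(N):
        a_lo = math.sqrt(a_lo * a_hi * (1.0 - beta) / (1.0 + beta))  # eq. (35)
    return (a_hi - a_lo) / 2.0

def eps_closed(a_lo0, a_hi0, beta, N):
    # eq. (41)
    y = (a_lo0 / a_hi0) * (1.0 + beta) / (1.0 - beta)
    return a_hi0 / 2.0 * (1.0 - (1.0 - beta) / (1.0 + beta) * y ** (2.0 ** -N))
```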
The comparison shows that, in a worst-case setting, if N is small, i.e., Jmax|N is much greater than 1, it is convenient to use the OEP, while if Jmax|N ≃ 1, the two procedures are almost equivalent in terms of size of the final feasible set. A different scenario arises when the true location of the parameter a is not the worst-case one. In this case, the OREP may lead to a much better performance than the OEP, in terms of
the size of the actual resulting feasible set. The worst-case identification errors (with respect to the disturbance d(t)), for different values of a, are reported in Figures 5 and 6, in the cases N = 10 and N = 4, respectively. In both figures, β = 0.1 and a ∈ [1, 100]. It can be noticed that the error returned by the OEP does not depend on the actual value of a. Conversely, the radius of the feasible set provided by the OREP is a nondecreasing function of the true value of a. Clearly, if the maximum error also with respect to a is considered (as is in the nature of the worst-case approach), the value returned by the OEP is smaller, by construction. However, Fig. 5 shows that for almost all values of a, the OREP gives a smaller error than the OEP. This is not the case when a shorter identification experiment is performed, as in Fig. 6 (N = 4). The plots of the errors with respect to the experiment length N are reported in Fig. 7. The solid plot refers to the OEP error, which does not depend on a. The OREP error is shown for three different values of a (namely \underline{a}_0, \overline{a}_0 and the middle value (\underline{a}_0 + \overline{a}_0)/2). It can be noticed that, while the error for a = \overline{a}_0 is always larger than the OEP error, the error for the average case, a = (\underline{a}_0 + \overline{a}_0)/2, becomes smaller for N > 5, and for a = \underline{a}_0 it is always significantly smaller than the OEP error.
Fig. 5. Worst-case identification error sup_{|d|≤δ} (\overline{a}_{10} − \underline{a}_{10})/2 for OREP (solid) and OEP (dashed), for N = 10, a ∈ [1, 100] and β = 0.1.
Remark 6: From a practical point of view, the discussion above suggests that, if N is small (and then Jmax|N is significantly larger than 1) it is convenient to choose the OEP, which gives a guaranteed value of the worst-case radius, while if N is large (i.e., Jmax|N ≃ 1), it is convenient
Fig. 6. Worst-case identification error sup_{|d|≤δ} (\overline{a}_4 − \underline{a}_4)/2 for OREP (solid) and OEP (dashed), for N = 4, a ∈ [1, 100] and β = 0.1.
Fig. 7. Plot of εN (solid) and ǫN (dashed) for N = 1, . . . , 15, for three values of a.
to use the OREP, which has almost the same performance as the OEP for the worst values of the parameter a, but for other values of a may lead to a much smaller radius. If N is not given a priori, it is convenient to use the OREP and choose N such that Jmax|N satisfies the desired tolerance on the final radius reduction. In this case, N can be computed a priori by using (38) in Theorem 5.
Remark 7: Notice that the OEP is highly sensitive to the a priori information, and in particular to \overline{a}_0. In fact, by (21) it follows that \lim_{N\to\infty} εN = ε∞ = \frac{\beta}{1+\beta}\,\overline{a}_0, independently of the value of the true parameter a. So, if the a priori bounds are chosen in a conservative way, such conservativeness will affect the final value of the radius, even if N is large. This is not the case if one uses the OREP strategy. In fact, D_{min}(\underline{a}_0, \overline{a}_0, a) is highly sensitive to the true parameter value and less sensitive to the initial feasible set. In Fig. 8, values of D_{min}(\underline{a}_0, \overline{a}_0, a)/2 are reported as a function of \overline{a}_0, for different values of the true parameter a and β = 0.1. The true parameter is chosen as a = (1−α)\underline{a}_0 + α\overline{a}_0, with α ∈ [0, 1]. Overestimation of \overline{a}_0 corresponds to small values of α and consequently to values of D_{min}(\underline{a}_0, \overline{a}_0, a)/2 much smaller than ε∞.
Fig. 8. Plot of D_{min}(\underline{a}_0, \overline{a}_0, a)/2 for different feasible sets F = [1, \overline{a}_0]. The true parameter a is chosen as a = (1−α)\underline{a}_0 + α\overline{a}_0, for α = 0, 0.25, 0.5, 0.75.
VI. NUMERICAL EXAMPLE
Let us consider a system whose transfer function is

G(z) = \frac{38.4\,z^3 + 28.8\,z^2 + 35.2\,z - 20}{3.15\,z^3 - 0.08\,z^2 + 1.84\,z - 1}.

By observing that the impulse response of G(z) can be considered negligible after n = 30 samples, the aim is to identify the first n samples of the impulse response.
Due to the prior knowledge on the system, let us assume that the initial bounds on the impulse response coefficients are −Mρ^i ≤ θ_i ≤ Mρ^i, with M = 20, ρ = 0.9, i = 1, . . . , n. Moreover, let us assume −500 ≤ u(t) ≤ 500, δ = 3 and C = 20, i.e., β = 0.15. Since the signs of the FIR parameters are not known, it is necessary to apply a preliminary input sequence to detect them. In the following, it is shown how 2n input samples are sufficient to determine such signs.

A. Sign detection

Let us first consider the case of a scalar gain with unknown sign, i.e., \underline{a}_0 < 0 < \overline{a}_0. A possible approach to find the sign of this parameter is as follows. Let us apply u(1) = U.
• If s(1) = 0, then a > (C−δ)/U. Hence, \underline{a}_1 = (C−δ)/U (and \overline{a}_1 = \overline{a}_0). Since C > δ, then \underline{a}_1 > 0 and the sign has been detected, since both \underline{a}_1 and \overline{a}_1 are positive.
• If s(1) = 1, then a < (C+δ)/U. Hence, \overline{a}_1 = (C+δ)/U (and \underline{a}_1 = \underline{a}_0). Since the sign of a is not found yet, one applies u(2) = −U and:
  – If s(2) = 0, then a < (−C+δ)/U and \overline{a}_2 = (−C+δ)/U (and \underline{a}_2 = \underline{a}_1). Thus, both \underline{a}_2 and \overline{a}_2 are negative.
  – If s(2) = 1, then a ≥ −(C+δ)/U and \underline{a}_2 = −(C+δ)/U (and \overline{a}_2 = \overline{a}_1). In this case F_2 = [−(C+δ)/U, (C+δ)/U] and no further feasible set reduction can be achieved, so the identification experiment can be stopped.
The proposed method needs at most 2 input samples to find the sign of a scalar gain. If we are dealing with a generic FIR of order n > 1, whose coefficients have unknown signs, one needs at most 2n input samples to detect all the parameter signs. In fact, by applying

u(1) = U, u(2) = . . . = u(n) = 0, u(n+1) = −U, u(n+2) = . . . = u(2n) = 0

each parameter is excited individually twice, once with U and once with −U. It is easy to check that 2n is the (worst-case) minimum number of samples for detecting the parameter signs.

B. Worst-case input design

Let us excite every FIR parameter k = 4 times. Since finding the sign of each parameter requires 2n samples, by (11) the total input length turns out to be N = 2n + k\,n(n+1)/2 =
1920. Let us apply both the OEP and the OREP strategies. In Fig. 9, a comparison between the resulting radii of the two approaches is reported. Both radii are computed for the worst-case realization of the disturbance d(t) (i.e., the realization that produces the largest feasible set, for each input strategy). Since for some parameters Jmax|k is significantly larger than 1, the corresponding OREP radii may turn out considerably larger than the OEP radii (depending on the true values of the parameters).
Let us now repeat the experiment for k = 10. In this case, one needs N = 2n + k\,n(n+1)/2 = 4710 samples. In Fig. 10, a comparison between the resulting radii of the two approaches is reported, while in Fig. 11 the feasible set bounds are reported for the OREP procedure. In this case, Jmax|k is close to 1, and the OREP strategy leads to better performance, as expected.
In Table I, numerical values of the previous example are reported. Notice that when k = 10 the OEP error approaches ε∞, while the OREP error approaches Dmin/2, which can be considerably smaller depending on the true position of each parameter. Moreover, notice that although the number of parameter excitations (and of input samples) has been increased, the final radii for the OEP strategy are close to those obtained for k = 4, in accordance with the fact that, from (24), Q_4 = 2.82 · 10^{−3} is very close to 0.
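The sign-detection procedure of Section VI-A and the input-length bookkeeping above can be sketched as follows. This is a noise-free simulation (d = 0); the sensor convention s = 0 for y ≥ C matches the inequalities used in the bullets, and all names are illustrative:

```python
def sensor(y, C):
    # Binary sensor: s = 0 when the output reaches the threshold, else s = 1.
    return 0 if y >= C else 1

def detect_sign(a, U, C):
    """Sign of a scalar gain from at most 2 input samples (d = 0 simulation).

    Returns +1, -1, or 0 when |a| is too small to be classified
    (the case F2 = [-(C+delta)/U, (C+delta)/U] in the text)."""
    if sensor(a * U, C) == 0:        # s(1) = 0  =>  a > (C - delta)/U > 0
        return 1
    if sensor(a * (-U), C) == 0:     # s(2) = 0  =>  a < (-C + delta)/U < 0
        return -1
    return 0

def total_input_length(n, k):
    # 2n samples for sign detection plus k*n*(n+1)/2 for the excitation phase
    return 2 * n + k * n * (n + 1) // 2
```

With the values of the example (U = 500, C = 20, n = 30), total_input_length(30, 4) gives 1920 and total_input_length(30, 10) gives 4710, as stated above.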
Fig. 9. Comparison between final radii obtained by using OEP (dark) and OREP (light), for k = 4.
Fig. 10. Comparison between final radii obtained by using OEP (dark) and OREP (light), for k = 10.
Fig. 11. Feasible set bounds (within the prior envelopes ±Mρ^i) and true values of FIR parameters (crosses) by using OREP, for k = 10.
TABLE I
NUMERICAL VALUES OF THE EXAMPLE FOR EACH FIR COEFFICIENT

  i      θi        ε0       ε∞     Dmin/2   ε4(OEP)  ε4(OREP)  ε10(OEP)  ε10(OREP)
  1   12.1905   18.0000   2.3478   2.1513    2.3666    4.4193    2.3478    2.1995
  2    9.4525   16.2000   2.1130   1.6681    2.1299    3.9501    2.1130    1.6979
  3    4.2939   14.5800   1.9017   0.7577    1.9169    1.7187    1.9017    0.7725
  4   -7.8916   13.1220   1.7116   1.3926    1.7252    3.1550    1.7116    1.4171
  5    0.2922   11.8098   1.5404   0.0516    1.5527    0.1244    1.5404    0.0526
  6    5.9803   10.6288   1.3864   1.0553    1.3974    2.5190    1.3864    1.0731
  7   -2.5241    9.5659   1.2477   0.4454    1.2577    0.8273    1.2477    0.4535
  8   -3.4646    8.6093   1.1230   0.6114    1.1319    1.4497    1.1230    0.6234
  9    3.2849    7.7484   1.0107   0.5797    1.0187    1.3034    1.0107    0.5880
 10    1.3059    6.9736   0.9096   0.2305    0.9168    0.4571    0.9096    0.2348
 11   -2.9855    6.2762   0.8186   0.5269    0.8251    1.0533    0.8186    0.5361
 12    0.2042    5.6486   0.7368   0.0360    0.7426    0.0633    0.7368    0.0366
 13    2.1637    5.0837   0.6631   0.3818    0.6684    0.8509    0.6631    0.3870
 14   -1.0121    4.5754   0.5968   0.1786    0.6015    0.3228    0.5968    0.1813
 15   -1.2247    4.1178   0.5371   0.2161    0.5414    0.3917    0.5371    0.2192
 16    1.2470    3.7060   0.4834   0.2201    0.4872    0.4691    0.4834    0.2233
 17    0.4258    3.3354   0.4351   0.0751    0.4385    0.1454    0.4351    0.0763
 18   -1.1064    3.0019   0.3916   0.1952    0.3946    0.3835    0.3916    0.1979
 19    0.1191    2.7017   0.3524   0.0210    0.3552    0.0351    0.3524    0.0213
 20    0.7845    2.4315   0.3172   0.1384    0.3196    0.2445    0.3172    0.1405
 21   -0.4009    2.1884   0.2854   0.0707    0.2877    0.1373    0.2854    0.0717
 22   -0.4306    1.9695   0.2569   0.0760    0.2589    0.1265    0.2569    0.0768
 23    0.4722    1.7726   0.2312   0.0833    0.2330    0.1464    0.2312    0.0842
 24    0.1363    1.5953   0.2081   0.0240    0.2097    0.0443    0.2081    0.0243
 25   -0.4091    1.4358   0.1873   0.0722    0.1887    0.1227    0.1873    0.0730
 26    0.0599    1.2922   0.1685   0.0106    0.1699    0.0172    0.1685    0.0107
 27    0.2837    1.1630   0.1517   0.0501    0.1529    0.0839    0.1517    0.0506
 28   -0.1577    1.0467   0.1365   0.0278    0.1376    0.0430    0.1365    0.0281
 29   -0.1507    0.9420   0.1229   0.0266    0.1238    0.0404    0.1229    0.0268
 30    0.1783    0.8478   0.1106   0.0315    0.1114    0.0546    0.1106    0.0318
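The ε0 and ε∞ columns of Table I follow directly from the prior bounds and Remark 7: ε0 for the i-th coefficient equals Mρ^i, and ε∞ equals β/(1+β) · Mρ^i with β = 0.15. A quick consistency check against a few table entries (a sketch; the two helper names are mine):

```python
M, rho, beta = 20.0, 0.9, 0.15

def eps0(i):
    # Prior radius of the i-th impulse response coefficient: M * rho**i
    return M * rho ** i

def eps_inf(i):
    # Asymptotic OEP radius (Remark 7): beta/(1+beta) * eps0(i)
    return beta / (1.0 + beta) * eps0(i)
```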
VII. CONCLUSIONS

System identification based on binary-valued observations is an intriguing problem, due to the remarkable information reduction caused by the binary sensor. When a deterministic set-membership setting is considered, problems like optimal input design and time complexity estimation turn out to be even more challenging. The paper has provided new results for worst-case identification of systems equipped with binary sensors. A new upper bound on time complexity for FIR systems of arbitrary order and the exact solution of the optimal input design problem for identification of gains have been combined in order to devise an effective input design strategy for identification of general FIR systems. Moreover, an undesirable property of input design based on the worst-case identification error, namely the dependence of the optimal input sequence on the length of the identification experiment, has been overcome by introducing a new cost function. This led to the derivation of an optimal recursive design procedure over a time interval of arbitrary length.
Several issues remain open and deserve attention in future investigations. Optimal input design for ARX models and the related evaluation of the exact time complexity are still open problems. Another open question is whether the input selection strategy based on independent excitation of each impulse response sample is optimal in general. A related interesting research area concerns worst-case system identification with quantized measurements. While quantized sensors are generally much more powerful than binary ones, they increase the complexity of the optimal input design problem. Only preliminary results are available at present on this topic, which is the subject of ongoing work.