Fault Detection and Identification Through Variance of ...

Fault Detection and Identification Through Variance of Wavelet Transform of System Outputs*

G. D. GONZALEZ, G. CEBALLOS, R. PAUT, D. MIRANDA, P. LA ROSA
Department of Electrical Engineering
University of Chile
Tupper 2007, Casilla 412-3, Santiago
CHILE
[email protected]

Abstract: - The problem of fault detection and identification is approached without using a plant model, by means of the variance of the continuous wavelet transform (CWT) of the outputs of the plant or process. If an output may be considered to be a wide sense stationary stochastic process during a time interval, it is shown that the variance of its CWT depends only on the scale a and not on the displacement b. It is also shown that the average with respect to displacement b of the squared CWT, designated b-average (akin to time average), is an unbiased estimator of such variance. Moreover, it turns out that the standard deviation of this b-average decreases as the length of the data record increases, thus suggesting ergodicity. These properties are then used for fault identification by defining variance templates characterizing normal or fault conditions of a process. The problem of fault identification is solved using the ergodic property by finding the distances from the b-average of a single sample (realization, measurement) of the output considered to the normal and fault condition variance templates. The Fisher linear discriminant method is used to optimize the discrimination between fault and normal conditions for single outputs and for combined outputs. For fault detection a window of relatively small length ending at the present time t is used. The b-averages in this window are found for each t. The change from one condition to another is detected by considering when the difference of the distances of this b-average - as a function of t - to each of the predetermined fault and normal templates changes sign, after applying the Fisher transformation and filtering. The method is tested using a spring fault in a two mass-spring-damper system.

Keywords: - Fault detection, Fault identification, Wavelets, Fisher linear discriminant, Stochastic processes.

1 Introduction

A given plant may be operating under normal conditions or under various fault conditions. Fault detection and identification (FDI) methods are used for dealing with this problem. Detection consists in determining when a change of operating conditions (from normal to fault and conversely, or between faults) happens. The identification or diagnosis problem consists in determining which type of fault or normal condition is present.

There are FDI methods which use a model of the plant [7] and those which do not. When a plant model is unknown or uncertain, methods not relying on a plant model may be used. In such a case, features of plant outputs related to the faults must be determined. In [6] a wavelet based procedure is used to characterize various fault and normal conditions through wavelet based patterns (templates) in the case of variables measured in an oil refinery. For a particular variable, its corresponding pattern is compared to the normal and fault patterns using a distance. In this paper the method is further developed and supported by providing mathematical proofs concerning statistical properties of the CWT of the plant output (or actuator-plant-sensor system). Furthermore, the discrimination between normal and fault conditions is improved through the use of more than one plant output and by applying Fisher's linear discriminant method.

Another case in which no plant model is used is [9]. Again, a wavelet based method is used both to detect the appearance of a sensor failure in time and to characterize the type of sensor fault that has occurred. More examples of fault detection and identification which do not use plant models have been developed for aircraft [1], where a power signature is sought to distinguish the operating condition so that it is independent of time (displacement) using singular value decomposition. In this paper displacement independent templates are sought through determination of statistical properties of the CWT (mean and variance) of the outputs considered. The developments in this paper rely on the treatment of the CWT of stochastic processes as in [5].

* In Recent Advances in Intelligent Systems and Signal Processing. Athens, WSEAS Press, 2003. pp. 47-53.

2 Fault Detection and Identification

The problem at hand is to design a method for fault detection and identification (FDI) in plants employing some useful properties of the wavelet transform of measured plant outputs, to be derived in Section 3. In particular, the problem is to discriminate between normal and fault conditions occurring in the plant by characterizing appropriate features of such outputs.

Although the method is general, it will be exemplified here using the mechanical system shown in Fig. 1, consisting of masses M1 and M2, springs and dampers. The input is force F applied to mass M2 and the measured outputs are the positions y1 and y2 of the masses (Fig. 1). The fault consists of a change of the elastic spring constant k12. Such would, e.g., be the case if k12 represents the constant of parallel springs, one of which becomes broken. For fault detection and identification outputs y1(t) and y2(t) will be used, either separately or combined. The applied force is a stochastic process $\tilde{F}(t)$ generated by filtering zero mean white noise using a first order filter.

Basically, a real stochastic process x(t, ξ) is a function of two variables: a real variable (e.g., time t) and an element ξ ∈ Ω, where Ω is a set of outcomes in a probability space [8]. For a given outcome ξ = ξ′, x(t, ξ′) is a time function called a sample (or realization) of the stochastic process, e.g. a measurement performed. Furthermore, for a given time t = t′, x(t′, ξ) is a random variable [8]. In what follows a simplified notation will be used: the stochastic process will be denoted using a tilde, $\tilde{x}(t) = x(t, \xi)$, and a sample by $x(t) = x(t, \xi')$.

The continuous wavelet transform (CWT) of a time function y(t), e.g. y1(t) in Fig. 1, is given by

$$ C_1(a,b) = \frac{1}{\sqrt{a}} \int_{-\infty}^{\infty} y_1(t)\, \psi\!\left(\frac{t-b}{a}\right) dt \tag{1} $$

where ψ(t) is a wavelet, b is a displacement and a is a scale factor. The CWT of a variable y(t) corresponds to an expansion similar to a Fourier transform, but wavelet functions are used instead of sinusoidal functions [2],[4]. For example, the basic Haar wavelet (mother wavelet) is a double pulse, having the value 1 in the interval [0, ½) and −1 in [½, 1). Dilated (widened or shrunken) versions through a and time shifts b of these pulses constitute a basis for the wavelet expansion of y(t). The widths of these functions, which are determined by scale a (akin to frequency in the Fourier case), establish the resolution.
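As an illustration of the CWT with the Haar wavelet described above, the following is a minimal sketch rather than the implementation used in the paper. It assumes a uniformly sampled record, a scale a expressed as an even number of samples, a unit sampling period, and the 1/√a normalization. The function name `haar_cwt` is hypothetical.

```python
from itertools import accumulate
from math import sqrt


def haar_cwt(y, a):
    """Discrete approximation of C(a, b) for the Haar wavelet.

    y : sequence of samples (unit sampling period assumed)
    a : scale in samples (assumed even, >= 2)
    Returns one coefficient per displacement b for which the dilated
    wavelet fits entirely inside the record.
    """
    if a < 2 or a % 2:
        raise ValueError("scale a must be an even number of samples >= 2")
    half = a // 2
    # prefix[k] = y[0] + ... + y[k-1], so each window sum costs O(1)
    prefix = [0.0] + list(accumulate(y))
    coeffs = []
    for b in range(len(y) - a + 1):
        plus = prefix[b + half] - prefix[b]        # psi = +1 on [b, b + a/2)
        minus = prefix[b + a] - prefix[b + half]   # psi = -1 on [b + a/2, b + a)
        coeffs.append((plus - minus) / sqrt(a))
    return coeffs
```

A constant record gives zero coefficients at every displacement, whereas a step in the record produces nonzero coefficients only at displacements whose wavelet support straddles the step, which is the localisation property exploited for detection.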

[Figure 1: masses M1 and M2 with positions y1 and y2, springs k1, k12 and k2, dampers η1 and η2, and applied force F.]

Fig. 1. Mechanical system used to test the FDI method by simulation.

Wavelets having a limited support in time allow a suitable representation of transient aspects of y(t), including localisation of the times at which these transients occur. This feature is useful for determining changes, and therefore appropriate for fault detection.

3 Variance Template Based Fault Identification

3.1 Variance of the wavelet transform of outputs

Let output $\tilde{y}(t) = y(t, \xi)$ be a stochastic process; then its CWT is also a stochastic process, but now of two real variables a and b:

$$ C(a,b,\xi) = \frac{1}{\sqrt{a}} \int_{-\infty}^{\infty} y(t,\xi)\, \psi\!\left(\frac{t-b}{a}\right) dt \tag{2} $$

Using the simplified notation,

$$ \tilde{C}(a,b) = \frac{1}{\sqrt{a}} \int_{-\infty}^{\infty} \tilde{y}(t)\, \psi\!\left(\frac{t-b}{a}\right) dt \tag{3} $$

In the Appendix it is shown that the expected value of $\tilde{C}(a,b)$ is given by

$$ E\{\tilde{C}(a,b)\} = \overline{\tilde{C}(a,b)} = 0, \tag{4} $$

where the bar indicates expected value, and that its variance is

$$ E\left\{\left[\tilde{C}(a,b) - \overline{\tilde{C}(a,b)}\right]^2\right\} = \overline{\tilde{C}^2(a,b)} = \overline{\tilde{C}^2(a)} = V(a), \tag{5} $$

so that the variance V(a) of $\tilde{C}(a,b)$ is not a function of displacement b (Fig. 2). This result is important because, as seen below, it allows the construction of variance based templates to characterize normal and fault conditions of the plant.

Fig. 2. Checks that the variance V(a) of $\tilde{C}(a,b)$ is independent of displacement b (within statistical variations) for scales 16, 24, 48 and 64, from bottom to top.

3.2 Time averages of the wavelet transform of an output measurement

The average with respect to b from −T to T of $\tilde{C}^2(a,b)$, where $\tilde{C}(a,b)$ is given by (3), is

$$ \tilde{V}_b(a) = \frac{1}{2T} \int_{-T}^{T} \tilde{C}^2(a,b)\, db \tag{6} $$

and it will be designated b-average. For each a, $\tilde{V}_b(a)$ is a random variable which is an unbiased estimator of the variance V(a) of the CWT of the stochastic process $\tilde{y}(t)$ (see Appendix), i.e.,

$$ E\{\tilde{V}_b(a)\} = \overline{\tilde{C}^2(a,b)} = V(a) \tag{7} $$

Figure 3 shows that as the length 2T of the output sample y(t) increases, the variance of the b-average $\tilde{V}_b(a)$ decreases, suggesting ergodicity. This is an important result, for it enables practical on-line application of the method, since variance V(a) may be estimated from a b-average Vb(a) of a single sample (measurement) of length 2T, in accordance with ergodicity.

Fig. 3. Checking ergodicity of b-average $\tilde{V}_b(a)$ with respect to displacement b. Thick solid line is V(a), which is equal to $E\{\tilde{V}_b(a)\}$. The finer lines are V(a) ± the standard deviation of $\tilde{V}_b(a)$: dotted for an output sample length of 5000 sec., solid for a sample of 500 sec. and dashed for one of 100 sec.
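The b-average can be illustrated numerically with a minimal sketch, given here under stated assumptions and not as the simulation used in the paper. It assumes a Haar wavelet, scales in samples, and a stand-in output made of first order filtered Gaussian white noise; the pole value 0.9 and the burn-in length are arbitrary choices. The names `haar_cwt`, `b_average`, `filtered_noise` and `spread_vs_length` are hypothetical.

```python
import random
from itertools import accumulate
from math import sqrt


def haar_cwt(y, a):
    """Haar CWT coefficients at scale a (even number of samples)."""
    half = a // 2
    prefix = [0.0] + list(accumulate(y))
    return [
        (2.0 * prefix[b + half] - prefix[b] - prefix[b + a]) / sqrt(a)
        for b in range(len(y) - a + 1)
    ]


def b_average(y, a):
    """Average over displacement b of the squared CWT of one sample y."""
    c = haar_cwt(y, a)
    if not c:
        raise ValueError("record shorter than the wavelet support")
    return sum(v * v for v in c) / len(c)


def filtered_noise(n, pole=0.9, seed=0, burn_in=200):
    """First order filtered white noise; burn-in discards the start-up transient."""
    rng = random.Random(seed)
    x, out = 0.0, []
    for k in range(n + burn_in):
        x = pole * x + rng.gauss(0.0, 1.0)
        if k >= burn_in:
            out.append(x)
    return out


def spread_vs_length(lengths, a=16, trials=20):
    """Mean and standard deviation of the b-average over independent samples."""
    rows = []
    for n in lengths:
        est = [b_average(filtered_noise(n, seed=s), a) for s in range(trials)]
        mean = sum(est) / trials
        std = sqrt(sum((e - mean) ** 2 for e in est) / trials)
        rows.append((n, mean, std))
    return rows
```

Calling `spread_vs_length([200, 2000, 20000])` returns, for each record length, the mean and spread of the b-average over independent samples, which can be compared with the trend reported for Fig. 3.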

4 Fault Detection and Identification in the Mechanical System

The fault condition in the mechanical system of Fig. 1 was the change in the elastic constant of the spring joining the two masses, from k12 = 0.1 (normal) to k12 = 0.07 (fault). Variance templates V(a) were built for normal and fault conditions by approximating V(a) by the corresponding b-average (6) for a long sample (measurement) of 5000 seconds. Figure 4 shows such normal (*) and fault (o) variance templates. Once these variance templates have been determined, the problem is to find whether fault or normal conditions are present. For practical reasons sample measurements yi(t) of smaller length must be used, in this case of length 40 sec., 60 sec. and 100 sec. For example, the solid and segmented lines in Fig. 4 are b-averages Vb(a) of C²(a,b) obtained from sample measurements of y1(t), respectively for normal and fault conditions. Larger dispersions around the variance templates would be observed for shorter time windows.

Fig. 4. Templates of V(a) for normal (*) and fault (o) conditions. For samples of length 100 sec., b-averages Vb(a) for different samples y1(t) are shown for normal (solid lines) and fault (dotted lines) conditions.

4.1 Using Only One Output

If only one output is used - y1(t) or y2(t) - the classification of normal or fault conditions may be done by finding the (L2) distance

$$ d = \sqrt{\int_{0}^{64} \left[ V(a) - V_b(a) \right]^2 da} \tag{8} $$

between template V(a) and the b-average of C²(a,b) corresponding to a sample yi(t). The case is classified as normal if distance dN to the normal template is smaller than distance dF to the fault template, and conversely.

Better results are obtained, though, if the Fisher linear discriminant method is used [2]. For doing this, both in variances V(a) and in the b-averages Vb(a), the scale a is discretized into M discrete values ai, i = 1, ..., M (in this case, 64), so that these functions of a are turned into vectors in ℜM, and distance (8) turns into the Euclidean distance

$$ d = \sqrt{\sum_{i=1}^{M} \left[ V(a_i) - V_b(a_i) \right]^2}. \tag{9} $$

Figure 5 depicts the variance templates VN(a) and VF(a) for normal and fault conditions, i.e., corresponding to the case of Fig. 4. Also shown in Fig. 5 is the b-average Vb(a) for a given output sample of relatively short length.

Fig. 5. Representation of templates VN(a) for normal and VF(a) for fault conditions, as well as the b-average Vb(a) for a sample, projected onto a subspace prior to the determination of Fisher's subspace (a) and onto the Fisher subspace (b). Projections are denoted V'N, V'F and V'b(a).

In the Fisher linear discriminant method a reduced dimension subspace is sought where the projections V'N(a) and V'F(a) have the largest separation, while taking into account the dispersion around these means [2]. Dispersion of the b-averages Vb(a) around the means is depicted by ellipses. It may be seen in Fig. 5b that in the projections on the Fisher subspace the sample would be correctly classified as fault condition, since its distance to V'F(a) is less than its distance to V'N(a). This is not so in the case of Fig. 5a, where a wrong classification would be made.
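The classification step can be sketched for the two-class case as follows. This is a minimal illustration under stated assumptions, not the implementation used in the paper. It assumes the standard two-class Fisher direction (within-class scatter inverse applied to the difference of class means), for which the Fisher subspace is one dimensional, and it adds a small ridge term because the scatter matrix is singular when fewer samples than scales are available. The names `fisher_direction`, `classify` and the `ridge` parameter are hypothetical.

```python
import numpy as np


def fisher_direction(normal, fault, ridge=1e-9):
    """Unit vector w maximizing class separation relative to within-class scatter.

    normal, fault : arrays of shape (n_samples, M) holding b-average vectors
    ridge         : assumed regularization, not part of the original method
    """
    normal = np.asarray(normal, dtype=float)
    fault = np.asarray(fault, dtype=float)
    m_n, m_f = normal.mean(axis=0), fault.mean(axis=0)
    # Within-class scatter: dispersion of the samples around their own class mean
    s_w = (normal - m_n).T @ (normal - m_n) + (fault - m_f).T @ (fault - m_f)
    s_w += ridge * np.eye(s_w.shape[0])
    w = np.linalg.solve(s_w, m_f - m_n)
    return w / np.linalg.norm(w)


def classify(vb, w, template_n, template_f):
    """Compare distances to the projected templates along direction w."""
    p = float(np.dot(vb, w))
    d_n = abs(p - float(np.dot(template_n, w)))
    d_f = abs(p - float(np.dot(template_f, w)))
    return "normal" if d_n < d_f else "fault"
```

In this sketch the templates are passed as vectors in the original space and projected inside `classify`, so the same function can be used with long-record templates or with class means of training samples.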

4.2 Using Two Outputs Combined

For using the two outputs y1(t) and y2(t) combined, the CWT variances of each output, V1(a) and V2(a), using discrete values for a, have been combined in V12(a), defined by

$$ V_{12}(a) = \begin{cases} V_1(a) & \text{for } a = 1 \text{ to } a = 64 \\ V_2(a - 64) & \text{for } a = 65 \text{ to } a = 128 \end{cases} $$

and Vb12(a) for the b-averages Vb(a) is similarly defined. The procedure for determining normal or fault conditions is then the same as in the case of one output, except that now V12(a) and Vb12(a) are used in Euclidean space ℜ2M = ℜ128.
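The concatenation may be sketched as follows, assuming both variance vectors are stored as plain sequences over the same M scales. The names `combine` and `euclidean` are hypothetical.

```python
from math import sqrt


def combine(v1, v2):
    """Concatenate the variance vectors of two outputs into one vector of length 2M."""
    if len(v1) != len(v2):
        raise ValueError("both outputs must use the same M scales")
    return list(v1) + list(v2)


def euclidean(u, v):
    """Euclidean distance between two vectors of equal length."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(u, v)))
```

In the original space the squared distance between concatenated vectors is simply the sum of the squared distances of the individual outputs, so any additional benefit of combining outputs comes from the subsequent discriminant projection.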

5 Results of Fault Identification

Table 1 shows the results obtained. An average hit percentage is shown for each of the three cases. Hit percentage is defined as the percentage of coincidences with the actual condition - for different tests - over the total number of tests (200). The resulting hit percentages are: (i) H1, using only output y1(t); (ii) H2, using only output y2(t); and (iii) H12, using both outputs combined by concatenation to form y12(t). It may be seen that by far the best results are obtained using the combination of the two outputs together with the Fisher linear discriminant. The time intervals 2T correspond to 2τs, 3τs and 5τs, where τs = 20 sec. is the settling time of output y1(t) when F(t) is a step function.

Table 1. Percentage of hits using individual and combined signals, using distances in the original space and in Fisher's subspace.

2T (sec.)   Space                        H1   H2   H12
40          Fisher subspace              65   50   89
40          ℜ64 (H1, H2); ℜ128 (H12)     62   52   57
60          Fisher subspace              71   52   96
60          ℜ64 (H1, H2); ℜ128 (H12)     68   49   54
100         Fisher subspace              73   51   98
100         ℜ64 (H1, H2); ℜ128 (H12)     73   50   58

6 Fault Detection Using Templates

Fault detection is approached through estimating changes in the b-average Vb(a) ≈ V(a) corresponding to a moving window of length L ending at the present time t. At each present time t, distances DN(t) and DF(t) to predetermined normal and fault templates are computed and filtered using a first order filter. These distances may be determined using only one output or combined outputs, whether in the original Euclidean space or in the Fisher subspace of the linear discriminant method. Whatever the case, denoting the filtered distances by dN(t) and dF(t), the detection time for a change in operating conditions is defined as the time at which their difference changes sign, i.e., when their graphs cross each other.

As an example, let conditions be normal from t = 1 to t = 499 sec. At t = 500 sec. the operation changes to fault condition and remains there. Figure 6 shows that for a moving window of length L = 100 sec. the detection time is t = 520. The best results are obtained - as in the fault identification case - using the combined outputs and Fisher's linear discriminant method. The detection time is considerably reduced as compared with the detection times determined in all other cases.

If a shorter window is used in an attempt to reduce the detection time, the variances of the filtered distances increase, so that the filtering action must be increased in order to avoid false conclusions, which in turn increases the detection time. Because of this, decreasing L may turn out to be counterproductive.
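The sign-change rule can be sketched as follows. This is a minimal illustration under stated assumptions: the distance sequences DN(t) and DF(t) are taken as already computed for each t, and the first order filter is assumed to be a discrete exponential smoother with coefficient `alpha`. The names `first_order_filter` and `detection_times` are hypothetical.

```python
def first_order_filter(x, alpha):
    """Discrete first order low-pass: y[t] = alpha * y[t-1] + (1 - alpha) * x[t]."""
    y, out = x[0], []
    for v in x:
        y = alpha * y + (1.0 - alpha) * v
        out.append(y)
    return out


def detection_times(dist_normal, dist_fault, alpha=0.9):
    """Indices t at which the filtered difference dN(t) - dF(t) changes sign."""
    d_n = first_order_filter(dist_normal, alpha)
    d_f = first_order_filter(dist_fault, alpha)
    times, prev = [], 0
    for t, (n, f) in enumerate(zip(d_n, d_f)):
        diff = n - f
        sign = (diff > 0) - (diff < 0)
        if sign and prev and sign != prev:
            times.append(t)      # graphs of dN and dF have crossed
        if sign:
            prev = sign          # remember the last nonzero sign
    return times
```

A larger `alpha` smooths the distances more strongly and therefore delays the crossing, which is the trade-off between false conclusions and detection time discussed above.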

Fig. 6. Detection at t = 520 of the change of operating conditions from normal to fault occurring at t = 500. Distances to normal (dotted line) and to fault (solid line) are determined in the Fisher subspace for the case of combined outputs.

7 Conclusions

A wavelet variance template based fault detection and identification (FDI) method has been proposed and tested, based on the continuous wavelet transform (CWT) of the outputs of a plant. If an output, considered as a stochastic process, is wide sense stationary - at least within time intervals corresponding to fault or normal operating conditions - the expectation of its CWT is zero. Even though the CWT is a function of the scale a and the displacement b, it has been shown here that its variance is independent of b and depends only on the scale a. This variance is then used as a displacement-independent template for on-line characterization of the operating condition.

Since for practical purposes one must rely on time averages rather than on expected values, the "time" or displacement average (b-average) of a single sample - i.e. measurement - of the system outputs is analyzed. It has been determined here that the b-averages of the CWT and of its square are unbiased estimators of the mean of the CWT and of its variance, respectively. Although ergodicity with respect to b has not been proved mathematically, results show that the standard deviation of the b-average diminishes as the averaging interval increases. This feature enables the practical on-line application of the method for FDI.

If Fisher's linear discriminant method is used, results are considerably improved when the combined effect of two outputs is considered by concatenating their respective CWT variances. Figure 5 corresponds to this case. When only one output was used, though, no noticeable improvement was obtained with this method.

Acknowledgements

Funding for the research leading to this paper has been provided by Fondecyt Project No. 1020741, and by the Electrical Engineering Department of the University of Chile.

References

[1] Aravena, J.L., Detecting change using pseudo power signatures, 15th Triennial World Congress of the International Federation of Automatic Control, Barcelona, CD ROM, 2000.
[2] Bishop, C.M., Neural networks for pattern recognition, Clarendon Press, Oxford, 1996.
[3] Burrus, C.S., R.A. Gopinath, H. Guo, Introduction to wavelets and wavelet transforms. A primer, Prentice Hall, NJ, USA, 1998.
[4] Daubechies, I., Ten lectures on wavelets, Society for Industrial and Applied Mathematics, 2002.
[5] Dijkerman, R.W., R.R. Mazumdar, Wavelet representation of stochastic processes and multiresolution stochastic models, IEEE Transactions on Signal Processing, Vol. 42, No. 7, 1994, pp. 1640-1652.
[6] Daigugi, M., O. Kudo, T. Wada, Wavelet-based fault detection and identification in an oil refinery, 14th Triennial World Congress, Beijing, P.R. China, paper P-7e-08-05, 1999, pp. 205-210.
[7] Isermann, R., Process fault detection based on modeling and estimation methods - a survey, Automatica, Vol. 20, 1984, pp. 387-404.
[8] Papoulis, A., S.U. Pillai, Probability, Random Variables, and Stochastic Processes, McGraw-Hill, Fourth Edition, 2002.
[9] Zhang, J.Q., A wavelet-based approach to abrupt fault detection and diagnosis in sensors, IEEE Transactions on Instrumentation and Measurement, Vol. 50, No. 5, 1992, pp. 1389-1396.

APPENDIX

It shall be proved here that:

(i) The variance V(a) of the continuous wavelet transform (CWT) $\tilde{C}(a,b)$ of a wide sense stationary stochastic process $\tilde{y}(t)$ is independent of wavelet displacement b.

(ii) The b-average $\tilde{V}_b(a)$ is an unbiased estimator of variance V(a).

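The CWT of $\tilde{y}(t)$ is [2],[4],

$$ \tilde{C}(a,b) = \frac{1}{\sqrt{a}} \int_{-\infty}^{\infty} \tilde{y}(t)\, \psi\!\left(\frac{t-b}{a}\right) dt \tag{10} $$

Then the expectation of $\tilde{C}(a,b)$ is

$$ E\{\tilde{C}(a,b)\} = \overline{\tilde{C}(a,b)} = \frac{1}{\sqrt{a}} \int_{-\infty}^{\infty} \overline{\tilde{y}(t)}\, \psi\!\left(\frac{t-b}{a}\right) dt. \tag{11} $$

Since $\tilde{y}(t)$ is wide sense stationary, $\overline{\tilde{y}(t)} = \bar{y}$ is constant, so that from (11),

$$ \overline{\tilde{C}(a,b)} = \frac{1}{\sqrt{a}}\, \bar{y} \int_{-\infty}^{\infty} \psi\!\left(\frac{t-b}{a}\right) dt = 0 \tag{12} $$

because ψ(t) is a wavelet and therefore integrates to zero [2, 4]. Furthermore, because of (12) the variance V(a) of $\tilde{C}(a,b)$ is given by

$$ V(a) = E\left\{\left[\tilde{C}(a,b) - \overline{\tilde{C}(a,b)}\right]^2\right\} = \overline{\tilde{C}^2(a,b)}. \tag{13} $$

Therefore, from (10) and (13),

$$ V(a) = \frac{1}{a}\, E\left\{ \int_{-\infty}^{\infty} \tilde{y}(\alpha)\, \psi\!\left(\frac{\alpha-b}{a}\right) d\alpha \int_{-\infty}^{\infty} \tilde{y}(\beta)\, \psi\!\left(\frac{\beta-b}{a}\right) d\beta \right\}, $$

$$ V(a) = \frac{1}{a} \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} E\{\tilde{y}(\alpha)\tilde{y}(\beta)\}\, \psi\!\left(\frac{\alpha-b}{a}\right) \psi\!\left(\frac{\beta-b}{a}\right) d\alpha\, d\beta \tag{14} $$

Since the stochastic process $\tilde{y}(t)$ has been assumed to be wide sense stationary during each operating condition, its autocorrelation function depends only on α − β, and not separately on α and β, i.e., $E\{\tilde{y}(\alpha)\tilde{y}(\beta)\} = R_{yy}(\alpha - \beta)$. Then,

$$ V(a) = \frac{1}{a} \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} R_{yy}(\alpha - \beta)\, \psi\!\left(\frac{\alpha-b}{a}\right) \psi\!\left(\frac{\beta-b}{a}\right) d\alpha\, d\beta \tag{15} $$

With the change of variables α − b = γ and β − b = ε, (15) becomes

$$ V(a) = \frac{1}{a} \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} R_{yy}(\gamma - \varepsilon)\, \psi\!\left(\frac{\gamma}{a}\right) \psi\!\left(\frac{\varepsilon}{a}\right) d\gamma\, d\varepsilon, \tag{16} $$

and

$$ V(a) = \overline{\tilde{C}^2(a)} = E\{\tilde{C}^2(a,b)\}. $$

Hence the variance V(a) of $\tilde{C}(a,b)$, the CWT of a wide sense stationary stochastic process $\tilde{y}(t)$, is independent of the wavelet displacement b, and depends only on the wavelet scale a.

The average in the interval [−T, T] of $\tilde{C}^2(a,b)$ with respect to b will be designated as b-average and is given by

$$ \tilde{V}_b(a) = \frac{1}{2T} \int_{-T}^{T} \tilde{C}^2(a,b)\, db. $$

For each a, $\tilde{V}_b(a)$ is a random variable, and

$$ E\{\tilde{V}_b(a)\} = \overline{\tilde{V}_b(a)} = \frac{1}{2T} \int_{-T}^{T} E\{\tilde{C}^2(a,b)\}\, db. $$

From (16),

$$ \overline{\tilde{V}_b(a)} = \frac{1}{2T} \int_{-T}^{T} V(a)\, db = V(a) \tag{17} $$

so that the b-average is an unbiased estimator of the variance V(a) of the CWT of the wide sense stationary stochastic process $\tilde{y}(t)$.

Then let y(t) be a sample of $\tilde{y}(t)$. Let

$$ C(a,b) = \frac{1}{\sqrt{a}} \int_{-\infty}^{\infty} y(t)\, \psi\!\left(\frac{t-b}{a}\right) dt \tag{18} $$

be its CWT and let

$$ V_b(a) = \frac{1}{2T} \int_{-T}^{T} C^2(a,b)\, db \tag{19} $$

be the b-average corresponding to that single sample y(t). Then for different samples y(t) the Vb(a) are samples of $\tilde{V}_b(a)$ and will be distributed around its expected value $\overline{\tilde{V}_b(a)} = V(a)$.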
