A New Data Clustering Technique and Its Applications
Chiman Kwan*, Roger Xu, and Len Haynes
Intelligent Automation, Inc.
ABSTRACT
A new approach to data clustering is presented in this paper. The approach consists of three steps. First, the raw sensor data are preprocessed; Intelligent Automation, Inc. (IAI) uses the Fast Fourier Transform (FFT) in this stage to extract the significant frequency components of the sensor signals. Second, Principal Component Analysis (PCA) is used to further reduce the dimension of the preprocessing outputs. PCA is a powerful technique for extracting the features embedded in the input signals, and the dimensionality reduction shrinks the neural network classifier in the next stage, so the training and recognition times are significantly reduced. Finally, a neural network classifier using Learning Vector Quantization (LVQ) performs the data classification. The algorithm was successfully applied to two commercial systems at Boeing: the Auxiliary Power Unit and a solenoid valve system.
1. TECHNICAL APPROACH
IAI has been working on health monitoring applications for the past 10 years. Our first project was the automatic fault detection of digital circuit boards. Since then, we have expanded our expertise to many diverse areas such as helicopter gearbox failure prediction, liquid propellant engine failure classification, and rivet delamination detection. Here we summarize our approach to two of Boeing's systems: the Auxiliary Power Unit (APU) and a solenoid valve. Both systems are important in the Space Shuttle. We propose the approach shown in Figure 1.
[Figure 1 block diagram: sensor outputs x1(k), ..., xn(k) pass through an FFT stage with normalization and removal of the DC bias, then through PCA for feature extraction (weights w11, ..., wpn producing principal components V1(k), ..., Vp(k) and reconstructed inputs y1(k), ..., yn(k)), and finally into a neural net classifier (Fuzzy CMAC neural net, Learning Vector Quantization, or Competitive Learning) that reports the system status.]
Figure 1 Overall architecture of IAI's approach to data classification.
Our approach consists of three major steps: FFT, PCA, and NN classifiers, described below. It should be emphasized that the approach is modular and flexible: different methods can be used in different stages. For example, in some applications we do not need to perform the FFT. Moreover, in some applications PCA may not be the best choice in the feature extraction stage; an alternative scheme called Canonical Discriminant Analysis (CDA) may yield better results. Also, in certain systems the NN classifiers may not be necessary; a simple distance metric may suffice.
* [email protected]; phone (301) 590-3155; fax (301) 590-9414; http://www.i-a-i.com; Intelligent Automation, Inc., 7519 Standish Place, Suite 200, Rockville, MD 20855
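Before detailing each stage, the following is a minimal end-to-end sketch of the Figure 1 pipeline in Python with NumPy. The function names, shapes, and normalization choice are our illustrative assumptions, not code from the paper; each stage is detailed in the subsections that follow.

```python
import numpy as np

def preprocess(signal):
    """FFT stage: magnitude spectrum, normalized, with the DC bin removed."""
    spectrum = np.abs(np.fft.rfft(signal))[1:]   # drop the first (DC) coefficient
    return spectrum / np.linalg.norm(spectrum)   # normalize the amplitudes

def extract_features(spectrum, U):
    """PCA stage: project onto the Q principal directions stored in U."""
    return U.T @ spectrum

def classify(features, voronoi_vectors, labels):
    """Classifier stage: the nearest Voronoi vector decides the class."""
    distances = np.linalg.norm(voronoi_vectors - features, axis=1)
    return labels[np.argmin(distances)]
```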
FFT to extract the frequency components of the sensor signals
FFT stands for Fast Fourier Transform, a fast procedure for extracting the frequency components of sensor signals. In addition to extracting the frequency components, we performed two further processing steps. First, we normalized all the frequency components from the sensor data; this helps to alleviate the effects of amplitude fluctuations in some sensor outputs. Second, we removed the DC component by eliminating the first FFT coefficient; this minimizes the effect of the steady-state values of the APU outputs.
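As a sketch of these two steps (assuming one sensor record per row of a NumPy array; the layout and the choice of vector norm are our assumptions):

```python
import numpy as np

def preprocess_batch(signals):
    """Apply the two post-FFT steps described above to a batch of records."""
    spectra = np.abs(np.fft.rfft(signals, axis=1))      # magnitude spectra, one per row
    spectra = spectra[:, 1:]                            # remove the DC component (first FFT coefficient)
    norms = np.linalg.norm(spectra, axis=1, keepdims=True)
    return spectra / norms                              # normalize the frequency components
```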
PCA for extracting the principal directions within the frequency components
Principal Component Analysis (PCA) has been applied in many areas, including signal processing, image processing, data transmission and storage, and pattern recognition. Its main function is to retain the most important characteristics of its inputs. The main advantage of PCA is its self-learning capability. Figure 2 illustrates the key ideas of PCA. Step 1 is the formation of a U matrix whose columns are the eigenvectors corresponding to the largest eigenvalues of the correlation matrix. Step 2 is a projection step that extracts the features from a sample vector.
[Figure 2 flowchart. Step 1 (form the U matrix): take sample vectors from all classes; calculate the correlation matrix R = E(xx^T); determine the eigenvectors of the largest Q eigenvalues of R; form the U matrix. Step 2 (calculate the principal components): multiply a sample vector x by U^T to obtain a feature vector of smaller dimension.]
Figure 2 Basic principle of PCA.
The most important application of PCA is dimensionality reduction. We may reduce the number of features needed for effective data representation by discarding the linear combinations that have small variances and retaining only the terms that have large variances.
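The following sketch implements the two steps of Figure 2 with NumPy (the function names and the row-per-sample layout are our assumptions):

```python
import numpy as np

def pca_fit(X, Q):
    """Step 1 of Figure 2: form the U matrix from sample vectors X
    (one sample per row), using the correlation matrix R = E(x x^T)."""
    R = (X.T @ X) / X.shape[0]              # sample estimate of the correlation matrix
    eigvals, eigvecs = np.linalg.eigh(R)    # eigh: R is symmetric; eigenvalues ascending
    order = np.argsort(eigvals)[::-1]       # sort eigenvalues, largest first
    return eigvecs[:, order[:Q]]            # eigenvectors of the Q largest eigenvalues

def pca_project(x, U):
    """Step 2 of Figure 2: project a sample vector to a Q-dimensional feature vector."""
    return U.T @ x
```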
NN Classifiers
IAI has used several NN classifiers before; Fuzzy CMAC, a neural net invented by IAI, was one of them. For this application, however, we decided to use the LVQ and Competitive Learning methods, which are simpler to implement and hence more suitable for real-time applications.
1) Improved LVQ
The theory of LVQ is detailed in reference 2; the following paragraphs briefly summarize it. LVQ is a supervised learning technique that uses class information to move the Voronoi vectors (vectors that characterize a class) so as to improve the quality of the classifier decision regions. An input vector (the PCA outputs) is picked at random from the input space; here we chose 3 features to characterize the operating conditions. If the class labels of the input vector x and a Voronoi vector w agree, the Voronoi vector w is moved in the direction of the input vector x. If, on the other hand, the class labels of the input vector x and the Voronoi vector w disagree, the Voronoi vector w is moved away from the input vector x.
[Figure 3 flowchart: for a feature vector $x_i$, calculate the distance between $x_i$ and each Voronoi vector $w_c$ in every class; find the closest $w_c$ to $x_i$; if $w_c$ and $x_i$ belong to the same class, update $w_c(n+1) = w_c(n) + \alpha_n [x_i - w_c(n)]$, otherwise $w_c(n+1) = w_c(n) - \alpha_n [x_i - w_c(n)]$; the other $w_c$'s remain unchanged; repeat for a new feature vector.]
Figure 3 Flow chart of LVQ.
Let $\{w_j \mid j = 1, 2, \dots, N\}$ denote the set of Voronoi vectors and $\{x_i \mid i = 1, 2, \dots, L\}$ denote the set of input vectors (features from the PCA). Assume there are many more input vectors than Voronoi vectors. The modified LVQ algorithm (see also Fig. 3) proceeds as follows:
Step 1: Suppose that the Voronoi vector $w_c$ is the closest to the input vector $x_i$. Let $C_{w_c}$ denote the class associated with the Voronoi vector $w_c$ and $C_{x_i}$ denote the class label of the input vector $x_i$. The Voronoi vector $w_c$ is adjusted as follows. If $C_{w_c} = C_{x_i}$, then
$$w_c(n+1) = w_c(n) + \alpha_n [x_i - w_c(n)],$$
where $0 < \alpha_n < 1$. Sometimes we use more than one Voronoi vector to represent each class. For the other vectors belonging to the same class, which are not the closest to $x_i$, we also make an adjustment, to avoid the situation in which only one vector moves all the time:
$$w_c(n+1) = w_c(n) + \beta \alpha_n [x_i - w_c(n)],$$
with $\beta$ a number between 0 and 1. If, on the other hand, $C_{w_c} \neq C_{x_i}$, then
$$w_c(n+1) = w_c(n) - \alpha_n [x_i - w_c(n)].$$
Step 2: The vector adjusted in Step 1 is disabled from making any further changes until all the other vectors have made some changes.
The key to the improved LVQ is the second paragraph of Step 1: we also adjust the weight vectors that are not the closest to the input vector. This additional adjustment avoids the situation in which only one weight vector moves all the time.
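A minimal NumPy sketch of one update of this improved LVQ follows. The `frozen` bookkeeping for Step 2, and all names, are our illustrative reading of the algorithm, not the authors' code:

```python
import numpy as np

def lvq_step(x_i, class_i, W, classes, alpha, beta, frozen):
    """One update of the improved LVQ. W holds the Voronoi vectors (one per
    row), classes their labels; frozen is the set of vector indices disabled
    until all others have been adjusted (our reading of Step 2)."""
    distances = np.linalg.norm(W - x_i, axis=1)
    distances[list(frozen)] = np.inf            # disabled vectors cannot win
    c = int(np.argmin(distances))               # closest active Voronoi vector
    if classes[c] == class_i:
        W[c] += alpha * (x_i - W[c])            # move the winner toward x_i
        for j in range(len(W)):                 # beta-adjust the other vectors of the class
            if classes[j] == class_i and j != c and j not in frozen:
                W[j] += beta * alpha * (x_i - W[j])
    else:
        W[c] -= alpha * (x_i - W[c])            # move the winner away from x_i
    frozen.add(c)                               # Step 2: disable the adjusted winner
    if len(frozen) == len(W):
        frozen.clear()                          # everyone has moved; re-enable all
    return W, frozen
```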
2) Competitive Learning (CL)
Besides classification using LVQ, we also used an alternative classification method known as Competitive Learning (CL) neural networks. The basic idea of CL is similar to that of LVQ: CL also computes the distance between the input vector and the NN weights and chooses the closest weight vector as the winner. The winning neuron outputs a 1 and the other neurons output 0. Figure 4 shows the architecture of CL. The description of CL also comes from reference 3 and is summarized here. The ||ndist|| box accepts the input vector p and the input weight matrix IW and produces a vector with S1 elements, whose entries are the negative distances between the input vector and the rows of IW. The net input n of a competitive layer is computed by finding the negative distances between the input vector p and the weight vectors and adding the biases b. The competitive transfer function accepts a net input vector for a layer and returns neuron outputs of 0 for all neurons except the winner, the one associated with the most positive element of the net input n; the winner's output is 1.
[Figure 4 diagram of the competitive layer: the input p (R × 1) enters the ||ndist|| box together with the weight matrix IW1,1 (S1 × R); the bias b1 (S1 × 1) is added to form the net input n1 (S1 × 1); the competitive transfer function C produces the output a1 (S1 × 1).]
Figure 4 Competitive Learning neural network, from reference 3.
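A sketch of this forward pass in NumPy, following the reference-3 description above (the function name and argument shapes are our assumptions):

```python
import numpy as np

def competitive_layer(p, IW, b):
    """Competitive layer of Figure 4. p: input (R,); IW: weights (S1, R);
    b: biases (S1,). Returns the one-hot output vector a."""
    n = -np.linalg.norm(IW - p, axis=1) + b    # negative distances plus biases
    a = np.zeros_like(n)
    a[np.argmax(n)] = 1.0                      # winner (most positive n) outputs 1
    return a
```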
2. RESULTS
Application to APU
This section on the APU system is summarized from references 1 and 5. There are three APUs onboard the Space Shuttle. The output waveform of an APU is similar to that of a capacitor during charging and discharging; the waveform magnitude and frequency vary with time. The waveform outputs can be classified into three categories: 1) Nominal; 2) Aerosurface Gimballing; 3) Engine Gimballing. At present, Boeing feeds three vectors of 35 elements each into a single array and passes this array of 105 elements into a classifier called NeurOn-Line, which classifies the sensor data and plots the data in real time (real time here means a few seconds of delay). It is worth mentioning two characteristics of Boeing's existing real-time monitoring software. First, the classifier is a time-domain approach; that is, NeurOn-Line uses the APU signals directly from the sensor outputs. Second, no data reduction is used to simplify the data vector: there are currently 105 elements in the input array to the neural network. A reduction or feature-extraction step may significantly increase the training and classification speed. The proposed method successfully identified the various operating conditions (>97% correct recognition without any optimization effort). For proprietary reasons, we do not show the simulation results; details can be found in reference 5. Compared with Boeing's existing approach, our method is much faster because we used only 3 inputs to the neural net classifier whereas Boeing's method used 105 inputs.
Application to solenoid valve
Under the support of Boeing, the University of Florida (UF) used Honeywell-Skinner solenoid valves to simulate valves located in the Space Shuttle orbiter's main propulsion system. Hall-effect sensors were used to generate electrical signature traces, which were analyzed with neural networks (reference 4). The neural net is a radial-basis-function network, and the time-domain data were used directly as the inputs to the network. One major disadvantage of this approach is that each record contains over 1000 points, so the neural net is also very large, creating problems in both training and recognition. The other disadvantage is that the existing approach handles only up to 4 cases. The aim of this research by IAI was to improve on the work of UF, and we have eliminated both disadvantages of UF's approach. First, we can identify all 5 modes clearly, whereas UF's approach dealt with only 4 modes.
Second, our neural net requires only 3 inputs, whereas UF's method requires more than 1000 inputs. We approached the problem of failure detection and classification from a different angle, using an approach consisting of FFT, PCA, and LVQ. The advantage is that training and recognition are much faster. Moreover, our approach is a novel frequency-domain approach, which differs from that of UF. For proprietary reasons, we do not show the simulation results; details can be found in reference 5. We used 3 features from PCA to characterize the 5 cases, and all 5 cases are well separated. Our test results show a recognition rate of 100%.
3. FUTURE RESEARCH DIRECTIONS
In our tests, we had a limited number of data sets. Future work is necessary to test the tool further on various cases before it can be used in practice.
ACKNOWLEDGMENTS
This research was supported by NASA under contract NAS10-00019.
REFERENCES
1. A. Zide, "Description of APU data," February 2000.
2. S. Haykin, Neural Networks: A Comprehensive Foundation, Macmillan, 1993.
3. MATLAB User Manual, The MathWorks, 1999.
4. University of Florida, Automated Monitoring Process, final report, 1998.
5. Intelligent Automation, Inc., Phase 1 final report submitted to NASA Kennedy Space Center, October 2000.