Self Organizing Map (SOM) Approach for Classification of Power Quality Events

Emin Germen, D. Gökhan Ece, and Ömer Nezih Gerek
Anadolu University, Turkey
{egermen, dgece, ongerek}@anadolu.edu.tr
Abstract. In this work, a Self Organizing Map (SOM) is used to classify the types of defects in electrical systems known as Power Quality (PQ) events. The classification features are extracted from the real-time voltage waveform within a sliding time window, and a signature vector is formed. The signature vector consists of different types of features, such as local wavelet transform extrema at various decomposition levels, spectral harmonic ratios, and local extrema of higher order statistical parameters. Before classification, the data are clustered using SOM in order to define codebook vectors; the LVQ3 (Learning Vector Quantization) algorithm is then applied to find the exact classification borders. The k-means algorithm with the Davies-Bouldin clustering index is applied to determine the classification regions. Two major PQ event types, arcing faults and motor start-up events, are classified successfully under different load conditions.
1 Introduction
Nowadays, the growing demand for highly sensitive electronic devices in a broad range of areas, together with the considerable investments made in technological equipment, requires good power quality. The Power Quality (PQ) concept has therefore started to attract the attention of researchers and practitioners. One of the most common PQ events, the voltage sag, is a drop of the effective value of the voltage to between 10% and 90% of its nominal value. Distribution system faults or the switching of large loads have significant effects on the voltage waveform, and the resulting disturbances can be observed as voltage sags. Arcing faults or the starting of motors in the energy grid can cause similar abnormalities in the quality of power. For the maintenance of complex systems, events that introduce such imperfections have to be classified so that the related precautions can be taken. In previous work on the PQ event detection and classification problem, wavelets were used for detection and the FFT (spectral harmonic analysis) was used to discriminate between event types. Due to the lack of other discriminative parameters, the event discrimination success was arguably limited [1]. Since the term PQ broadly refers to maintaining a pure sinusoidal waveform, it brings together different research areas such as power engineering, signal processing and neural networks.

W. Duch et al. (Eds.): ICANN 2005, LNCS 3696, pp. 403-408, 2005. © Springer-Verlag Berlin Heidelberg 2005
Each discipline contributes either to delineating the principal causes of the defects in the waveform or to classifying their possible sources [1],[2].

Kohonen's Self Organizing Map (SOM) is a neural network that projects a higher dimensional input vector space onto a one- or two-dimensional array in a nonlinear fashion [3]. It is a valuable non-parametric pattern clustering and classification technique that can be adapted to a wide spectrum of fields in science and technology. The SOM network is composed of "neurons", their associated "weights" (roughly, the cluster centroids) and an algorithm for updating the weights given incoming data. The neurons are in fact the codebook vectors, connected in a planar lattice structure. Codebook vectors without any neighborhood information constitute a vector quantizer; organizing the neurons in a two-dimensional lattice with a neighborhood relation additionally provides information about how the clusters of the input data set relate to each other.

In this work a classification technique based on SOM has been used to discriminate arcing fault events from motor start-up events. Although the two PQ event types look similar because both produce a voltage sag type waveform, their causes and consequences are quite different. In order to discriminate these types of voltage waveform disturbances, data have been acquired from experiments carried out on a low voltage system by introducing arcing fault and induction motor start-up events. The key step after acquiring the voltage waveforms is to determine and extract the features that can be used in the cluster analysis. In Section 2, the proposed method for building a multi-dimensional feature vector is introduced in detail.
Here the time window, which contains not only the instantaneous data but also the pre- and post-event data, is analyzed using well-established signal processing methods, including local wavelet extrema, short-time spectral harmonics, and local extrema of higher order statistical parameters. In Section 3, the SOM is introduced and the method used to classify the events is explained. In Section 4, the results, which show excellent classification of the events, are presented and discussed.
2 PQ Events and Determination of Feature Vectors
The feature vector used in this work consists of scalar numbers obtained by three major methods: wavelets, spectrum analysis, and higher order statistical parameters. The instrumentation in our experimental system acquires the voltage and current waveforms and their 50 Hz notch filtered versions at a sampling rate of 20 kHz for each waveform. We selected a local estimation window size of twice the fundamental period length, which corresponds to 800 samples (2 × 20 ms at 20 kHz). The feature vector has a length of 19. The first eight numbers in the feature vector correspond to the wavelet transform extrema for the four-level decomposition of the voltage waveform using the Daubechies-4 (db4) orthogonal wavelet. These four levels depict time-frequency localized signatures at different frequency resolutions. The extrema are simply the maximum and the minimum transform values around the instant of a PQ event. It was previously shown by several authors that the transform domain values exhibit high energy at or around PQ event instants. Usually
simple thresholding of these coefficient magnitudes is enough to detect the existence of a PQ event. However, classification between different classes of PQ events requires more waveform signatures as well as more sophisticated classifiers. In order to verify the usefulness of the transform coefficients at several decomposition levels, we have taken both the minimum and the maximum values corresponding to the four decomposition levels. At decomposition levels higher than four, the time resolution is decreased by a factor of 32 or more, which is below the desired time-resolution level. The ninth coefficient of the feature vector was selected according to a classical spectral analysis. We evaluated the signal energy exactly at the line frequency (50 Hz), took its ratio to the remaining spectral energy at all other frequencies, and took the reciprocal:

$$\nu_9 = \frac{\displaystyle\int_{-\infty}^{2\pi 50^{-}} \Phi_v(\omega)\,d\omega + \int_{2\pi 50^{+}}^{\infty} \Phi_v(\omega)\,d\omega}{\left.\Phi_v(\omega)\right|_{\omega = 2\pi 50}} \qquad (1)$$
where ν is the feature vector, ν_9 is its 9th element, and Φ_v(ω) is the power spectral density of the voltage waveform v(t). The remaining ten coefficients are obtained from the higher order statistical parameters of the 50 Hz notch filtered voltage waveform, including the local central cumulants of order 2, 3, and 4, and the skewness and kurtosis. For this parameter estimation, it is reasonable to keep the estimation window size a small integer multiple of the fundamental period length. The selected local estimation window size of twice the fundamental period length is statistically long enough to estimate the statistical parameters accurately, and short enough to resolve time localization accurately.

The motivation behind selecting higher order statistical parameters is that each PQ event can be modeled as a noise contribution over the voltage waveform. In fact, it was shown in [4] that the power system voltage waveform can be modeled as a combination of a pure sinusoid and noise components imposed upon that sinusoid. The sinusoidal component is not the informative part for event detection or classification, but its presence greatly perturbs the local statistical parameters. The noise component, on the other hand, contains valuable information in the case of PQ events and transients. For that reason, during data acquisition we removed the 50 Hz sinusoid from the voltage waveform using a Frequency Devices 70ASC-50 programmable filter adjusted to act as a very sharp (20th-order elliptic) 50 Hz notch filter. Under event-free operating conditions, the output waveform of the filter can be modeled as Gaussian. This model is quite accurate, because the voltage waveform may be noise-corrupted due to ambient conditions such as EMI-generating loads running on or near the system. By the central limit theorem, the combination of independent random sources adds up to a Gaussian process as the number of sources grows.
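As a rough illustration of the ninth feature and of the higher order statistical parameters described in this section, the following Python sketch computes them for one 800-sample window. It is a sketch under stated assumptions, not the authors' implementation: the integral in Eq. (1) is replaced by a sum over one-sided DFT bins, the kurtosis is taken as excess kurtosis, and the names `spectral_ratio`, `hos_parameters`, and the synthetic test signal are hypothetical. In the paper the statistical parameters are estimated on the notch filtered waveform, whereas the demo signal here is an unfiltered synthetic sinusoid with a third harmonic.

```python
import cmath
import math

FS = 20_000            # sampling rate in Hz (from the text)
F0 = 50                # line frequency in Hz
N = 2 * FS // F0       # two fundamental periods -> 800 samples


def spectral_ratio(window, line_bin=2):
    """Discrete stand-in for Eq. (1): energy off the line frequency / energy at it.

    With an 800-sample window at 20 kHz the bin spacing is 25 Hz,
    so the 50 Hz component falls in bin 2 (assumption of this sketch).
    """
    n = len(window)
    power = []
    for k in range(n // 2 + 1):          # one-sided spectrum, DC to Nyquist
        coeff = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                    for i, x in enumerate(window))
        power.append(abs(coeff) ** 2)
    return (sum(power) - power[line_bin]) / power[line_bin]


def hos_parameters(window):
    """Central cumulants of order 2, 3, 4 plus skewness and excess kurtosis."""
    n = len(window)
    mean = sum(window) / n
    m2 = sum((x - mean) ** 2 for x in window) / n
    m3 = sum((x - mean) ** 3 for x in window) / n
    m4 = sum((x - mean) ** 4 for x in window) / n
    c2, c3, c4 = m2, m3, m4 - 3 * m2 ** 2   # cumulants from central moments
    return c2, c3, c4, c3 / c2 ** 1.5, c4 / c2 ** 2


# Synthetic window: 50 Hz fundamental plus a 10% third harmonic.
v = [math.sin(2 * math.pi * F0 * i / FS)
     + 0.1 * math.sin(2 * math.pi * 3 * F0 * i / FS) for i in range(N)]
nu9 = spectral_ratio(v)
c2, c3, c4, skewness, kurtosis = hos_parameters(v)
```

In a sliding-window setting, these functions would be evaluated window by window, and the local maxima and minima of each statistical parameter around the event would be placed into the feature vector.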
3 SOM as a Classifier
Kohonen's SOM is a neural network that projects the data vectors Λ ∈ ℝ^n, which belong to a higher dimensional input space of dimension n, onto m codebook vectors of size n organized
in a two-dimensional lattice structure. SOM provides two fundamental pieces of information: the first is the clustering of the data and the second is the relationship between the clusters. The clustering is an unsupervised learning phase, which can be formulated as:

$$M_i(k) = M_i(k-1) + \alpha(k)\,\beta(i,c,k)\,\bigl(\Lambda(k) - M_i(k-1)\bigr), \quad \forall i,\ 1 \le i \le m \qquad (2)$$

where α(k) is the learning rate parameter, which is changed during the adaptation phase, and β(i,c,k) is the neighborhood function around c, where c is the index of the Best Matching Unit, found during training as:

$$c = \arg\min_{i} \bigl\lVert \Lambda(k) - M_i(k) \bigr\rVert \qquad (3)$$
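The update in Eq. (2) and the Best Matching Unit search in Eq. (3) can be sketched in a few lines of Python. This is only a minimal illustration: the Gaussian neighborhood function, the rectangular grid coordinates (the experiments use a hexagonal lattice), the decay schedules, and the random placeholder data are all assumptions of the sketch, and the helper names are hypothetical.

```python
import math
import random


def best_matching_unit(codebook, x):
    """Index c of the codebook vector closest to x, as in Eq. (3)."""
    return min(range(len(codebook)), key=lambda i: math.dist(codebook[i], x))


def som_step(codebook, grid, x, alpha, sigma):
    """Move every codebook vector toward x as in Eq. (2), in place.

    beta is an assumed Gaussian neighborhood on the map grid,
    centered on the best matching unit c.
    """
    c = best_matching_unit(codebook, x)
    for i, m in enumerate(codebook):
        beta = math.exp(-math.dist(grid[i], grid[c]) ** 2 / (2 * sigma ** 2))
        codebook[i] = [mj + alpha * beta * (xj - mj) for mj, xj in zip(m, x)]
    return c


random.seed(0)
grid = [(r, c) for r in range(5) for c in range(5)]                # 5x5 map
codebook = [[random.random() for _ in range(19)] for _ in grid]    # n = 19
data = [[random.random() for _ in range(19)] for _ in range(80)]   # placeholder features

steps = 500
for k in range(steps):
    frac = k / steps
    # Learning rate and neighborhood width shrink as training proceeds.
    som_step(codebook, grid, random.choice(data),
             alpha=0.5 * (1 - frac), sigma=2.0 * (1 - frac) + 0.5)
```

A wide neighborhood early in training orders the map globally, while the narrower neighborhood at the end lets each codebook vector settle near its local cluster.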
The relationship between clusters can be seen on the planar surface by checking the distances between the codebook vectors. Although it is difficult to deduce the exact relationship between them, since the dimension of the codebook vectors is much greater than the dimension of the planar surface, which is 2, this gives an insight into the classification regions. In this work the feature vectors Λ ∈ ℝ^n with n = 19 have been used to train a SOM of 5x5 neurons connected in a hexagonal lattice structure. The whole data set was obtained from 120 experiments. Sixty experiments were carried out for arcing faults: 30 of them with inductive and resistive loads only, and the other 30 with adjustable speed drives connected to the experimental setup in order to test the quality of classification. Another 60 data were obtained from motor start-up experiments in the same manner as for the arcing faults. Consequently, a data set of 4 classes is formed:
Class 1: Arcing fault with adjustable speed drives load.
Class 2: Arcing fault without adjustable speed drives load.
Class 3: Motor start-up with adjustable speed drives load.
Class 4: Motor start-up without adjustable speed drives load.
The set of 120 data is divided into two in order to obtain the training and the testing sets. For the training set, 20 different data have been selected at random from each class (80 in total). The remaining 40 data are used to test the results. After the codebook vectors have been obtained using the SOM training algorithm, the map is partitioned into subspaces that delimit the classification regions by the Learning Vector Quantization (LVQ3) algorithm. LVQ is a classification algorithm based on adjusting the Gaussian borders between the centers of the possible classification regions represented by the codebook vectors obtained with SOM [3]. There are several implementations of LVQ in the literature. In essence, the algorithm attempts to move the codebook vectors, in a supervised manner, to positions that reflect the centers of the clusters in the training data. The aim is to find the Gaussian borders between the codebook vectors belonging to different classes by decreasing the misclassification ratio. In the LVQ1 algorithm the training is done only roughly, whereas the LVQ2 and LVQ3 algorithms provide more refined approaches for fine tuning.
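The split described above (20 randomly chosen samples per class for training, the rest for testing) can be sketched as follows. The function name `split_per_class`, the fixed seed, and the placeholder label list are assumptions made for illustration; the paper does not state how the random selection was implemented.

```python
import random


def split_per_class(labels, n_train_per_class, seed=0):
    """Return (train_indices, test_indices) with a fixed number of
    randomly chosen training samples from every class."""
    rng = random.Random(seed)
    by_class = {}
    for idx, label in enumerate(labels):
        by_class.setdefault(label, []).append(idx)
    train = []
    for label in sorted(by_class):
        train.extend(rng.sample(by_class[label], n_train_per_class))
    chosen = set(train)
    test = [i for i in range(len(labels)) if i not in chosen]
    return sorted(train), test


# 4 classes with 30 experiments each, as in the text.
labels = [cls for cls in (1, 2, 3, 4) for _ in range(30)]
train_idx, test_idx = split_per_class(labels, 20)
```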
The LVQ3 algorithm can be explained as:

$$\begin{aligned} M_i(k+1) &= M_i(k) - \mu(k)\,\bigl(\Lambda(k) - M_i(k)\bigr) \\ M_j(k+1) &= M_j(k) + \mu(k)\,\bigl(\Lambda(k) - M_j(k)\bigr) \end{aligned} \qquad (4)$$
where M_i and M_j are the two closest codebook vectors to Λ(k), such that Λ(k) and M_j belong to the same class while Λ(k) and M_i belong to different classes; furthermore, Λ(k) must fall into the zone of a window defined as:

$$\min\left(\frac{d_1}{d_2}, \frac{d_2}{d_1}\right) > s, \quad \text{where } s = \frac{1 - \text{window}}{1 + \text{window}} \qquad (5)$$
where d_1 and d_2 are the distances from Λ(k) to the codebook vectors M_i and M_j, respectively. In addition, the following update is applied:

$$\begin{aligned} M_i(k+1) &= M_i(k) + \varepsilon(k)\,\mu(k)\,\bigl(\Lambda(k) - M_i(k)\bigr) \\ M_j(k+1) &= M_j(k) + \varepsilon(k)\,\mu(k)\,\bigl(\Lambda(k) - M_j(k)\bigr) \end{aligned} \qquad (6)$$

where M_i and M_j are the two closest codebook vectors to Λ(k), and Λ(k), M_i and M_j all belong to the same class. The parameters µ(k) and ε(k) are the learning rates of the algorithm.
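One LVQ3 update for a single labeled sample, following Eqs. (4) to (6), might look like the sketch below. It is illustrative only: the function name `lvq3_step` and the parameter values in the example are hypothetical, and the sketch assumes that the window test of Eq. (5) applies only to the mixed-class case of Eq. (4), which is one common reading of the algorithm.

```python
import math


def lvq3_step(codebook, classes, x, x_class, mu, eps, window):
    """Apply one LVQ3 update for sample x, in place.

    Returns True if any codebook vector was moved.
    """
    order = sorted(range(len(codebook)), key=lambda i: math.dist(codebook[i], x))
    a, b = order[0], order[1]                    # the two closest units
    da, db = math.dist(codebook[a], x), math.dist(codebook[b], x)
    same_a, same_b = classes[a] == x_class, classes[b] == x_class

    def move(i, rate):
        codebook[i] = [m + rate * (xj - m) for m, xj in zip(codebook[i], x)]

    if same_a and same_b:
        # Eq. (6): both neighbors share the class of x, small pull toward x.
        move(a, eps * mu)
        move(b, eps * mu)
        return True
    if same_a != same_b:
        s = (1 - window) / (1 + window)
        # Eq. (5): x must lie in the window around the border between the units.
        if da > 0 and min(da / db, db / da) > s:
            j, i = (a, b) if same_a else (b, a)  # j: same class, i: other class
            move(j, +mu)                         # Eq. (4): attract M_j
            move(i, -mu)                         # Eq. (4): repel M_i
            return True
    return False


# Tiny one-dimensional example with two codebook vectors of different classes.
codebook = [[0.0], [1.0]]
classes = ["arc", "mot"]
updated = lvq3_step(codebook, classes, [0.45], "arc", mu=0.1, eps=0.2, window=0.3)
```

The window restricts the corrective update to samples near the current class border, so codebook vectors deep inside a class region are left untouched.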
4 Classification Results
There are several techniques in the literature for visualizing the results of SOM. A well-known one is the U-matrix. This method identifies the distances between neighboring units and thus visualizes the cluster structure of the map. Note that the U-matrix visualization has many more cells than the component planes. This is because the U-matrix shows not the codebook vectors but the distances between them. High values indicate a large distance between neighboring map units and identify possible cluster borders. Clusters are typically uniform areas of low values. Refer to the colorbar in Fig. 1 to see which colors correspond to high values. In the map, there appear to be two clusters. A rough inspection of the U-matrix given in Fig. 1 suggests two possible classification regions after training. Here the dark regions show the close connections in the component plane, which represent the clusters, and the light areas form the cluster borders. In order to delineate the exact borders after LVQ3, the Davies-Bouldin clustering index method is used. This method is based on k-means clustering and yields the classification borders between the clusters. Testing the classes found after the SOM and LVQ3 methods with the 40-point test set, it has been observed that 100% correct classification is obtained (Fig. 1). The same analysis has been repeated with different map sizes, and similar classification results have been obtained. The method for clustering and classification and the feature vectors obtained from the voltage waveform are observed to provide an adequate
match for identifying the types of PQ defects caused by motor start-up and arcing fault events.

[Figure 1: U-matrix titled "Arcing Fault & Motor Start-up Event Classes U-matrix"; colorbar ranging from 0.384 to 2.87; map units labeled arc0, arc1, mot0, and mot1.]
Fig. 1. 5x5 map formation after SOM training and LVQ3. Here arc0 represents arcing faults with adjustable speed drives, and arc1 represents arcing faults with inductive and resistive loads only, without adjustable speed drives. Similarly, mot1 and mot0 represent motor start-up events.
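The Davies-Bouldin index mentioned in this section scores a partition by comparing the scatter inside each cluster with the distance between cluster centroids; lower values indicate more compact, better separated clusters. The sketch below computes the index for a given assignment of points to clusters. It is a generic illustration rather than the authors' code: the function names and the toy points are hypothetical, and in practice the assignments would come from k-means runs over the codebook vectors for several candidate numbers of clusters.

```python
import math


def centroid(points):
    """Component-wise mean of a list of equal-length vectors."""
    return [sum(col) / len(points) for col in zip(*points)]


def davies_bouldin(points, assignment):
    """Davies-Bouldin index of a partition (needs at least two clusters)."""
    clusters = {}
    for p, a in zip(points, assignment):
        clusters.setdefault(a, []).append(p)
    keys = sorted(clusters)
    cents = {k: centroid(clusters[k]) for k in keys}
    # Scatter: mean distance of the cluster members to their centroid.
    scatter = {k: sum(math.dist(p, cents[k]) for p in clusters[k]) / len(clusters[k])
               for k in keys}
    total = 0.0
    for i in keys:
        total += max((scatter[i] + scatter[j]) / math.dist(cents[i], cents[j])
                     for j in keys if j != i)
    return total / len(keys)


# Toy example: two well separated pairs of points.
points = [[0.0, 0.0], [0.0, 2.0], [10.0, 0.0], [10.0, 2.0]]
good = davies_bouldin(points, [0, 0, 1, 1])   # pairs grouped by proximity
bad = davies_bouldin(points, [0, 1, 0, 1])    # pairs grouped across the gap
```

Comparing the index across candidate partitions and keeping the one with the smallest value is the usual way such an index is used to choose the number of clusters.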
References
1. Wael R., Ibrahim A., Morcos M. M., Artificial Intelligence and Advanced Mathematical Tools for Power Quality Applications: A Survey, IEEE Trans. on Power Delivery, Vol. 17, No. 2, April 2002.
2. Wang M., Mamishev A. V., Classification of Power Quality Events Using Optimal Time-Frequency Representations, Part 1: Theory, IEEE Trans. on Power Delivery, Vol. 19, No. 3, July 2004.
3. Kohonen T., The Self Organizing Map, Proceedings of the IEEE, Vol. 78, No. 9, 1990, pp. 1464-1480.
4. Yang H. T., Liao C. C., A De-Noising Scheme for Enhancing Wavelet-Based Power Quality Monitoring System, IEEE Trans. on Power Delivery, Vol. 19, No. 1, January 2004.