www.ietdl.org Published in IET Generation, Transmission & Distribution Received on 21st March 2010 Revised on 26th June 2010 doi: 10.1049/iet-gtd.2010.0201
ISSN 1751-8687
Ensemble decision trees for phasor measurement unit-based wide-area security assessment in the operations time frame
S.R. Samantaray1, I. Kamwa2, G. Joos1
1 Department of Electrical and Computer Engineering, McGill University, McConnell, 633, 3480 University Street, Montreal, Quebec H3A 2A7, Canada
2 Hydro-Québec/IREQ, Power System Analysis, Operation and Control, Varennes QC J3X 1S1, Canada
E-mail:
[email protected]
Abstract: This study proposes ensemble decision trees for phasor measurement unit (PMU)-based wide-area security assessment to provide early warnings of deteriorating system conditions. In the proposed technique, the wide-area response signals in real-time operation are captured 1 and 2 s after fault clearing, from the monitoring buses where PMUs are placed. These wide-area post-disturbance records are processed in the time and frequency domains to extract selected decision features, such as the peak spectral density of the angle, the frequency and their dot product evaluated over the grid areas, called wide-area severity indices (WASI). WASI are used as input features to train random forests (RFs) and build an effective predictor for early warnings in security assessment. RF-based learning not only provides high accuracy but is also effective in valuing the importance of, and the interaction among, the various WASI input features when developing a reliable predictor. The RF has been successfully tested for classifying both system-wise and area-wise NERC-compliant contingencies, using 55 196 cases (76% stable) from system operations studied on the Hydro-Québec network, providing 99.9% reliability.
1 Introduction
The dynamic security assessment (DSA) of a power system usually refers to the problem of how well a particular system condition can withstand all credible contingencies, taking into consideration the detailed dynamic characteristics of the system. As the increase in electric power demand outpaces the installation of new transmission and generation facilities, power systems are forced to operate with narrower margins of security. Security is defined as the capability of guaranteeing the continuous operation of a power system under normal operation, even following some significant perturbations. As security is a major, if not ultimate, goal of power system operation and control, a fast and reliable security assessment is necessary. DSA can deal with transient stability problems and/or voltage stability problems that, respectively, require transient stability assessment and voltage stability assessment. One of the important requirements of security assessment is to provide early warning of deteriorating system conditions, so that the operators can take corrective actions. This should provide more diagnostic tools than are currently available and allow for the more effective use of automatic controls for self-correction, such as automatic switching or controlling the flow of power.

The most frequently used defence plans against these rare contingencies are based on event detection [1], using breaker status and fault signals from relays in combination, basically because the more appealing response-based approach [2] is not yet fast enough to allow for effective remedial actions. However, with the recent advances in wide-area measurement, fast-response-based stability assessment of extreme contingencies now seems to have better prospects. This new approach, which ranks the contingency severity/stability using dynamic information measured on-line, is potentially more general and robust than event-detection schemes alone, which rely heavily on off-line simulations of system conditions intentionally set to be more conservative than those prevailing at the present time. The lack of transmission system expansion means that, in any case, current defence plans will reach their limit at a time when it may be appropriate to supplement them with more refined and context-sensitive wide-area response-based remedial actions or special protection systems.

IET Gener. Transm. Distrib., 2010, Vol. 4, Iss. 12, pp. 1334–1348 doi: 10.1049/iet-gtd.2010.0201 © The Institution of Engineering and Technology 2010

In recent years, a comprehensive time–frequency-based approach for contingency severity ranking and rapid stability assessment [3] has been proposed by the present authors. The aim was to assist in the classical off-line simulations-based DSA task [4–12] by correctly classifying all single or multiple contingencies that may result in a loss of stability in the first 20 s following fault-clearing. In this technique, a number of strategic monitoring buses are selected where the phasor measurement units (PMUs) are located to capture representative wide-area voltage magnitudes and angles during real-time operation [13, 14]. The short-time FFT (STFFT) [15, 16] is then dynamically applied to the responses to extract selected decision features as the post-disturbance time frame evolves. It is shown that frequency-domain features, such as the peak spectral density of the angle, the frequency and their dot product evaluated over the grid areas and referred to the system centre-of-inertia (COI), are reliable time-varying stability indicators that can form the basis of an entirely reliable classification system.

Recently, the decision tree (DT) has been one of the most widely used tools for developing classifiers for rapid stability assessment. In our earlier work [17], a rule-based classifier initialised by a DT was developed. However, even though the accuracy and security were improved compared to heuristic fuzzy logic [3], the reliability stays at around 80%, which is very low for stability assessment. This paper aims at closing this large reliability gap by developing a classifier using random forests (RFs) [18–23]. The proposed scheme starts by capturing wide-area severity indices (WASI) features from the studied Hydro-Québec (HQ) power network, which are then used to train the RF to build an effective and reliable predictor for early warning of system deterioration in the operations time frame. The open-source DT software known as Rattle [19] is then used to train the classifier with RF on extended data sets, both system-wise and area-wise, for 1 and 2 s early termination. RF is able to provide a reliability of more than 99.9% with highly improved accuracy and security (close to 99%) for the proposed stability assessment. The developed RF-based predictor has been successfully tested for classifying both system-wise and area-wise NERC-compliant contingencies, using 55 196 cases (76% stable) from system operations studies on the HQ network. The proposed scheme for early warnings in security assessment is shown in Fig. 1. The scheme starts with monitoring the post-disturbance records at the PMU buses of the system studied, followed by feature extraction, and provides the final output from the RF-based predictor in terms of a stable (−1) or unstable (+1) event.

2 System studied using wide-area monitoring
The system studied using wide-area monitoring is the 783-bus system representing the HQ grid model used for operations planning [24]. For the present study, winter and summer operation planning models are used with about 1000 load flow patterns generated by the transfer limit search and critical clearing time search processes, based on 30 carefully chosen 735 kV contingencies. Some variants of these models include wide-area stabilisers [14] currently under development at HQ. The system studied for building up the proposed RF-based predictor is shown in Fig. 2. The complete network is divided into a number of electrically coherent areas [25] associated with weak inter-tie lines or systematically identified stability boundaries [26]. In the representation shown, these inter-tie lines are the primary clusters of the system, defining the finite number of possible ways to split the system into unconnected islands with embedded generation. Further, an optimum set of PMUs is selected for monitoring using a sequential addition algorithm that expands their number while maximising the amount of information added by each new PMU [13]. During the process of sequential addition, the entropy-based incremental information eventually becomes very small, which provides the stopping criterion for deciding the minimum number of PMUs that allows good coverage of the dynamic response. In the proposed simulation study, only the buses with PMUs are considered for monitoring the wide-area post-disturbance records following a contingency. The system is designed in such a way that the simulation can be smoothly replaced by actual measurements in a real-time situation, assuming the same PMU configuration.
Figure 1 Proposed scheme for early warnings in security assessment

Figure 2 Studied system: monitoring of the 783-bus HQ system with intra-area and inter-tie PMUs distributed over nine electrically coherent areas. Inertia (H)/generation (MW) data are typical values for illustrative purposes

3 Wide-area severity indices

3.1 WASI features

The wide-area severity indices are defined by an equivalent inertia associated with each area, representing the total inertia of the generation located in that area. Assuming that each area is coherent following a disturbance, it is reasonable to assimilate its behaviour to that of a single large machine with the same inertia and generation. Even though this assumption is not perfect, it offers a straightforward means of deriving the COI, which is very useful information for tracking the stability of interconnected areas. In real time, a defence plan or a special protection system (SPS) [24, 27, 28] could readily derive these inertia constants through low-speed communication with the control centre state-estimator, which holds the actual load and generation dispatch. Time-domain features such as voltage and COI angle deviation are considered because they are well known as key descriptors of the power grid stability state.

In determining the level of system stress induced by a contingency, the most obvious criterion is the transient stability after a single or multiple swings of the generator angles. In this case, the system is deemed unstable if there is a loss of synchronism between any of the areas monitored earlier in the prescribed observational time frame, that is, if the angle shift exceeds 180°. When an early discovery of impending instability is sought, this criterion is useless because it offers little pre-emption time. Instead, the COI angle deviation with respect to the pre-fault value is computed for each area, and the maximum value of these differences is found to be a good measure of the topological stress induced by the contingency.
It has been shown previously that frequency-domain features, such as the peak spectral density of the product of angle and frequency evaluated over the grid areas and referred to the system COI, are reliable time-varying stability indicators that can form the basis for assessing contingency severity [3]. In this paper, the WASI concept is extended to rapid stability assessment by simultaneously using a short (16-cycle) and a long (192-cycle) short-time Fourier transform analysis window [15, 16], able to respond fast enough to first-swing instability as well as to post-inertial and mid-term dynamic instability [10]. Additionally, multi-signal state-space-based modal analysis [29] of the output of the long filters is shown to be very useful for assessing the post-contingency damping condition of the grid [30]. Following a fault (which is automatically detected by the PMUs), the following serial computation sequence is started:
1. Compute the pilot phasor and frequency of each area i by averaging the within-area measurements: $V_i$, $\omega_i$.
2. Compute the system COI of the angle and frequency variables using the available area inertias ($M_i$).
3. Project the pilot angle and frequency of step 1 into the COI reference of step 2: $\theta_i^{COI}$, $\omega_i^{COI}$.
4. Compute the shift from the pre-fault to post-fault COI angle for each area: PostFltAngle.
5. Compute the power spectrum of the dot-product of energy and angle to obtain a tracking severity index by area.

From the above algorithm, WASI can be defined in time and frequency domains as follows [17]:

VminTs: system-wise minimum voltage over the time span of T = 1 s or T = 2 s after fault-clearing
VminTs_j: area-wise minimum voltage over the time span of T = 1 s or T = 2 s after fault-clearing, considering only the buses in area j
VminTsR: area-wise minimum voltage over the time span of T = 1 s or T = 2 s after fault-clearing, considering only the buses in the faulted area
FastWASI(Ts): system-wise frequency-domain severity index defined over the time span of T = 1 s or T = 2 s after fault-clearing
FastWASI(Ts)_j: area-wise frequency-domain severity index for area j, defined over the time span of T = 1 s or T = 2 s after fault-clearing
FastWASITsR: area-wise frequency-domain severity index for the faulted area, defined over the time span of T = 1 s or T = 2 s after fault-clearing
VLowPass2s_j, VCriterion2s_j: filtered area-wise minimum voltage
VLowPass2s, VCriterion2s: filtered system-wise minimum voltage
TDEF: fault duration
PostFltAngle: system-wise maximum COI angle deviation from steady-state to fault-clearing time
PostFltAngle_j: area-wise maximum COI angle deviation from steady-state to fault-clearing time, for area j
TsimSt600: duration of simulation up to normal end or loss of synchronism, whichever occurs first
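Steps 1 to 4 of the computation sequence can be illustrated with a minimal Python sketch. It assumes that the COI is the inertia-weighted mean of the area quantities and that the per-area shift is taken in absolute value; the function names, the two-area data and the inertia values are hypothetical and are not taken from the HQ study.

```python
from statistics import fmean

def pilot(measurements):
    """Step 1: average the within-area PMU measurements of each area."""
    return {area: fmean(vals) for area, vals in measurements.items()}

def coi(values, inertia):
    """Step 2: inertia-weighted mean of an area quantity (angle or frequency)."""
    total = sum(inertia[a] for a in values)
    return sum(inertia[a] * values[a] for a in values) / total

def to_coi_frame(values, inertia):
    """Step 3: refer each area quantity to the system COI."""
    ref = coi(values, inertia)
    return {a: v - ref for a, v in values.items()}

def post_fault_angle(pre, post, inertia):
    """Step 4: per-area shift of the COI-referred angle and its maximum.

    Using the absolute value of the shift is an assumption of this sketch.
    """
    pre_c = to_coi_frame(pre, inertia)
    post_c = to_coi_frame(post, inertia)
    shift = {a: abs(post_c[a] - pre_c[a]) for a in pre}
    return shift, max(shift.values())

# Hypothetical two-area example (angles in degrees, arbitrary inertia units)
M = {"A": 3.0, "B": 1.0}
pre = pilot({"A": [9.0, 11.0], "B": [40.0]})
post = pilot({"A": [12.0], "B": [69.0, 71.0]})
shift, worst = post_fault_angle(pre, post, M)
print(shift, worst)
```

Here `shift` plays the role of the area-wise PostFltAngle_j values and `worst` that of the system-wise PostFltAngle.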
3.2 WASI and stability condition

The relationship between WASI features and stability conditions is depicted in Fig. 3. It shows the energy-based FastWASI2s, which is the system-wise maximum value of the dot-product PSD, and the voltage-based Vmin1sR, which is the 1 s voltage minimum over the faulted area. These box-plots [31] are categorised according to the stability condition of the case, with OK = −1 (stable) and OK = 1 (unstable). Each box covers 50% of the sample: the left limit of the box defines the first quartile of the data, whereas the right limit corresponds to the third quartile. It can be concluded from the box-plots that the relationship FastWASI2s < −3.2 includes 75% of the stable cases, whereas FastWASI2s > −3 includes 75% of the unstable cases. Thus FastWASI2s is able to split the stability domain into two largely disjoint sets. Similarly, Vmin1sR > 0.91 includes 75% of the stable cases and Vmin1sR < 0.90 includes 75% of the unstable cases.

Fig. 4 illustrates the same features using waveforms from two cases. The first (Fig. 4a) is very stable, with good transient voltage profiles and well-damped angle shifts. This is reflected in the basic features as FastWASI2s < −3.4 and Vmin1sR > 0.9. Since Vmin1s = Vmin1sR, it can be concluded that the faulted area is the most disturbed area system-wise. The second case (Fig. 4b) is derived from that in Fig. 4a by increasing the power transfer in the interface studied by 2000 MW. The system remains stable and relatively well-damped, but the transient-voltage profile is unacceptable as it violates the planning criterion on Vmin1s. When the stability program runs these two cases, the termination code is −1 for the first and +1 for the second.
4 Ensemble DT (RFs)
4.1 Background

RFs [18] are a large combination of de-correlated tree predictors such that each tree depends on the values of a random vector sampled independently. Individual trees are noisy and unstable but, when grown sufficiently deep, they have relatively low bias. They are therefore ideal candidates for ensemble growing, as they can capture complex interactions while fully benefiting from aggregation-based variance reduction. Using a random selection of features to split each node and re-sampling (with replacement) the training set to grow each tree yields error rates that are de-correlated and more robust with respect to noise. The generalisation error for forests converges to a limit as the number of trees in the forest becomes large. The basic idea of most ensemble tree growing procedures is that for the kth tree (k ≤ ntree, the number of trees in the ensemble) a random vector Fk is generated, independent of the past random vectors F1, ..., Fk−1 but with the same distribution, and a single tree is grown using the training set S and the set of attributes in Fk, resulting in a classifier Tk(x, Fk), where x is an input vector. In random split selection, F consists of a number ntry of independent random integers, where ntry < na, the number of attributes in S.
Figure 3 Example of two key features able to split the databases into two sub-spaces of stable (OK = −1) and unstable (OK = 1) cases
a Left: frequency domain
b Right: time domain
Sample count: 55 196 with 76% stable
An RF consists of a collection of tree-structured classifiers {Tk(x, Fk), k = 1, ..., ntree}, where {Fk} are independent identically distributed random vectors and each tree casts a unit vote for the most popular class at input x. An algorithmic view of the RF growing process is summarised below [17]:

1. For k = 1 to ntree:
a. Draw a bootstrap sample S∗ of size N from the training data S (which contains M > N samples).
b. Grow an RF tree Tk(x, Fk) on the bootstrapped data by recursively repeating the steps below for each terminal node of the tree, until no other split is possible (unpruned tree of maximal depth):
i. Select ntry variables at random from the na WASI features.
ii. Pick the best variable/split-point among the ntry.
iii. Split the node into two daughter nodes.
2. Output the ensemble of trees {Tk(x, Fk), k = 1, ..., ntree}.

Although RF is a relatively young data mining tool, scholars [20, 21] have started recognising its strengths: (i) it is simple and easy to use; (ii) it has very high accuracy; (iii) it is relatively robust to outliers and noise; (iv) it gives useful internal estimates of error, strength and correlation; (v) it does not over-fit when a large number of trees is selected and (vi) it is insensitive to the choice of split.
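The growing procedure above can be sketched in a few lines of plain Python. This is an illustrative toy, not the R implementation used in the study: the split criterion is assumed to be the Gini impurity, a node stops splitting when none of its ntry candidate features reduces the impurity, and the two-feature data set (a FastWASI-like and a Vmin-like column) is hypothetical.

```python
import random
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_split(X, y, features):
    """Best (feature, threshold) among `features`, or None if no split helps."""
    best, best_score, n = None, gini(y), len(y)
    for f in features:
        # Dropping the largest value keeps both daughter nodes non-empty
        for thr in sorted({row[f] for row in X})[:-1]:
            left = [y[i] for i in range(n) if X[i][f] <= thr]
            right = [y[i] for i in range(n) if X[i][f] > thr]
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if score < best_score:
                best, best_score = (f, thr), score
    return best

def grow_tree(X, y, ntry, rng):
    """Steps i-iii applied recursively; leaves store class counts (no pruning)."""
    if len(set(y)) == 1:
        return Counter(y)
    features = rng.sample(range(len(X[0])), ntry)    # i. random feature subset
    split = best_split(X, y, features)               # ii. best variable/split-point
    if split is None:
        return Counter(y)
    f, thr = split                                   # iii. two daughter nodes
    lo = [i for i in range(len(y)) if X[i][f] <= thr]
    hi = [i for i in range(len(y)) if X[i][f] > thr]
    return (f, thr,
            grow_tree([X[i] for i in lo], [y[i] for i in lo], ntry, rng),
            grow_tree([X[i] for i in hi], [y[i] for i in hi], ntry, rng))

def grow_forest(X, y, ntree, ntry, n_boot=None, seed=0):
    """Step 1: one tree per bootstrap sample drawn with replacement."""
    rng = random.Random(seed)
    n_boot = n_boot or len(y)
    forest = []
    for _ in range(ntree):
        idx = [rng.randrange(len(y)) for _ in range(n_boot)]
        forest.append(grow_tree([X[i] for i in idx], [y[i] for i in idx], ntry, rng))
    return forest

def leaf(tree, x):
    """Class counts of the leaf into which x falls."""
    while isinstance(tree, tuple):
        f, thr, lo, hi = tree
        tree = lo if x[f] <= thr else hi
    return tree

# Hypothetical toy data: -1 = stable, +1 = unstable
X = [[-3.6, 0.95], [-3.4, 0.93], [-3.3, 0.92], [-2.9, 0.89], [-2.7, 0.88], [-2.5, 0.85]]
y = [-1, -1, -1, 1, 1, 1]
forest = grow_forest(X, y, ntree=25, ntry=1)
votes = [leaf(t, X[0]).most_common(1)[0][0] for t in forest]
```

The list `votes` holds one class per tree for a single input, which is the raw material for the ensemble prediction.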
4.2 Prediction from ensemble trees

In an ensemble of trees the predictions of all individual trees need to be combined. For classification, the class that most trees vote for is returned as the prediction of the ensemble

$$\hat{C}_{RF}^{n_{tree}}(x) = \text{majority vote}\,\{\hat{C}_k(x),\ k = 1, \ldots, n_{tree}\} \qquad (1)$$

where $\hat{C}_k(x)$ is the class prediction of the kth RF tree. For predicting probabilities, that is, relative class frequencies, the results of the single trees are averaged

$$\hat{P}_{RF}^{n_{tree}}\left(\hat{C}_{RF}^{n_{tree}} \in \{S, I\} \mid x\right) = \frac{1}{n_{tree}} \sum_{k=1}^{n_{tree}} \hat{P}_{T_k(F_k, S)}\left(\hat{C} \in \{S, I\} \mid x\right) \qquad (2)$$

where $\hat{P}_{T_k(F_k, S)}$ denotes the probability associated with an observation x by the RF tree Tk(x, Fk). A traditional DT essentially represents an explicit decision boundary, and an instance E is classified into class c if E falls into the decision area (a leaf in the DT) corresponding to c [23]. The class probability p(c|E) is typically estimated by the fraction of instances of class c in the leaf into which E falls. This probability estimate is very crude when the tree is pruned, because all the instances falling into the same leaf have the same class probability. More accurate probability estimates require unpruned trees [32], which are the backbone of the RFs. Stated otherwise, the RF predictor has the additional advantage of providing a stability or instability level of the event through probability-based ranking.
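Equations (1) and (2) amount to a vote count and a plain average. The sketch below shows both on hypothetical per-tree outputs; the class labels "S" and "I" follow the set {S, I} of (2), the numbers are invented for illustration, and ties in the vote are broken in favour of the class encountered first, which is an arbitrary choice of this sketch.

```python
from collections import Counter

def majority_vote(tree_classes):
    """Eq. (1): the class predicted by most trees."""
    return Counter(tree_classes).most_common(1)[0][0]

def averaged_probability(tree_probs, cls):
    """Eq. (2): mean over trees of each tree's leaf estimate of P(cls | x)."""
    return sum(p[cls] for p in tree_probs) / len(tree_probs)

# Hypothetical leaf class frequencies returned by four trees for one input x
tree_probs = [
    {"S": 1.00, "I": 0.00},
    {"S": 0.75, "I": 0.25},
    {"S": 0.25, "I": 0.75},
    {"S": 1.00, "I": 0.00},
]
tree_classes = [max(p, key=p.get) for p in tree_probs]   # one vote per tree
print(majority_vote(tree_classes))                # S
print(averaged_probability(tree_probs, "I"))      # 0.25
```

The averaged probability can be used to rank events by their estimated level of instability, whereas the vote only returns the class.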
Figure 4 Examples of two apparently stable cases a Very stable b Marginally stable with violation of transient-voltage criteria (about 2000 MW of additional power transfer in the critical interface)
Assuming that the probability estimates from individual trees are independent random variables, each with variance σ², the variance of the average in (2) is σ²/ntree, which confirms that the RF leads seamlessly to improved probability estimates [20].

In addition to the ordinary prediction described above, RFs have a so-called out-of-bag (OOB) prediction. Remember that each tree is built on a bootstrap sample S∗ that serves as a learning set for this particular tree. S∗ contains only about two-thirds of the observations [18]; the remaining OOB observations, that is, those M − N samples not participating in the training of a given tree, can serve as a 'built-in' test sample for computing the prediction accuracy of that tree. The advantage of the OOB error is that a more realistic estimate of the error rate can be obtained. If we feed the RF inducer with S containing only 70% of the original data and keep the rest for testing, given that each tree is trained on two-thirds of the data only, it turns out that only about 50% (0.7 × 2/3 ≈ 0.47) of the data are actually seen by a given RF tree at the learning stage. If the resulting predictor works well on the external test set, we have to admit that it is a very robust and general model.
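The 'two-thirds' figure can be recovered with a short calculation, given here under the assumption (standard for the bootstrap, but not stated in the text) that the bootstrap sample has the same size n as the data set it is drawn from.

```latex
% Probability that a given observation is never drawn in n draws with replacement
P(\text{observation is OOB}) = \left(1 - \frac{1}{n}\right)^{n}
  \;\xrightarrow[n \to \infty]{}\; e^{-1} \approx 0.368

% Hence the expected fraction of distinct observations seen by one tree
P(\text{observation is in } S^{*}) \approx 1 - e^{-1} \approx 0.632 \approx \tfrac{2}{3}

% With only 70% of the original data passed to the RF inducer
0.7 \times 0.632 \approx 0.44
```

With the rounded value of two-thirds the last product is about 0.47, so either way a single tree sees a little under half of the original data.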
4.3 Relative importance of the variables

Single classification trees are easily interpretable, both intuitively at first glance and descriptively when looking in detail at the tree structure. In particular, variables that are not included in the tree did not contribute to the model. An ensemble of trees has the advantage that it gives each variable the chance to appear in different contexts with different covariates and can thus better reflect its potentially complex effect on the response. Moreover, order effects induced by the recursive variable selection scheme employed in constructing the single trees are eliminated by averaging over the entire ensemble. Therefore, in RFs, variable importance measures are computed to assess the relevance of each variable over all trees of the ensemble.

The most advanced variable importance measure available in RFs is the 'permutation accuracy importance' measure. Its rationale is the following: by randomly permuting the values of a predictor variable $X_j$, its original association with the response Y is broken. When the permuted variable $X_j$, together with the remaining non-permuted predictor variables, is used to predict the response for the OOB observations, the prediction accuracy (i.e. the number of observations classified correctly) decreases substantially if the original variable $X_j$ was associated with the response. Thus, a reasonable measure for variable importance is the difference in prediction accuracy before and after permuting $X_j$, averaged over all trees. Let $S^{(t)}$ be the OOB sample for a tree t, with $t \in \{1, \ldots, n_{tree}\}$. Then the importance of variable $X_j$ in the tree t is

$$VI^{(t)}(X_j) = \frac{\sum_{i \in S^{(t)}} I\left(y_i = \hat{y}_i^{(t)}\right)}{|S^{(t)}|} - \frac{\sum_{i \in S^{(t)}} I\left(y_i = \hat{y}_{i,\psi_j}^{(t)}\right)}{|S^{(t)}|} \qquad (3)$$

where $y_i$ is the observed class of observation i, $\hat{y}_i^{(t)} = f^{(t)}(x_i)$ is the predicted class for observation i before and $\hat{y}_{i,\psi_j}^{(t)} = f^{(t)}(x_{i,\psi_j})$ is the predicted class for observation i after permuting its value of variable $X_j$, that is, with $x_{i,\psi_j} = (x_{i,1}, \ldots, x_{i,j-1}, x_{\psi_j(i),j}, x_{i,j+1}, \ldots, x_{i,p})$. It can be noted that $VI^{(t)}(X_j) = 0$ by definition if $X_j$ is not in the tree. The raw importance score for each variable is then computed as the average importance over all trees

$$VI(X_j) = \frac{\sum_{t=1}^{n_{tree}} VI^{(t)}(X_j)}{n_{tree}} \qquad (4)$$

Since the individual importance scores $VI^{(t)}(X_j)$ are computed from ntree independent bootstrap samples, a simple test for the relevance of variable $X_j$ can be constructed based on the central limit theorem for the mean importance $VI(X_j)$. Thus, if each individual variable importance $VI^{(t)}$ has standard deviation σ, the mean importance from ntree replications has standard error $\sigma/\sqrt{n_{tree}}$. Therefore, under the null hypothesis of zero variable importance, the z-score

$$z(X_j) = \frac{VI(X_j)}{\hat{\sigma}/\sqrt{n_{tree}}} \qquad (5)$$

is asymptotically standard normal. Hence, when the z-score exceeds the α-quantile of the standard normal distribution, the null hypothesis of zero importance for variable $X_j$ is rejected. Note that the averaging and scaling is not with respect to the sample size n but ntree, the number of trees in the ensemble.

Table 1 Summary count of the data sets used in the study

Training data     Extended data file               Flat data file
Model             RF (210 trees)    Single tree    Single tree
Scenario          S1                S2             S3

System-wise
HQ_1s             93 425            93 425         55 196
  stable          42 453            42 453         42 453
HQ_2s             93 425            93 425         55 196
  stable          42 453            42 453         42 453

Area-wise_1s
BJN_1s            16 434            16 434         10 647
  stable          8718              8718           8718
BJS_1s            24 725            24 725         16 766
  stable          14 113            14 113         14 113
CHU_1s            12 440            12 440         8684
  stable          7432              7432           7432
MQ_1s             39 826            39 826         19 099
  stable          12 190            12 190         12 190

Area-wise_2s
BJN_2s            16 434            16 434         10 647
  stable          4680              4680           8718
BJS_2s            24 725            24 725         16 766
  stable          14 113            14 113         14 113
CHU_2s            12 440            12 440         8684
  stable          7432              7432           7432
MQ_2s             39 826            39 826         19 099
  stable          12 190            12 190         12 190
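The permutation importance of (3) to (5) in Section 4.3 can be sketched as follows. Each "tree" is represented here simply by a prediction function together with its OOB sample; the threshold rule, the data and the function names are hypothetical and serve only to show the mechanics, not the R implementation used in the study.

```python
import math
import random
import statistics

def tree_importance(predict, X_oob, y_oob, j, rng):
    """Eq. (3): OOB accuracy before minus after permuting feature j."""
    n = len(y_oob)
    before = sum(predict(x) == y for x, y in zip(X_oob, y_oob)) / n
    column = [x[j] for x in X_oob]
    rng.shuffle(column)                       # breaks the link between X_j and Y
    X_perm = [x[:j] + [v] + x[j + 1:] for x, v in zip(X_oob, column)]
    after = sum(predict(x) == y for x, y in zip(X_perm, y_oob)) / n
    return before - after

def forest_importance(trees, j, seed=0):
    """Eq. (4): mean importance over trees; Eq. (5): its z-score."""
    rng = random.Random(seed)
    scores = [tree_importance(p, X, y, j, rng) for p, X, y in trees]
    mean = statistics.fmean(scores)
    sd = statistics.stdev(scores)             # needs at least two trees
    z = mean / (sd / math.sqrt(len(scores))) if sd > 0 else float("nan")
    return mean, z

# Hypothetical "trees": a threshold rule on feature 0 only (feature 1 unused)
def rule(x):
    return 1 if x[0] > -3.0 else -1

X_oob = [[-3.5, 0.94], [-3.2, 0.92], [-2.8, 0.90], [-2.4, 0.86]]
y_oob = [-1, -1, 1, 1]
trees = [(rule, X_oob, y_oob)] * 3

print(forest_importance(trees, j=0))   # non-negative mean importance
print(forest_importance(trees, j=1))   # zero importance: feature 1 is never used
```

A feature that a tree never uses leaves the predictions unchanged under permutation, so its importance is exactly zero, in line with the remark following (3).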
5 Result analysis

5.1 Studied scenarios

The proposed study includes developing classifiers for the complete HQ system for 1 and 2 s early termination. An attempt is also made to improve these results by developing separate classifiers according to the geographic area where the fault occurred. This led to four different area-wise DTs as follows, each for T = 1 s and T = 2 s:

1. BJN: faults in the northern James Bay area (areas 5 and 6 in Fig. 2).
2. BJS: faults in the southern James Bay area (area 4).
3. CHU: faults in the Churchill Falls area (area 8).
4. MQ: faults in the Manic-Québec area (area 7).

To assess the predictors proposed in this paper, we use the open-source software R [19], which includes implementations of conventional DTs and RFs. The evaluation setup is summarised in Table 1. Scenario S3 uses the flat data file, whereas scenarios S1 and S2 use extended data files for training. In scenarios S1 and S2, the S3 data set is extended with three additional copies of the unstable cases (for HQ_1s, 55 196 + 3 × 12 743 = 93 425). Basically, we consider the performance of a single DT with flat file training against extended file training. We also compare the RF against a single tree model, both trained on the extended data file. The DT and RF inducers are configured to randomly select only 70% of the assumed data file to build the model.

For all RF models, a limit of 100 trees was set with ntry = 10, initially to avoid memory overflow. However, as shown in Fig. 5, this number is actually enough because the OOB error starts to stabilise around 50 trees. The training of scenario S1 with the HQ systems data took 2 min on a 2 GHz Centrino 2 laptop with 4 GB memory.

Fig. 5 shows that the conventional training approach (S3) results in an OOB error three times larger than the proposed extended data file-based training. Moreover, in scenario S1 (extended training file), the OOB misclassification error is much greater for the stable than for the unstable instances. In sharp contrast, the situation is reversed for the conventional training scenario S3 (flat training file). Therefore the proposed extension of the data set, by replicating the unstable cases three to four times, results in a drastically more accurate prediction of the unstable cases.

5.2 Correlation and importance of features
The accuracy of the classification is strongly dependent on the quality of the attributes describing the security concept. Fig. 6a shows the correlation between the decision (OK) and the attributes for HQ_1s, obtained from the RFs during training. Highly correlated variables are close together and presented in the same colour: frequency-domain features on one side and voltage features on the other side, with TDEF in the middle. We interpret the degree of any correlation by both the shape and colour of the graphic elements [33]. Any variable is, of course, perfectly correlated with itself, and this is reflected by the lines on the diagonal of the graphic. Where the graphic element is a perfect circle, there is no correlation between the variables, as is the case for PostFaultAngle.3, PostFaultAngle.9 and PostFaultAngle.7. The colours used to shade the circles give another clue to the strength of the correlation. The intensity of the colour is maximal for a perfect correlation and minimal (white) if there is no correlation. Shades of red are used for negative correlations and blue for positive correlations. What this result underscores is the high relevance of energy-based features like WASI for explaining the security concept based on PMU measurements. FastWASI1s.7 is the variable most positively correlated to OK, while Vmin1s is the most negatively correlated to OK (largest ellipses in both cases). This complementary behaviour is highlighted by the importance analysis results from the RF learning in Fig. 6b.

Figure 5 Convergence characteristics of the RF learning: training of the 1 s response time. Scenarios S3 (left) and S1 (right) for HQ systems

Figure 6 Correlation and importance of features (HQ_1s)
a Visual summary of correlations between the 23 candidate attributes and the decision variable OK: Scenario S1 with combined HQ_1s systems
b Top-down importance of the variables according to the accuracy loss or misclassification rate reduction (gini) when they are, respectively, removed or included in the attributes set. RF learning for Scenario S1, HQ_1s system

Figure 7 Correlation and importance of features (HQ_2s)
a Visual summary of correlations between the 23 candidate attributes and the decision variable OK: Scenario S1 with combined HQ_2s systems
b Top-down importance of the variables according to the accuracy loss or misclassification rate reduction (gini) when they are, respectively, removed or included in the attributes set. RF learning for Scenario S1, HQ_2s system
The importance analysis ranks the variables according to the accuracy loss incurred when they are removed from the attribute set and according to the reduction in misclassification rate (Gini) obtained when they are included. As shown in Fig. 6b, for the HQ_1s system there is an accuracy loss when Vmin1s.4, Vmin1s.5, PostAngle.4, PostAngle.5, etc. are removed, in that order. Similarly, the misclassification error rate is reduced when Vmin1sR, Vmin1s, Vmin1s.7, FastWASI1s.7, Vmin1s.5, etc. are included, in that order. Similar observations are made for the correlation and top-down importance of HQ_2s, as shown in Figs. 7a and b, respectively. FASTWASI2s is the most positively correlated and Vmin2s the most negatively correlated with OK in the HQ_2s system. In this case, there is an accuracy loss if VlowPass2s.4, PostFltAngle.3, TDEF, etc. are removed, and the misclassification rate is reduced if FastWasi2s, FastWasi.2s, Vmin2s.4, etc. are included. The above analysis shows how the variables take part in the decision-making process.

Table 2 Reliability, security and accuracy for flat data sets with single DT
Early termination   Reliability, %   Security, %   Accuracy, %
System-wise
HQ_1s               78.46            97.59         93.17
HQ_2s               76.70            96.87         92.21
Area-wise_1s
BJN_1s              85.17            98.19         95.83
BJS_1s              89.78            97.78         96.51
CHU_1s              91.29            99.81         98.58
MQ_1s               75.79            94.03         87.43
Area-wise_2s
BJN_2s              93.36            98.02         97.18
BJS_2s              88.57            98.65         97.05
CHU_2s              96.96            99.74         99.34
MQ_2s               79.07            91.81         87.15

Table 3 Reliability, security and accuracy for extended data sets with single DT
Early termination   Reliability, %   Security, %   Accuracy, %
System-wise
HQ_1s               89.90            90.20         90.13
HQ_2s               93.78            87.34         90.13
Area-wise_1s
BJN_1s              96.16            93.51         93.99
BJS_1s              96.07            94.01         94.33
CHU_1s              98.40            98.43         98.43
MQ_1s               97.09            74.93         82.95
Area-wise_2s
BJN_2s              97.09            96.27         96.42
BJS_2s              96.53            96.51         96.51
CHU_2s              99.04            98.89         98.91
MQ_2s               98.48            68.12         79.10

Table 4 Reliability, security and accuracy for extended data sets with RF
Early termination   Reliability, %   Security, %   Accuracy, %
System-wise
HQ_1s               99.93            98.92         99.16
HQ_2s               99.96            99.01         99.23
Area-wise_1s
BJN_1s              99.84            99.32         99.47
BJS_1s              99.92            99.44         99.51
CHU_1s              100              99.91         99.93
MQ_1s               99.94            97.62         98.46
Area-wise_2s
BJN_2s              99.89            99.57         99.63
BJS_2s              99.92            99.76         99.79
CHU_2s              100              99.94         99.95
MQ_2s               99.97            97.75         98.55
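The Gini-based ranking mentioned in the importance analysis above rests on the impurity decrease produced by a split on a given variable. The sketch below shows that quantity for a single threshold split. It is a hedged illustration only: the helper names `gini` and `gini_decrease`, the features `feature_x` and `feature_y`, the thresholds and the six toy records are all assumptions made for this example and do not come from the paper's database or software.

```python
from collections import Counter


def gini(labels):
    """Gini impurity 1 - sum(p_k^2) of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def gini_decrease(labels, feature, threshold):
    """Impurity decrease obtained by splitting the records on feature <= threshold."""
    left = [y for x, y in zip(feature, labels) if x <= threshold]
    right = [y for x, y in zip(feature, labels) if x > threshold]
    n = len(labels)
    weighted = (len(left) / n) * gini(left) + (len(right) / n) * gini(right)
    return gini(labels) - weighted


# Hypothetical toy records (assumption): four stable and two unstable cases.
labels = ["stable", "stable", "stable", "unstable", "unstable", "stable"]
# Hypothetical feature that separates the two classes perfectly at 0.8.
feature_x = [0.95, 0.92, 0.97, 0.60, 0.70, 0.91]
# Hypothetical feature whose split at 0.5 leaves both children as mixed as the parent.
feature_y = [0.1, 0.9, 0.2, 0.8, 0.3, 0.7]

gain_x = gini_decrease(labels, feature_x, 0.8)
gain_y = gini_decrease(labels, feature_y, 0.5)
print(f"Gini decrease for feature_x: {gain_x:.4f}")
print(f"Gini decrease for feature_y: {gain_y:.4f}")
```

Random-forest implementations typically accumulate this decrease over every split that uses a variable and average it over the trees, so a variable with a consistently large decrease ranks high in the Gini importance list.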
IET Gener. Transm. Distrib., 2010, Vol. 4, Iss. 12, pp. 1334 – 1348 doi: 10.1049/iet-gtd.2010.0201
5.3 Performance assessment
In assessing the performance of the classifiers over the full data set, combining both the training and testing subsets, various statistical indices are defined as follows [3, 8]:
1. Reliability: (total number of unstable cases – number of unstable cases classified as stable)/total number of unstable cases.
2. Security: (total number of stable cases – number of stable cases classified as unstable)/total number of stable cases.
3. Accuracy: (total number of cases – number of misclassifications)/total number of cases.
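The three indices defined above can be computed directly from four counts. The following is a minimal sketch under stated assumptions: the function name `performance_indices`, its argument names and the example counts are hypothetical and chosen only to make the arithmetic easy to follow; they are not figures from the paper.

```python
def performance_indices(n_unstable, n_unstable_as_stable, n_stable, n_stable_as_unstable):
    """Return (reliability, security, accuracy) in percent.

    n_unstable            total number of unstable cases
    n_unstable_as_stable  unstable cases wrongly classified as stable
    n_stable              total number of stable cases
    n_stable_as_unstable  stable cases wrongly classified as unstable
    """
    if n_unstable <= 0 or n_stable <= 0:
        raise ValueError("both classes must contain at least one case")
    total = n_unstable + n_stable
    misclassified = n_unstable_as_stable + n_stable_as_unstable
    reliability = 100.0 * (n_unstable - n_unstable_as_stable) / n_unstable
    security = 100.0 * (n_stable - n_stable_as_unstable) / n_stable
    accuracy = 100.0 * (total - misclassified) / total
    return reliability, security, accuracy


# Hypothetical counts (assumption): 1000 unstable cases with 1 missed,
# 3000 stable cases with 30 false alarms.
reliability, security, accuracy = performance_indices(1000, 1, 3000, 30)
print(f"Reliability {reliability:.2f}%, security {security:.2f}%, accuracy {accuracy:.3f}%")
```

With these definitions, reliability penalises missed unstable cases only, security penalises false alarms only, and accuracy weights both kinds of error by the class sizes.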
Tables 2–4 give the performance results for scenarios S1, S2 and S3, respectively, for the system-wise and area-wise studies. Table 2 provides the performance results for the flat data file with a single DT. In Table 2, the reliability for HQ_1s and HQ_2s is 78.46 and 76.70%, respectively, whereas the accuracy and security are above 92.0 and 96.0%, respectively. For the area-wise analysis, the reliability is more than 85% in BJN, BJS and CHU for both 1 and 2 s early termination. However, the reliability for MQ is less than 80% for both 1 and 2 s early termination. Thus, the reliability stays low compared with security and accuracy for flat data sets. To improve on this, the single DT has been trained and tested with the extended data file. Table 3 provides the results for extended data sets with a single DT. The reliability improves to 89.90 and 93.78% for HQ_1s and HQ_2s, respectively.
Figure 8 Comparison results (HQ)
a Comparison results between single flat, single extended and RF extended with respect to reliability, security and accuracy for 1 s early termination (HQ_1s)
b Comparison results between single flat, single extended and RF extended with respect to reliability, security and accuracy for 2 s early termination (HQ_2s)
Although there is a small sacrifice in security (90–97%) and accuracy (90–93%), reliability is the more important measure in stability assessment. There is also a substantial jump in reliability for BJN, BJS, CHU and MQ, which exceeds 96% for both 1 and 2 s early termination.

Table 4 shows the most encouraging results, system-wise and area-wise, for rapid stability assessment. In this case, the extended data file is used to train the RF, in which a large number of trees is grown instead of a single DT. The reliability becomes 99.93 and 99.96% for HQ_1s and HQ_2s, respectively. Similar observations are made for the area-wise study: the reliability is more than 99.8% in all cases and at least 99.9% in every area except BJN, for both 1 and 2 s early termination. There is also a substantial increase in security and accuracy in the system-wise and area-wise studies. The security and accuracy are around 99% for HQ_1s and HQ_2s system-wise as well as area-wise (BJN_1s and 2s, BJS_1s and 2s). Only for MQ are the security and accuracy lower, at around 97.5 and 98%, respectively.

Fig. 8a shows a one-to-one comparison of reliability, security and accuracy for flat data sets and extended data sets with a single DT and with RF for HQ_1s system-wise. All three performance measures rise to around 99% with RF compared with the other two cases. The jump in reliability is about 21 percentage points compared with the flat data set with a single DT. Similar observations are made for HQ_2s system-wise, as shown in Fig. 8b. The comparison results for MQ_1s, the most critical of all areas, are shown in Fig. 9a: there is a jump of about 24 percentage points in reliability compared with flat data sets with a single DT, and a substantial improvement in security and accuracy. Similar observations are made for MQ_2s (Fig. 9b) for all three performance measures.

Figure 9 Comparison results (MQ)
a Comparison results between single flat, single extended and RF extended with respect to reliability, security and accuracy for 1 s early termination (MQ_1s)
b Comparison results between single flat, single extended and RF extended with respect to reliability, security and accuracy for 2 s early termination (MQ_2s)

It is observed in the above study that the reliability achieved using RF is 99.9% (‘3 nines’), which a single DT does not reach. Our analysis of recent DT applications to DSA indicates 80–85% reliability [22] in system-wise studies, which is definitely unsatisfactory, given that modern electric power systems are normally designed and operated to meet a ‘3 nines’ reliability standard [34, p. 7]. Even though a DT predictor is more easily interpretable and transparent for the user than an RF, its reliability and security measures are too low to be accepted for critical issues such as DSA. The RF not only provides a large jump in reliability, but also brings security and accuracy to around 99%, for system-wise and area-wise security assessment with 1 and 2 s early termination.

One question remains: does the RF need to be retrained following routine changes in the network state? The answer is yes, to some extent. Although the RF is robust over a wide range of system conditions and was trained to capture the ‘essential’ concept of system security, like any inductive knowledge it comes with a guarantee limited to network states that result in dynamics ‘similar’ to those included in the learning database. Given the improved robustness and reliability of the new predictor, such a retraining function could be called upon infrequently, on a yearly, monthly or daily basis, using forecast scenarios with some uncertainties. An alternative view could be to execute this retraining functionality in real time, at the speed of SCADA information.
RF training is inherently fast (only 2 min of CPU time on a laptop), and the database update (when required) could take advantage of the computational facilities being deployed for fast real-time simulation and modelling [35].
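The reliability gains quoted in the comparison of Figs. 8 and 9 can be checked directly from the reliability columns of Tables 2–4. The sketch below does this arithmetic for the HQ and MQ cases. The reliability values are those reported in the tables; the dictionary keys `flat_dt`, `extended_dt` and `extended_rf`, the helper names and the 99.9% threshold constant are labels introduced here for illustration only.

```python
# Reliability (%) as reported in Tables 2-4; the scenario keys are hypothetical labels.
RELIABILITY = {
    "HQ_1s": {"flat_dt": 78.46, "extended_dt": 89.90, "extended_rf": 99.93},
    "HQ_2s": {"flat_dt": 76.70, "extended_dt": 93.78, "extended_rf": 99.96},
    "MQ_1s": {"flat_dt": 75.79, "extended_dt": 97.09, "extended_rf": 99.94},
    "MQ_2s": {"flat_dt": 79.07, "extended_dt": 98.48, "extended_rf": 99.97},
}

THREE_NINES = 99.9  # the '3 nines' reliability level, in percent


def gain_points(case, baseline="flat_dt", candidate="extended_rf"):
    """Reliability gain in percentage points between two scenarios."""
    row = RELIABILITY[case]
    return round(row[candidate] - row[baseline], 2)


def meets_three_nines(case, scenario):
    """True if the reported reliability reaches the '3 nines' level."""
    return RELIABILITY[case][scenario] >= THREE_NINES


for case in RELIABILITY:
    print(
        f"{case}: RF gain over flat single DT = {gain_points(case):.2f} points, "
        f"3 nines with RF: {meets_three_nines(case, 'extended_rf')}"
    )
```

Expressing the gains in percentage points avoids ambiguity with relative improvement, which would be larger (for example, 21.47 points on a 78.46% baseline is a relative gain of roughly 27%).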
6
Conclusions
This paper presents an accurate and reliable predictor for early warnings in security assessment using ensemble DT learning of WASI. The WASI features are used as inputs to train the RFs to predict the grid stability status effectively. In the proposed study, both system-wise and area-wise classifiers achieved 99.9% reliability for 1 or 2 s response-time decision making, ensuring the reliability of security assessment for early warnings in the operational time frame. Similarly, substantial improvements have been observed in the security and accuracy (around 99%) of both system-wise and area-wise predictors. The most important observation is the effect of the extended data set with RF, which drives the predictors to substantially improved regions for all three performance measures.
7
References
[1] TRUDEL G., BERNARD S., SCOTT G.: ‘Hydro-Québec’s defense plan against extreme contingencies’, IEEE Trans. Power Syst., 1993, PWRS-8, (2), pp. 445–451
[2] MEI K., ROVNYAK S.M.: ‘Response-based decision trees to trigger one-shot stabilizing control’, IEEE Trans. Power Syst., 2004, PWRS-19, (1), pp. 531–537
[3] KAMWA I., GRONDIN R., LOUD L.: ‘Time-varying contingency screening for dynamic security assessment using intelligent-systems techniques’, IEEE Trans. Power Syst., 2001, PWRS-16, (3), pp. 526–536
[4] ERNST D., RUIZ-VEGA D., PAVELLA M., HIRSCH P., SOBAJIC D.: ‘A unified approach to transient stability contingency filtering, ranking and assessment’, IEEE Trans. Power Syst., 2001, PWRS-16, (3), pp. 435–443
[5] MANSOUR Y., VAAHEDI E., EL-SHARKAWI A.: ‘Dynamic security contingency screening and ranking using neural networks’, IEEE Trans. Neural Netw., 1997, 8, (15), pp. 942–950
[6] CIGRE Technical Brochure: ‘Review of on-line power system security assessment tools & techniques’, January 2007 (K. Morison, Convener)
[7] EJEBE G.C., JING C., WAIGHT J.G., ET AL.: ‘On-line dynamic security assessment: transient energy based screening and monitoring for stability limits’. Presented at the 1997 IEEE/PES Summer Meeting, Berlin, Germany
[8] CHIANG H.D., WANG C.S., LI H.: ‘Development of BCU classifiers for on-line dynamic contingency screening of electric power systems’, IEEE Trans. Power Syst., 1999, PWRS-14, (2), pp. 660–666
[9] FU C., BOSE A.: ‘Contingency ranking based on severity indices in dynamic security analysis’, IEEE Trans. Power Syst., 1999, PWRS-14, (3), pp. 980–986
[10] BRANDWAJN V., KUMAR A.B.R., IPAKCHI A., BOSE A., KUO S.D.: ‘Severity indices for contingency screening in dynamic security assessment’, IEEE Trans. Power Syst., 1997, PWRS-12, (3), pp. 1136–1142
[11] SCHAINKER R., MILLER P., DOUBLEDAY W., HIRSCH P., GUORUI Z.: ‘Real-time dynamic security assessment: fast simulation and modeling applied to emergency outage security of the electric grid’, IEEE Power Energy Mag., 2006, 4, (2), pp. 51–58
[12] SUN K., LIKHATE S., VITTAL V., KOLLURI V.S., MANDAL S.: ‘An online dynamic security assessment scheme using phasor measurements and decision trees’, IEEE Trans. Power Syst., 2007, PWRS-22, (4), pp. 1935–1943
[13] KAMWA I., GRONDIN R.: ‘PMU configuration for system dynamic performance measurement in large multi-area power systems’, IEEE Trans. Power Syst., 2002, PWRS-17, (2), pp. 285–394
[14] KAMWA I., BÉLAND J., TRUDEL G., GRONDIN R., LAFOND C., MCNABB D.: ‘Wide-area monitoring and control at Hydro-Québec: past, present and future’. Panel Session on PMU Prospective Applications, 2006 IEEE/PES General Meeting, Montreal, QC, Canada, 18–22 June 2006
[15] NAWAB S.H., QUATIERI T.E.: ‘Short-time Fourier transform’, in LIM J.S., OPPENHEIM A.V. (EDS.): ‘Advanced topics in signal processing’ (Prentice-Hall, Englewood Cliffs, NJ, 1988)
[16] OSTOJIC D.R.: ‘Spectral monitoring of power system dynamic performances’, IEEE Trans. Power Syst., 1993, PWRS-8, (2), pp. 445–451
[17] KAMWA I., SAMANTARAY S.R., JOOS G.: ‘Development of rule-based classifiers for rapid stability assessment of wide-area post-disturbance records’, IEEE Trans. Power Syst., 2009, 24, (1), pp. 258–270
[18] BREIMAN L.: ‘Random forests’, Mach. Learn., 2001, 45, pp. 5–32, http://www.stat.berkeley.edu/users/breiman/RandomForests/
[19] LIAW A., WIENER M.: ‘Classification and regression by random forest in R’, R News, 2002, 2, (3), pp. 18–22, http://www.r-project.org/
[20] HASTIE T., TIBSHIRANI R., FRIEDMAN J.: ‘The elements of statistical learning’ (Springer-Verlag, New York, 2009, 2nd edn.), 745 pp
[21] SIROKY D.S.: ‘Navigating random forests and related advances in algorithmic modeling’, Stat. Rev., 2009, 3, pp. 147–163, http://projecteuclid.org/DPubS?service=UI&version=1.0&verb=Display&handle=euclid.ssu
[22] BANFIELD R.E., HALL L.O., BOWYER K.W., KEGELMEYER W.P.: ‘A comparison of decision tree ensemble creation techniques’, IEEE Trans. Pattern Anal. Mach. Intell., 2007, 29, (1), pp. 173–180
[23] PROVOST F.J., DOMINGOS P.: ‘Tree induction for probability-based ranking’, Mach. Learn., 2003, 52, (3), pp. 199–215
[24] TRUDEL G., GINGRAS J.-P., PIERRE J.-R.: ‘Designing a reliable power system: the Hydro-Québec’s integrated approach’, IEEE Proc. Spec. Issue Energy Infrastruct. Def. Syst., 2005, 93, (5), pp. 907–917
[25] KAMWA I., PRADHAN A.K., JOOS G., SAMANTARAY S.R.: ‘Fuzzy partitioning of a real power system for dynamic vulnerability assessment’, IEEE Trans. Power Syst., 2009, 24, (3), pp. 1–10
[26] KOLLURI V., MANDAL S., VAIMAN M.Y., VAIMAN M.M., LEE S., HIRSCH P.: ‘Fast fault screening approach to assessing transient stability in Entergy’s power system’. 2007 IEEE/PES General Meeting, Tampa, FL, USA, 24–27 June 2007
[27] LIU C.W., SU M.C., TSAY S.S., WANG Y.J.: ‘Application of a novel fuzzy neural network to real-time transient stability swings prediction based on synchronized phasor measurements’, IEEE Trans. Power Syst., 1999, PWRS-14, (2), pp. 685–692
[28] CIGRE Technical Brochure: ‘Wide Area Monitoring and Control for Transmission Capability Enhancement, WG C4.6.01’, January 2007 (C. Rehtanz, Convener)
[29] KAMWA I., GRONDIN R., DICKINSON E.J., FORTIN S.: ‘A minimal realization approach to reduced-order modelling and modal analysis for power system response signals’, IEEE Trans. Power Syst., 1993, PWRS-8, (3), pp. 1020–1029
[30] HAUER J.F.: ‘Application of Prony analysis to the determination of modal content and equivalent models for measured power system response’, IEEE Trans. Power Syst., 1991, 6, (3), pp. 1062–1068
[31] http://en.wikipedia.org/wiki/Box_plot
[32] LIANG H., ZHANG H., YAN Y.: ‘Decision trees for probability estimation: an empirical study’. Proc. 18th IEEE Int. Conf. on Tools with Artificial Intelligence (ICTAI’06), 2006, pp. 1–9
[33] ALCHEIKH-HAMOUD K., HADJSAID N., BÉSANGER Y., ROGNON J.P.: ‘Decision tree based filter for control area external contingencies screening’. 2009 Bucharest PowerTech, Bucharest, Romania, 2009
[34] Electric Advisory Committee: ‘Smart grid: enabler of the new energy economy’. US Department of Energy, December 2008, http://www.oe.energy.gov/DocumentsandMedia/final-smart-gridreport
[35] MOSLEHI K., KUMAR A.B.R., DEHDASHTI E., HIRSCH P., WU W.: ‘Distributed autonomous real-time system for power system operations – a conceptual overview’. IEEE Power Systems Conf. and Exposition, New York, 10–13 October 2004