G. J. Tsekouras, F. D. Kanellos, V. T. Kontargyri, I. S. Karanasiou, A. D. Salis, N. E. Mastorakis
WSEAS TRANSACTIONS on CIRCUITS and SYSTEMS
ISSN: 1109-2734, Issue 12, Volume 7, December 2008

A New Classification Pattern Recognition Methodology for Power System Typical Load Profiles

G. J. TSEKOURAS1,2, F. D. KANELLOS1, V. T. KONTARGYRI1, I. S. KARANASIOU1, A. D. SALIS1, N. E. MASTORAKIS2
1 School of Electrical and Computer Engineering, National Technical University of Athens, 9 Heroon Polytechniou Street, Zografou, Athens, GREECE
2 Department of Electrical Engineering & Computer Science, Hellenic Naval Academy, Terma Hatzikyriakou, Piraeus, GREECE

Abstract: - In this paper a new pattern recognition methodology is described for the classification of the daily chronological load curves of power systems, in order to estimate their respective representative daily load profiles, which can be mainly used for load forecasting and feasibility studies of demand side management programs. It is based on pattern recognition methods, such as k-means, adaptive vector quantization, self-organized maps (SOM), fuzzy k-means and hierarchical clustering, which are theoretically described and properly adapted. The parameters of each clustering method are properly selected by an optimization process, which is separately applied for each one of six adequacy measures: the error function, the mean index adequacy, the clustering dispersion indicator, the similarity matrix indicator, the Davies-Bouldin indicator and the ratio of within cluster sum of squares to between cluster variation. This methodology is applied to the Greek power system, from which it is shown that the separation between work days and non-work days for each season is not descriptive enough.

Key-Words: - Load profiles, clustering algorithms, adaptive vector quantization, fuzzy k-means, hierarchical clustering, k-means, self-organized maps, pattern recognition, adequacy measures

1 Introduction
In a deregulated electricity market, load profiles categorization can be useful to the power systems and their customers. The former can use it for load leveling, demand side management, fluctuation smoothing and load forecasting. It is an alternative solution for various problems of the conventional electrical power transmission and distribution systems, which can reduce the respective operational cost while providing their customers with satisfactory services at low cost, particularly for special cases such as the supply of discontinuous villages or autonomous islands [1], emergency power sources [2], etc. The customers can participate in the competitive electricity market using a demand side bidding mechanism [3] based on powerful technologies of the energy storage systems [4-5]. In order to carry out this classification, the chronological load curves per year, season or month can be used.

During the last years, a significant research effort has been devoted to load curves classification, in order to solve the short-term load forecasting of anomalous days [6-7] and to cluster the customers of the power systems [8-13]. The clustering methods used so far are the self-organizing map [6-8], the “modified follow the leader” [8], the k-means [8], the fuzzy k-means [8-10] and the average and Ward hierarchical methods [8-10]. These methods generally belong to pattern recognition techniques [11]. Alternatively, the customers classification problem can be solved by using data mining [12], wavelet packet transformation [13], frequency-domain data [14], stratified sampling [15] etc. For the reduction of the size of the clustering input data set, Sammon map, principal component analysis and curvilinear component analysis have been proposed [8]. The most commonly used respective adequacy measures are the mean index adequacy [9], the clustering dispersion indicator [8-9], the similarity matrix indicator [9], the Davies-Bouldin indicator [7-9], the modified Dunn index [8], the scatter index [8] and the mean square error [10].

The objective of this paper is to present a new methodology for the classification of the daily chronological load curves for power systems. Specifically, the respective load curves set is organized into well-defined and separate classes, in order to successfully describe the demand behavior of a power system. The proposed methodology compares the results obtained by certain clustering techniques (k-means with special weights initialization, adaptive vector quantization (AVQ), mono-dimensional and bi-dimensional self-organizing maps (SOM), fuzzy k-means and seven hierarchical agglomerative clustering methods) using six adequacy measures (mean square error, mean index adequacy, clustering dispersion indicator, similarity matrix indicator, Davies-Bouldin indicator, ratio of within cluster sum of squares to between cluster variation). The basic aspects of this methodology are:
• the estimation of the typical days through the study period and the respective representative typical daily load profiles for the power system;
• the modification of the clustering techniques for this kind of classification problem, such as the appropriate weights initialization for the k-means and fuzzy k-means;
• the proper parameters calibration, such as the training rate of adaptive vector quantization, in order to fit the classification needs;
• the comparison of the clustering algorithms performance for each one of the adequacy measures;
• the selection of the proper adequacy measures.

Finally, the results of the application of the developed methodology are thoroughly presented for the Greek power system for the summer of the year 2000, while the respective ones are collectively registered for the seasons (summer-winter) and years of the time period 1985-2000. It is mentioned that the proposed methodology is applicable to any power system, leading to reliable results.
2 Proposed Pattern Recognition Methodology for the Classification of Load Curves of Power System
The classification of daily chronological load curves of a power system is achieved by applying the pattern recognition methodology, as shown in Fig. 1. The main steps are the following:
• Data and features selection: The active and reactive energy values are registered (in MWh and Mvarh) for each time period in steps of 1 hour. The daily chronological load curves are determined for the study period.
• Data preprocessing: The load diagrams are examined for normality, in order to modify or delete the values that are obviously wrong (noise suppression). If it is necessary, a preliminary execution of a pattern recognition algorithm is carried out, in order to track bad measurements or network faults, which, if left uncorrected, will reduce the number of the useful typical days for a constant number of clusters.
• Main application of pattern recognition methods: For the load diagrams, a number of clustering algorithms (k-means, fuzzy k-means, adaptive vector quantization, self-organized map and hierarchical clustering) is applied. Each algorithm is trained for the set of load diagrams and evaluated according to six adequacy measures. The parameters of the algorithms are optimized, if necessary. The developed methodology uses the clustering methods that provide the most satisfactory results.

[Figure 1 flow: Data selection → Data preprocessing → Main application of pattern recognition methods (for each one of the next algorithms: k-means, fuzzy k-means, 7 hierarchical agglomerative ones: training process, parameters' optimization, evaluation process) → Typical load diagrams & respective day classes → Feasibility studies of DSM programs; load estimation after DSM application; short- & mid-term load forecasting, etc.]

Fig. 1. Flow diagram of pattern recognition methodology for the classification of daily chronological load curves of power system

The results of the developed methodology can be used for power system short-term and mid-term load forecasting, energy trades, techno-economic studies of the energy efficiency and demand side management programs and the respective load estimation after the application of these programs.
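The data selection and preprocessing steps above, together with the normalization by the highest and lowest values of the whole input set that is stated in Section 3.1, can be illustrated with a minimal sketch. The function names `daily_curves` and `normalize` and the toy hourly series are assumptions made for illustration only; they are not taken from the paper.

```python
from typing import List


def daily_curves(hourly: List[float], hours_per_day: int = 24) -> List[List[float]]:
    """Split a flat hourly series into daily load curves (one 24-value vector per day)."""
    if len(hourly) % hours_per_day != 0:
        raise ValueError("series length must be a whole number of days")
    return [hourly[i:i + hours_per_day] for i in range(0, len(hourly), hours_per_day)]


def normalize(curves: List[List[float]]) -> List[List[float]]:
    """Min-max scale every element with the global minimum and maximum of the whole set."""
    lo = min(min(c) for c in curves)
    hi = max(max(c) for c in curves)
    if hi == lo:
        raise ValueError("constant data cannot be normalized")
    return [[(v - lo) / (hi - lo) for v in c] for c in curves]


# Hypothetical toy data: two days of hourly values (not measurements from the paper).
hourly = [float(3000 + 100 * (h % 24) + 50 * (h // 24)) for h in range(48)]
curves = normalize(daily_curves(hourly))
```

Using one global minimum and maximum, rather than per-day scaling, keeps the relative level of each day, so that high-demand and low-demand days remain distinguishable after normalization.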
3 Mathematical Modeling of Clustering Methods and Clustering Validity Assessment

3.1 General
In the case study of the chronological typical load curves of a power system a number of N analytical daily load curves is given. The main target is to determine the respective sets of days and load patterns. Generally N is defined as the population of the input vectors, which are going to be clustered. The $\vec{x}_l = (x_{l1}, x_{l2}, \ldots, x_{li}, \ldots, x_{ld})^T$ symbolizes the l-th input vector and d its dimension, which equals to 24 (the load measurements are taken every hour). The corresponding set is given by $X = \{\vec{x}_l : l = 1, \ldots, N\}$. It is worth mentioning that $x_{li}$ are normalized using the higher and lower values of all elements of the original input patterns set, in order to have better results from the application of clustering methods.

Each classification process makes a partition of the initial N input vectors to M clusters. The j-th cluster has a representative, which is the respective load profile and is represented by the vector $\vec{w}_j = (w_{j1}, w_{j2}, \ldots, w_{ji}, \ldots, w_{jd})^T$ of d dimension. The vector $\vec{w}_j$ expresses the cluster center. The corresponding set is the classes set, which is defined by $W = \{\vec{w}_k, k = 1, \ldots, M\}$. The subset of input vectors $\vec{x}_l$, which belong to the j-th cluster, is $\Omega_j$ and the respective population of load diagrams is $N_j$. For the study and evaluation of classification algorithms the following distance forms are defined:

a. the Euclidean distance between $l_1$, $l_2$ input vectors of the set X:

$$d(\vec{x}_{l_1}, \vec{x}_{l_2}) = \sqrt{\frac{1}{d} \sum_{i=1}^{d} \left( x_{l_1 i} - x_{l_2 i} \right)^2} \tag{1}$$

b. the distance between the representative vector $\vec{w}_j$ of j-th cluster and the subset $\Omega_j$, calculated as the geometric mean of the Euclidean distances between $\vec{w}_j$ and each member of $\Omega_j$:

$$d(\vec{w}_j, \Omega_j) = \sqrt{\frac{1}{N_j} \sum_{\vec{x}_l \in \Omega_j} d^2(\vec{w}_j, \vec{x}_l)} \tag{2}$$

c. the infra-set mean distance of a set, defined as the geometric mean of the inter-distances between the members of the set, i.e. for the subset $\Omega_j$:

$$\hat{d}(\Omega_j) = \sqrt{\frac{1}{2 N_j} \sum_{\vec{x}_l \in \Omega_j} d^2(\vec{x}_l, \Omega_j)} \tag{3}$$

The basic characteristics of the clustering methods being used are the following.

3.2 K-means model
The k-means clustering method groups the set of the N input vectors to M clusters using an iterative procedure. Initially the weights of the M clusters are determined. In the classical model a random choice among the input vectors is used [8], while in the developed algorithm the $w_{ji}$ of the j-th center is initialized as:

$$w_{ji}^{(0)} = a + b \cdot (j - 1) / (M - 1) \tag{4}$$

where a and b are properly calibrated parameters. During epoch t for each training vector $\vec{x}_l$ its Euclidean distances $d(\vec{x}_l, \vec{w}_j)$ are calculated for all centers. The l-th input vector is put in the set $\Omega_j^{(t)}$, for which the distance between $\vec{x}_l$ and the respective center is minimum. When the entire training set is formed, the new weights of each center are calculated as:

$$\vec{w}_j^{(t+1)} = \frac{1}{N_j^{(t)}} \sum_{\vec{x}_l \in \Omega_j^{(t)}} \vec{x}_l \tag{5}$$

where $N_j^{(t)}$ is the population of the respective set $\Omega_j^{(t)}$ during epoch t. This process is repeated until the maximum number of iterations is used or the variation of the weights is not significant. The algorithm's main purpose is to minimize the error function:

$$J = \frac{1}{N} \sum_{l=1}^{N} d^2\left(\vec{x}_l, \vec{w}_{k : \vec{x}_l \in \Omega_k}\right) \tag{6}$$

The main difference compared to the classical model is that the process is repeated for different pairs of (a,b). The best results for each adequacy measure are recorded for different pairs (a,b).

3.3 Fuzzy k-means
Each input vector $\vec{x}_l$ does not belong to only one cluster, but it participates to every j-th cluster by a membership factor $u_{lj}$, where:

$$\sum_{j=1}^{M} u_{lj} = 1, \quad u_{lj} : 0 \le u_{lj} \le 1, \ \forall j \tag{7}$$

Theoretically, the membership factor gives more flexibility in the vector's distribution. During the iterations the following objective function is minimized:

$$J_{fuzzy} = \frac{1}{N} \sum_{j=1}^{M} \sum_{l=1}^{N} u_{lj} \cdot d^2(\vec{x}_l, \vec{w}_j) \tag{8}$$

The membership factors and the cluster centers are calculated in each epoch as:

$$u_{lj}^{(t+1)} = \frac{1}{\sum_{k=1}^{M} \dfrac{d\left(\vec{x}_l, \vec{w}_j^{(t)}\right)}{d\left(\vec{x}_l, \vec{w}_k^{(t)}\right)}} \tag{9}$$

$$\vec{w}_j^{(t+1)} = \frac{\sum_{l=1}^{N} \left(u_{lj}^{(t+1)}\right)^q \cdot \vec{x}_l}{\sum_{l=1}^{N} \left(u_{lj}^{(t+1)}\right)^q} \tag{10}$$

where q is the amount of fuzziness in the range $(1, \infty)$ which increases as fuzziness decreases. The weights of the clusters centers are initialized by (4), which is similar to the developed k-means.

3.4 Adaptive vector quantization
This algorithm is a variation of the k-means method, which belongs to the unsupervised one-layer neural networks. It classifies input vectors into clusters by using a competitive layer with a constant number of neurons. During epoch t each input vector $\vec{x}_l$ is randomly presented and its respective Euclidean
distances from every neuron are calculated. The weights of the winning neuron (with the smallest distance) are updated as:

$$\vec{w}_j^{(t)}(n+1) = \vec{w}_j^{(t)}(n) + \eta(t) \cdot \left( \vec{x}_l - \vec{w}_j^{(t)}(n) \right) \tag{11}$$

where n is the number of input vectors, which have been presented during the current epoch, $w_{ji}^{(0)} = 0.5, \ \forall j, i$ and $\eta(t)$ is the learning rate according to:

$$\eta(t) = \eta_0 \cdot \exp\left(-t / T_{\eta_0}\right) > \eta_{min} \tag{12}$$

where $\eta_0$, $\eta_{min}$ and $T_{\eta_0}$ are the initial value, the minimum value and the time parameter respectively. The remaining neurons are unchangeable for $\vec{x}_l$, as introduced by the Kohonen winner-take-all learning rule [14-15]. This process is repeated until either the maximum number of epochs is reached or the weights converge or the appropriate error function is not improving.

3.5 Hierarchical agglomerative algorithms
Agglomerative algorithms are based on matrix theory [11]. The input is the $N \times N$ dissimilarity matrix $P_0$. At each level t, when two clusters are merged into one, the size of the dissimilarity matrix $P_t$ becomes $(N - t) \times (N - t)$. Matrix $P_t$ is obtained from $P_{t-1}$ by deleting the two rows and columns that correspond to the merged clusters and adding a new row and a new column that contain the distances between the newly formed cluster and the old ones. The distance between the newly formed cluster $C_q$ (the result of merging $C_i$ and $C_j$) and an old cluster $C_s$ is determined as:

$$d(C_q, C_s) = a_i \cdot d(C_i, C_s) + a_j \cdot d(C_j, C_s) + b \cdot d(C_i, C_j) + c \cdot \left| d(C_i, C_s) - d(C_j, C_s) \right| \tag{13}$$

where $a_i$, $a_j$, b and c correspond to different choices of the dissimilarity measure. It is noted that in each level t the respective representative vectors are calculated by (5). The basic algorithms, which are going to be used in our case, are:

• the single link algorithm (SL) - it is obtained from (13) for $a_i = a_j = 0.5$, $b = 0$ and $c = -0.5$:

$$d(C_q, C_s) = \min\left\{ d(C_i, C_s), d(C_j, C_s) \right\} \tag{14}$$

• the complete link algorithm (CL) - it is obtained from (13) for $a_i = a_j = 0.5$, $b = 0$ and $c = 0.5$:

$$d(C_q, C_s) = \max\left\{ d(C_i, C_s), d(C_j, C_s) \right\} \tag{15}$$

• the unweighted pair group method average algorithm (UPGMA):

$$d(C_q, C_s) = \frac{n_i \cdot d(C_i, C_s) + n_j \cdot d(C_j, C_s)}{n_i + n_j} \tag{16}$$

where $n_i$ and $n_j$ are the respective members populations of clusters $C_i$ and $C_j$.

• the weighted pair group method average algorithm (WPGMA):

$$d(C_q, C_s) = 0.5 \cdot \left( d(C_i, C_s) + d(C_j, C_s) \right) \tag{17}$$

• the unweighted pair group method centroid algorithm (UPGMC):

$$d^{(1)}(C_q, C_s) = \frac{n_i \cdot d^{(1)}(C_i, C_s) + n_j \cdot d^{(1)}(C_j, C_s)}{n_i + n_j} - \frac{n_i \cdot n_j \cdot d^{(1)}(C_i, C_j)}{(n_i + n_j)^2} \tag{18}$$

where $d^{(1)}(C_q, C_s) = \left\| \vec{w}_q - \vec{w}_s \right\|_2^2$ and $\vec{w}_q$ is the representative center of the q-th cluster (eq. 5).

• the weighted pair group method centroid algorithm (WPGMC):

$$d^{(1)}(C_q, C_s) = \frac{d^{(1)}(C_i, C_s) + d^{(1)}(C_j, C_s)}{2} - \frac{d^{(1)}(C_i, C_j)}{4} \tag{19}$$

• the Ward or minimum variance algorithm (WARD):

$$d^{(2)}(C_q, C_s) = \frac{(n_i + n_s) \cdot d^{(2)}(C_i, C_s) + (n_j + n_s) \cdot d^{(2)}(C_j, C_s) - n_s \cdot d^{(2)}(C_i, C_j)}{n_i + n_j + n_s} \tag{20}$$

where $d^{(2)}(C_i, C_j) = \dfrac{n_i \cdot n_j}{n_i + n_j} \cdot d^{(1)}(C_i, C_j)$.

3.6 Self-Organized maps
The Kohonen SOM [16-19] is a topologically unsupervised neural network that projects a d-dimensional input data set into a reduced dimension space (usually a mono-dimensional or bi-dimensional map). It is composed of a predefined grid containing $M_1 \times M_2$ d-dimensional neurons $\vec{w}_k$, which are calculated by a competitive learning algorithm that updates not only the weights of the winning neuron, but also the weights of its neighbor units in inverse proportion of their distance. The neighborhood size of each neuron shrinks progressively during the training process, starting with nearly the whole map and ending with the single neuron. The training of SOM is divided into two phases:
• rough ordering, with high initial learning rate, large radius and small number of epochs, so that neurons are arranged into a structure which approximately displays the inherent characteristics of the input data,
• fine tuning, with small initial learning rate, small radius and higher number of training epochs,
in order to tune the final structure of the SOM. The transition from the rough ordering phase to the fine tuning one happens after $T_{s_0}$ epochs.

Once all vectors of neurons $\vec{w}_k$ have been initialized, the SOM training starts by first choosing an input vector $\vec{x}_l$, at t epoch, randomly from the input vectors' set. The Euclidean distances between the n-th presented input pattern $\vec{x}_l$ and all $\vec{w}_k$ are calculated, so as to determine the winning neuron $i'$ that is closest to $\vec{x}_l$. The j-th reference vector is updated according to:

$$\vec{w}_j^{(t)}(n+1) = \vec{w}_j^{(t)}(n) + \eta(t) \cdot h_{i'j}(t) \cdot \left( \vec{x}_l - \vec{w}_j^{(t)}(n) \right) \tag{21}$$

where $\eta(t)$ is the learning rate according to eq. 12. During the rough ordering phase $\eta_r$, $T_{\eta_0}$ are the initial value and the time parameter respectively, while during the fine tuning phase the respective values are $\eta_f$, $T_{\eta_0}$. The $h_{i'j}(t)$ is the neighborhood symmetrical function, that will activate the j neurons that are topologically close to the winning neuron $i'$, according to their geometrical distance, which will learn from the same $\vec{x}_l$. In this case the Gauss function is proposed:

$$h_{i'j}(t) = \exp\left[ -d_{i'j}^2 / \left( 2 \cdot \sigma^2(t) \right) \right] \tag{22}$$

where $d_{i'j} = \left\| \vec{r}_{i'} - \vec{r}_j \right\|$ is the respective distance between $i'$ and j neurons, $\vec{r}_j = (x_j, y_j)$ are the respective co-ordinates in the grid, $\sigma(t) = \sigma_0 \cdot \exp(-t / T_{\sigma_0})$ is the decreasing neighborhood radius function where $\sigma_0$ and $T_{\sigma_0}$ are the respective initial value and time parameter of the radius correspondingly.

The case studies deal with the matters of the shape of the map, the parameters calibration and the weights initialization. Especially, the multiplicative factors φ and ξ are introduced, without decreasing the generalization ability of the parameters calibration:

$$T_{s_0} = \varphi \cdot T_{\eta_0} \tag{23}$$

$$T_{\sigma_0} = \xi \cdot T_{\eta_0} / \ln \sigma_0 \tag{24}$$

The supposed best trained SOM is the one trained with a number of t epochs for which the following index $I_s$ gets the minimum value [7]:

$$I_s(t) = J(t) + ADM(t) + TE(t) \tag{25}$$

where $J(t)$ is the respective quantization error given by eq. (6), $ADM(t)$ is the average distortion measure given by eq. (26) and $TE(t)$ is the topographic error, which measures the distortion of the map as the percentage of input vectors for which the first $i_1'$ and second $i_2'$ winning neuron are not neighboring map units, given by eq. (27):

$$ADM(t) = \sum_{l=1}^{N} \sum_{j=1}^{M} h_{i' \to \vec{x}_l, j}(t) \cdot d^2(\vec{x}_l, \vec{w}_j) \Big/ N \tag{26}$$

$$TE = \sum_{l=1}^{N} neighb(i_1', i_2') \Big/ N \tag{27}$$

It is noted that for each input vector $neighb(i_1', i_2')$ equals to 1, if $i_1'$ and $i_2'$ neurons are not neighbors, otherwise 0.

3.7 Adequacy measures
In order to evaluate the performance of the clustering algorithms and to compare them with each other, six different adequacy measures are applied. Their purpose is to obtain well-separated and compact clusters, in order to make the load curves self explanatory. The definitions of these measures are the following:

1. Mean square error or error function (J) [10] given by eq. 6.

2. Mean index adequacy (MIA) [9], which is defined as the average of the distances between each input vector assigned to the cluster and its center:

$$MIA = \sqrt{\frac{1}{M} \sum_{j=1}^{M} d^2(\vec{w}_j, \Omega_j)} \tag{27}$$

3. Clustering dispersion indicator (CDI) [9], which depends on the mean infra-set distance between the input vectors in the same cluster and inversely on the infra-set distance between the class representative load curves:

$$CDI = \frac{1}{\hat{d}(W)} \sqrt{\frac{1}{M} \sum_{k=1}^{M} \hat{d}^2(\Omega_k)} \tag{28}$$

4. Similarity matrix indicator (SMI) [9], which is defined as the maximum off-diagonal element of the symmetrical similarity matrix, whose terms are calculated by using a logarithmic function of the Euclidean distance between any kind of class representative load curves:

$$SMI = \max_{p > q} \left\{ \left( 1 - \frac{1}{\ln\left[ d(\vec{w}_p, \vec{w}_q) \right]} \right)^{-1} \right\} : p, q = 1, \ldots, M \tag{29}$$

5. Davies-Bouldin indicator (DBI) [20], which represents the system-wide average of the similarity measures of each cluster with its most similar cluster:

$$DBI = \frac{1}{M} \sum_{k=1}^{M} \max_{p \ne q} \left\{ \frac{\hat{d}(\Omega_p) + \hat{d}(\Omega_q)}{d(\vec{w}_p, \vec{w}_q)} \right\} : p, q = 1, \ldots, M \tag{30}$$

6. Ratio of within cluster sum of squares to between cluster variation (WCBCR) [21], which depends on the sum of the distance square between each input vector and its cluster representative vector, as well as the similarity of the clusters centres:

$$WCBCR = \frac{\sum_{k=1}^{M} \sum_{\vec{x}_l \in \Omega_k} d^2(\vec{w}_k, \vec{x}_l)}{\sum_{1 \le q < p}^{M} d^2(\vec{w}_p, \vec{w}_q)} \tag{32}$$

The success of the various algorithms for the same final number of clusters is expressed by having small values of the adequacy measures. By increasing the number of clusters all the measures decrease, except for the similarity matrix indicator. An additional adequacy measure could be the number of the dead clusters, for which the sets are empty. It is intended to minimize this number. It is noted that in eq. (6), (29) - (32), M is the number of the clusters without the dead ones.

4 Application of the Proposed Methodology

4.1 Case study
The developed methodology is applied on the Greek power system, analytically for the summer of the year 2000 and concisely for the period of years 1985-2002 per epoch and per year. The data used are hourly load values for the respective period, which is divided into two epochs: summer (from April to September) and winter (from October to March of the next year). In the case of the summer of the year 2000, the respective set of the daily chronological curves has 183 members, from which none is rejected through data pre-processing. In the next paragraphs the application of each clustering method is analyzed.

4.2 Application of the k-means
The proposed model of the k-means method is executed for different pairs (a,b) from 5 to 25 clusters, where a={0.1,0.11,…,0.45} and a+b={0.54,0.55,…,0.9}. For each number of clusters 1332 different pairs (a,b) are examined (36 values of a times 37 values of a+b). The best results for the six adequacy measures do not refer to the same pair (a,b), as it is presented in Table 1. The alternative model is the classical one with the random choice of the input vectors during the centers' initialization. For the classical k-means model 100 executions are carried out and the best results for each index are registered. The superiority of the proposed model holds for every number of clusters, while a second advantage is the convergence to the same results for the respective pairs (a,b), which cannot be achieved using the classical model.

4.3 Application of the fuzzy k-means
In the fuzzy k-means algorithm the results of the adequacy measures depend on the amount of fuzziness increment. In Fig. 2 SMI and WCBCR adequacy measures are indicatively presented for different number of clusters for three cases of q={2,4,6}. The best results are given by q=4 for J, MIA, CDI and WCBCR adequacy measures, by q=6 for SMI and DBI indicators. It is noted that the initialization of the respective weights is similar to the proposed k-means.

[Figure 2: two panels plotting the indicator against the number of clusters (5 to 25) for q=2, q=4 and q=6. a. SMI indicator (similar to DBI); b. WCBCR indicator (similar to J, MIA & CDI).]

Fig. 2. SMI and WCBCR for the fuzzy k-means method for the set of 183 load curves of the summer of the year 2000 with q=2, 4, 6 for 5 to 25 clusters

4.4 Application of hierarchical agglomerative algorithms
In the case of the seven hierarchical models the best results are given by the WARD model for J, by the UPGMC model for MIA, by the WPGMA model for CDI, by the UPGMC and UPGMA models for SMI, by the UPGMC and WPGMC models for DBI, by the UPGMC, UPGMA, WPGMC and WPGMA models for WCBCR adequacy measure, according to Fig. 3.

4.5 Application of adaptive vector quantization
The initial value $\eta_0$, the minimum value $\eta_{min}$ and the time parameter $T_{\eta_0}$ of learning rate must be properly calibrated. For example in Fig. 4 the sensitivity of the mean index adequacy MIA to the $\eta_0$ and $T_{\eta_0}$ parameters is presented for 90 experiments. The best results of the adequacy measures are not given for
the same $\eta_0$ and $T_{\eta_0}$, according to the results of Table 1. The $\eta_{min}$ value does not practically improve the neural network's behavior assuming that it ranges between $10^{-5}$ and $10^{-6}$.

4.6 Application of self-organized map
Although the SOM algorithm is theoretically well defined, there are several issues that need to be solved for the effective training of SOM. The major problems for the mono-dimensional SOM are:
♦ the proper termination of the SOM's training process, which is solved by minimizing the index $I_s$ (eq. 25).
♦ the proper calibration of (a) the initial value of the neighborhood radius $\sigma_0$, (b) the multiplicative factor φ between $T_{s_0}$ (epochs of the rough ordering phase) and $T_{\eta_0}$ (time parameter of learning rate), (c) the multiplicative factor ξ between $T_{\sigma_0}$ (time parameter of neighborhood radius) and $T_{\eta_0}$, (d) the proper initial values of the learning rate $\eta_r$ and $\eta_f$ during the rough ordering phase and the fine tuning phase respectively. These are suitably selected through extended research of the parameters' values, like the aforementioned one for the parameters $\eta_0$ and $T_{\eta_0}$ of the AVQ method.
♦ the proper initialization of the weights of the neurons. Three cases are examined through preliminary executions: (a) $w_{ki} = 0.5, \ \forall k, i$ (which finally presents the best behaviour), (b) the random initialization of each neuron's weight, (c) the random choice of the input vectors for each neuron.

[Figure 3: six panels plotting each adequacy measure against the number of clusters (5 to 25) for the CL, SL, UPGMA, UPGMC, WARD, WPGMA and WPGMC algorithms. a. J indicator; b. MIA indicator; c. CDI indicator; d. SMI indicator; e. DBI indicator; f. WCBCR indicator.]

Fig. 3. Adequacy measures for the 7 hierarchical clustering algorithms for the set of 183 load curves of the summer of the year 2000 for 5 to 25 clusters
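The dissimilarity update of eq. (13), which underlies the hierarchical algorithms compared in Fig. 3, can be sketched in a few lines. The sketch below covers only the four variants that use the plain cluster distance (SL, CL, UPGMA, WPGMA); the centroid and Ward variants of eqs. (18)-(20) are left out. The names `COEFFS` and `agglomerate`, the merge bookkeeping and the toy points are illustrative assumptions, not the authors' implementation.

```python
import math

# (a_i, a_j, b, c) of eq. (13); the cluster sizes n_i, n_j matter only for UPGMA.
COEFFS = {
    "SL": lambda ni, nj: (0.5, 0.5, 0.0, -0.5),                            # eq. (14)
    "CL": lambda ni, nj: (0.5, 0.5, 0.0, 0.5),                             # eq. (15)
    "UPGMA": lambda ni, nj: (ni / (ni + nj), nj / (ni + nj), 0.0, 0.0),    # eq. (16)
    "WPGMA": lambda ni, nj: (0.5, 0.5, 0.0, 0.0),                          # eq. (17)
}


def euclid(x, y):
    """Euclidean distance of eq. (1), scaled by the dimension d."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)) / len(x))


def agglomerate(points, n_clusters, method="SL"):
    """Merge the closest pair repeatedly until n_clusters remain; return member indices."""
    clusters = {k: [k] for k in range(len(points))}
    dist = {(i, j): euclid(points[i], points[j])
            for i in clusters for j in clusters if i < j}

    def d(p, q):
        return dist[(p, q) if p < q else (q, p)]

    while len(clusters) > n_clusters:
        i, j = min(dist, key=dist.get)          # closest pair of clusters
        ai, aj, b, c = COEFFS[method](len(clusters[i]), len(clusters[j]))
        for s in clusters:
            if s in (i, j):
                continue
            dis, djs = d(i, s), d(j, s)
            new = ai * dis + aj * djs + b * d(i, j) + c * abs(dis - djs)  # eq. (13)
            dist[(i, s) if i < s else (s, i)] = new  # merged cluster keeps label i
        dist = {key: val for key, val in dist.items() if j not in key}
        clusters[i] += clusters.pop(j)
    return sorted(sorted(members) for members in clusters.values())


# Hypothetical one-dimensional toy points (not load curves from the paper).
points = [[0.0], [0.1], [1.0], [1.1], [5.0]]
result = {m: agglomerate(points, 2, m) for m in COEFFS}
```

On this toy set all four variants isolate the outlying point, which is the kind of behaviour that produces very small clusters when bad measurements are left in the data.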
[Figure 4: surface plot of MIA (approximately 0.028 to 0.036) against the initial value of learning rate and the time parameter of learning rate.]

Fig. 4. MIA for the AVQ method for the set of 183 load curves of the summer of the year 2000 for 10 neurons, $\eta_0 = \{0.1, 0.15, \ldots, 0.9\}$, $T_{\eta_0} = \{500, 1000, \ldots, 5000\}$, $\eta_{min} = 5 \cdot 10^{-6}$

The optimization process for the mono-dimensional SOM parameters is repeated for any population of clusters. In the case of the bi-dimensional SOM the additional issues that must be solved are the shape, the population of neurons and their respective arrangement. The rectangular shape of the map is defined with rectangular or hexagonal arrangement of neurons. It must be mentioned that the two kinds of arrangement practically give the same results. The population of the neurons is recommended to be $5 \times \sqrt{N}$ to $20 \times \sqrt{N}$ [18-19]. The height / width ratio $M_1 / M_2$ of the rectangular grid can be calculated as the ratio between the two major eigenvalues of the covariance matrix of the input vectors set. The initialization of the neurons can be a linear combination of the respective eigenvectors of the two major eigenvalues or can be equal to 0.5, giving substantially equivalent results. In the case of the set of 183 load curves for the summer of the year 2000 the map can have 67 ($\cong 5 \times \sqrt{183}$) to 270 ($\cong 20 \times \sqrt{183}$) neurons. Using the ratio between the two major eigenvalues the respective value is 22.739 (=0.26423/0.01162) and the proposed grids can be 46x2 and 68x3. Practically the clusters of the bi-dimensional map cannot be directly exploited because of the size and the location of the neurons into the grid (see Fig. 5). This problem is solved by the application of a basic classification method, like the proposed k-means, for the neurons of the bi-dimensional SOM [7].

[Figure 5: 46x2 grid of neurons (x-axis 0 to 45, y-axis 0 to 2), with each neuron assigned to one of Cluster 1 to Cluster 10.]

Fig. 5. 46x2 SOM after the application of the proposed k-means method at the neurons of SOM for the set of 183 load curves of the summer of the year 2000 for 10 clusters
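The sizing rule for the bi-dimensional SOM described above (between $5\sqrt{N}$ and $20\sqrt{N}$ neurons, with a side ratio equal to the ratio of the two major eigenvalues) can be checked numerically. In the sketch below the functions `neuron_range` and `candidate_grids` and the round-to-nearest convention are assumptions; because of that convention the two-row candidate comes out one column narrower than the 46x2 grid quoted in the text.

```python
import math


def neuron_range(n_vectors):
    """Recommended population of neurons: 5*sqrt(N) to 20*sqrt(N)."""
    root = math.sqrt(n_vectors)
    return 5 * root, 20 * root


def candidate_grids(n_vectors, eig_ratio, max_rows=10):
    """List (columns, rows) grids whose side ratio follows eig_ratio and whose size is in range."""
    lo, hi = neuron_range(n_vectors)
    grids = []
    for rows in range(1, max_rows + 1):
        cols = round(eig_ratio * rows)   # assumed rounding convention
        if lo <= rows * cols <= hi:
            grids.append((cols, rows))
    return grids


ratio = 0.26423 / 0.01162        # eigenvalues quoted in the text
lo, hi = neuron_range(183)       # 183 load curves of summer 2000
grids = candidate_grids(183, ratio)
```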
G. J. Tsekouras, F. D. Kanellos, V. T. Kontargyri, I. S. Karanasiou, A. D. Salis, N. E. Mastorakis
WSEAS TRANSACTIONS on CIRCUITS and SYSTEMS
The adequacy measures are calculated using the load curves of the neurons which form the respective clusters of the basic classification method and the best results are given by the 46x2 grid for all adequacy measures for different pairs (a,b) of the kmeans method. 0.04
4.7 Comparison of clustering models & adequacy indicators
In Fig. 6 the best results achieved by each clustering method (proposed k-means, fuzzy k-means, adaptive vector quantization, hierarchical algorithms and self-organized maps) are depicted.
[Figure 6, panels a to h. Panels a to f plot an adequacy indicator against the number of clusters (5 to 25): a. J indicator; b. MIA indicator; c. CDI indicator; d. SMI indicator; e. DBI indicator; f. WCBCR indicator. Panels g and h plot the number of dead clusters against the number of clusters: g. Dead clusters for the basic clustering methods; h. Dead clusters for proposed k-means method.]
Fig. 6. The best results of each clustering method for the set of 183 load curves of the summer of the year 2000 for 5 to 25 clusters
The proposed k-means model has the smallest values for the MIA and WCBCR indicators, the bi-dimensional SOM (with the application of the proposed k-means at the second level) for the J and SMI indicators, and the adaptive vector quantization for the DBI indicator. The proposed k-means model and the bi-dimensional SOM give equivalent results for the CDI indicator. All indicators except DBI improve as the number of clusters increases. From the number of dead clusters for the proposed k-means model (Fig. 6.h), the use of the WCBCR indicator is slightly superior to that of the MIA and J indicators. The basic theoretical advantage of the WCBCR indicator is that it combines the distances of the input vectors from the representative clusters with the distances between clusters, thereby also covering the J and CDI characteristics. The DBI and SMI indicators show significant variability across the clustering techniques. For these reasons the proposed indicator is WCBCR.
The improvement of the adequacy indicators is significant up to 10 clusters; beyond this value the behavior of most indicators gradually stabilizes. The number of clusters can also be estimated graphically by the rule of the "knee", which gives values between 8 and 10 clusters (see Fig. 7). Table 1 presents the results of the best clustering methods for 10 clusters, which is the finally proposed number of typical days for this case.
Since the ratio of the computational training times of the methods under study is 0.05:1:24:28:36:50 (hierarchical: proposed k-means: mono-dimensional SOM: AVQ: fuzzy k-means: bi-dimensional SOM), the use of the hierarchical and k-means models is proposed. The computational training time of the proposed k-means method is approximately 20 minutes on a Pentium 4, 1.7 GHz, 768 MB.
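The two ideas used in this comparison, the WCBCR ratio and the rule of the knee, can be sketched as follows. The WCBCR form below is an assumption read from the indicator's name: the sum of squared Euclidean distances of the input vectors from their cluster centres, divided by the sum of squared distances over all distinct pairs of centres. The knee is located here with a simple chord-distance heuristic, which is likewise an assumption, since the paper reads the knee graphically. The function names wcbcr and knee_index are hypothetical.

```python
import numpy as np

def wcbcr(X, labels, centers):
    """Within-cluster sum of squares divided by between-cluster variation."""
    X = np.asarray(X, dtype=float)
    centers = np.asarray(centers, dtype=float)
    labels = np.asarray(labels)
    # Squared distance of every input vector from the centre of its own cluster.
    within = np.sum((X - centers[labels]) ** 2)
    # Squared distances between all centres; each distinct pair appears twice.
    diff = centers[:, None, :] - centers[None, :, :]
    between = np.sum(diff ** 2) / 2.0
    return within / between

def knee_index(values):
    """Index of the point farthest from the chord joining the first and last points."""
    y = np.asarray(values, dtype=float)
    x = np.arange(len(y), dtype=float)
    # Normalise both axes so the result does not depend on the units
    # (assumes the values are not all equal).
    xn = (x - x[0]) / (x[-1] - x[0])
    yn = (y - y.min()) / (y.max() - y.min())
    dx, dy = xn[-1] - xn[0], yn[-1] - yn[0]
    # Perpendicular distance of each point from the chord.
    dist = np.abs(dy * (xn - xn[0]) - dx * (yn - yn[0])) / np.hypot(dx, dy)
    return int(np.argmax(dist))
```

If the indicator is evaluated for 5, 6, ..., 25 clusters, the suggested number of clusters is 5 plus the index returned by knee_index.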
TABLE 1 COMPARISON OF THE BEST CLUSTERING MODELS FOR 10 CLUSTERS FOR THE SET OF 183 LOAD CURVES OF THE SUMMER OF THE YEAR 2000 FOR THE GREEK POWER SYSTEM
Each method row gives the value of the adequacy measure; the row below it, where present, gives the parameters for which that value is obtained.

| Methods - Parameters | J | MIA | CDI | SMI | DBI | WCBCR |
|---|---|---|---|---|---|---|
| Proposed k-means | 0.01729 | 0.02262 | 0.1778 | 0.7331 | 2.0606 | 0.002142 |
| a - b parameters | 0.26 - 0.39 | 0.15 - 0.44 | 0.45 - 0.45 | 0.10 - 0.78 | 0.15 - 0.43 | 0.14 - 0.61 |
| Classical k-means | 0.01934 | 0.02434 | 0.1935 | 0.7549 | 2.7517 | 0.002346 |
| AVQ | 0.01723 | 0.02819 | 0.2615 | 0.7431 | 1.9973 | 0.004145 |
| η0 - ηmin - Tη0 parameters | 0.5 - 5x10^-7 - 1000 | 0.4 - 5x10^-7 - 4000 | 0.5 - 5x10^-7 - 5000 | 0.4 - 5x10^-7 - 2000 | 0.8 - 5x10^-7 - 1000 | 0.4 - 5x10^-7 - 4000 |
| Fuzzy k-means | 0.02208 | 0.03036 | 0.25328 | 0.7482 | 2.1936 | 0.003894 |
| q - a - b parameters | 4 - 0.22 - 0.46 | 4 - 0.18 - 0.62 | 4 - 0.18 - 0.70 | 6 - 0.12 - 0.62 | 6 - 0.14 - 0.74 | 4 - 0.18 - 0.62 |
| CL | 0.01960 | 0.02974 | 0.2636 | 0.7465 | 2.4849 | 0.004233 |
| SL | 0.06249 | 0.04435 | 0.2950 | 0.7503 | 2.3509 | 0.006103 |
| UPGMA | 0.02334 | 0.02885 | 0.2544 | 0.7423 | 2.2401 | 0.003186 |
| UPGMC | 0.02200 | 0.02847 | 0.2603 | 0.7455 | 2.1934 | 0.003412 |
| WARD | 0.01801 | 0.02858 | 0.2645 | 0.7635 | 2.5964 | 0.004227 |
| WPGMA | 0.02094 | 0.02743 | 0.2330 | 0.7373 | 2.2638 | 0.002619 |
| WPGMC | 0.02227 | 0.02863 | 0.2418 | 0.7378 | 2.1498 | 0.003008 |
| Mono-dimensional SOM | 0.02024 | 0.03043 | 0.3366 | 0.7752 | 3.1656 | 0.007126 |
| σ0 - φ - ξ - ηf - ηr - Tη0 parameters | 10 - 1.0 - 0.6 - 0.15 - 10^-3 - 1500 | 10 - 2.0 - 0.2 - 0.10 - 10^-3 - 1750 | 10 - 1.0 - 0.6 - 0.15 - 10^-3 - 1500 | 10 - 1.0 - 0.6 - 0.15 - 10^-3 - 1500 | 10 - 1.0 - 0.6 - 0.10 - 10^-3 - 1500 | 10 - 2.0 - 0.2 - 0.10 - 10^-3 - 1750 |
| 2D SOM 46x2 using proposed k-means for classification in a 2nd level | 0.01685 | 0.02697 | 0.1785 | 0.7271 | 2.2572 | 0.002459 |
| σ0 - φ - ξ - ηf - ηr - Tη0 - a - b parameters | 46 - 1.0 - 1.0 - 0.30 - 10^-3 - 500 - 0.28 - 0.36 | 46 - 1.0 - 1.0 - 0.30 - 10^-3 - 500 - 0.15 - 0.58 | 46 - 1.0 - 1.0 - 0.30 - 10^-3 - 500 - 0.44 - 0.46 | 46 - 1.0 - 0.2 - 0.20 - 10^-3 - 500 - 0.10 - 0.77 | 46 - 1.0 - 0.2 - 0.20 - 10^-3 - 500 - 0.44 - 0.25 | 46 - 1.0 - 1.0 - 0.30 - 10^-3 - 500 - 0.15 - 0.58 |

[Figure 7: WCBCR indicator (0.000 to 0.015) against the number of clusters (5 to 25).]
Fig. 7. Indicative estimation of the necessary clusters for the typical load daily chronological curves of the summer of the year 2000 for the WCBCR adequacy indicator
4.7 Representative daily load curves of the summer of the year 2000 for the Greek power system
The results of the clustering for 10 clusters using the proposed k-means model with optimization of the WCBCR indicator are presented in Table 2 (total number of days per cluster and number of days per cluster per day of week), Table 3 (calendar of the summer of the year 2000 with the cluster of each day) and Fig. 8 (representative load curves per cluster). Additionally, Fig. 8 shows the confidence limits of the variations (mean value ± standard deviation), which correspond to a probability of occurrence of 68.27% assuming a normal distribution.
TABLE 2 RESULTS OF THE PROPOSED K-MEANS MODEL WITH OPTIMIZATION TO WCBCR FOR 10 CLUSTERS FOR A SET OF 183 LOAD CURVES OF THE SUMMER OF THE YEAR 2000 FOR THE GREEK POWER SYSTEM
Days are numbered 1 for Monday, 2 for Tuesday etc.

| Load cluster | Day 1 | Day 2 | Day 3 | Day 4 | Day 5 | Day 6 | Day 7 | Days per cluster |
|---|---|---|---|---|---|---|---|---|
| 1 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 1 |
| 2 | 1 | 0 | 0 | 0 | 1 | 0 | 0 | 2 |
| 3 | 0 | 1 | 0 | 0 | 0 | 2 | 13 | 16 |
| 4 | 9 | 8 | 9 | 8 | 7 | 12 | 2 | 55 |
| 5 | 4 | 3 | 2 | 3 | 4 | 4 | 8 | 28 |
| 6 | 4 | 6 | 6 | 4 | 3 | 7 | 1 | 31 |
| 7 | 4 | 3 | 4 | 6 | 6 | 0 | 1 | 24 |
| 8 | 4 | 3 | 2 | 3 | 3 | 2 | 0 | 17 |
| 9 | 0 | 2 | 3 | 1 | 2 | 0 | 0 | 8 |
| 10 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 1 |

Specifically, cluster 1 represents Easter, cluster 2 Holy Friday and the Monday after Easter, and cluster 3 the Sundays of April, May, early June and September, Holy Saturday and Labor Day. Cluster 4 contains the workdays of very low demand (during April, early May and September) with normal temperatures (22-28 °C) and the Saturdays of April, May, early June and September, while cluster 5 includes the workdays of low demand and the Sundays of high peak load demand during the hot summer days. Cluster 6 represents the workdays of medium peak load demand and the Saturdays of high peak load demand, while clusters 7 to 10 mainly involve workdays with gradually increasing peak load demand.
4.8 Application of the Proposed Methodology for the Greek Power System Per Season and Per Year for the time period 1985-2002
The same process is repeated for the summers (April-September) and the winters (October-March) of the years 1985-2002. The load curves of each season are qualitatively described by using 8-10 clusters. The performance of the methods is presented in Table 4, which gives the number of seasons for which each method achieves the best value of each adequacy measure. The comparison of the algorithms shows that the developed k-means method achieves a better performance for the MIA, CDI and WCBCR measures, the bi-dimensional SOM model using the proposed k-means for classification in a second level for the J and SMI indicators, and the adaptive vector quantization for the DBI adequacy measure. The methodology is also applied to each year of the period 1985-2002, where the load curves are qualitatively described by using 15-20 clusters. The respective performance is presented in Table 5, which gives the number of years for which each method achieves the best value of each adequacy measure. The comparison of the algorithms shows that the developed k-means method achieves a better performance for the MIA, CDI, DBI and WCBCR measures, the 2-D SOM model using the proposed k-means for classification in a second level for the J indicator, and the unweighted pair group method centroid algorithm for the SMI index.
[Figure 8, panels a to k, one per cluster: a. Cluster 1; b. Cluster 2; c. Cluster 3; d. Cluster 4; e. Cluster 5; f. Cluster 6; g. Cluster 7; h. Cluster 8; i. Cluster 9; k. Cluster 10. Each panel plots Active Power (MW) against Time (h), 0 to 24, with three curves: mean value, mean + standard dev., mean - standard dev.]
Fig. 8. Typical daily chronological load curves for the set of 183 curves of the summer of the year 2000 for the Greek power system using the proposed k-means model with optimization to WCBCR
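The construction of the representative curves and their confidence limits can be sketched as below. The sketch assumes that the daily load curves are stored as rows of an array X (days x hours) with one integer cluster label per day, and that the population standard deviation is used; the function name representative_curves is hypothetical. The last line checks the 68.27% figure as the probability that a normal variable lies within one standard deviation of its mean.

```python
import math
import numpy as np

def representative_curves(X, labels):
    """Return {cluster: (mean_curve, std_curve, n_days)} for daily load curves X."""
    X = np.asarray(X, dtype=float)
    labels = np.asarray(labels)
    out = {}
    for k in np.unique(labels):
        members = X[labels == k]
        # Hour-by-hour mean and population standard deviation of the cluster members.
        out[int(k)] = (members.mean(axis=0), members.std(axis=0), len(members))
    return out

# Probability mass of a normal variable within mean +/- one standard deviation.
p_one_sigma = math.erf(1.0 / math.sqrt(2.0))
```

The band plotted in each panel is then mean_curve - std_curve to mean_curve + std_curve; a cluster that holds a single day, such as cluster 1 or cluster 10 here, has zero spread by construction.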
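TABLE 3 CALENDAR FOR 10 CLUSTERS FOR A SET OF 183 LOAD CURVES OF THE SUMMER OF THE YEAR 2000 FOR THE GREEK POWER SYSTEM
Each cell gives the day of week (1 for Monday, 2 for Tuesday etc.) followed by the kind of cluster.

| Day of month | April | May | June | July | August | September |
|---|---|---|---|---|---|---|
| 1 | 6 / 4 | 1 / 2 | 4 / 6 | 6 / 6 | 2 / 7 | 5 / 6 |
| 2 | 7 / 3 | 2 / 3 | 5 / 5 | 7 / 5 | 3 / 7 | 6 / 5 |
| 3 | 1 / 4 | 3 / 4 | 6 / 4 | 1 / 8 | 4 / 7 | 7 / 4 |
| 4 | 2 / 4 | 4 / 4 | 7 / 3 | 2 / 9 | 5 / 7 | 1 / 6 |
| 5 | 3 / 4 | 5 / 4 | 1 / 5 | 3 / 9 | 6 / 6 | 2 / 6 |
| 6 | 4 / 4 | 6 / 3 | 2 / 6 | 4 / 10 | 7 / 5 | 3 / 5 |
| 7 | 5 / 4 | 7 / 3 | 3 / 6 | 5 / 9 | 1 / 7 | 4 / 5 |
| 8 | 6 / 4 | 1 / 4 | 4 / 7 | 6 / 8 | 2 / 7 | 5 / 5 |
| 9 | 7 / 3 | 2 / 4 | 5 / 6 | 7 / 7 | 3 / 7 | 6 / 4 |
| 10 | 1 / 4 | 3 / 4 | 6 / 5 | 1 / 8 | 4 / 7 | 7 / 3 |
| 11 | 2 / 4 | 4 / 4 | 7 / 4 | 2 / 9 | 5 / 7 | 1 / 5 |
| 12 | 3 / 4 | 5 / 4 | 1 / 6 | 3 / 9 | 6 / 5 | 2 / 5 |
| 13 | 4 / 4 | 6 / 4 | 2 / 6 | 4 / 8 | 7 / 5 | 3 / 5 |
| 14 | 5 / 4 | 7 / 3 | 3 / 7 | 5 / 8 | 1 / 6 | 4 / 5 |
| 15 | 6 / 4 | 1 / 4 | 4 / 8 | 6 / 6 | 2 / 4 | 5 / 5 |
| 16 | 7 / 3 | 2 / 4 | 5 / 8 | 7 / 5 | 3 / 6 | 6 / 4 |
| 17 | 1 / 4 | 3 / 4 | 6 / 5 | 1 / 7 | 4 / 7 | 7 / 3 |
| 18 | 2 / 4 | 4 / 4 | 7 / 3 | 2 / 8 | 5 / 7 | 1 / 5 |
| 19 | 3 / 4 | 5 / 4 | 1 / 4 | 3 / 8 | 6 / 6 | 2 / 6 |
| 20 | 4 / 4 | 6 / 4 | 2 / 6 | 4 / 7 | 7 / 5 | 3 / 6 |
| 21 | 5 / 4 | 7 / 3 | 3 / 6 | 5 / 7 | 1 / 7 | 4 / 6 |
| 22 | 6 / 4 | 1 / 4 | 4 / 6 | 6 / 6 | 2 / 8 | 5 / 6 |
| 23 | 7 / 3 | 2 / 5 | 5 / 7 | 7 / 5 | 3 / 8 | 6 / 4 |
| 24 | 1 / 4 | 3 / 4 | 6 / 6 | 1 / 7 | 4 / 8 | 7 / 3 |
| 25 | 2 / 4 | 4 / 5 | 7 / 5 | 2 / 8 | 5 / 8 | 1 / 4 |
| 26 | 3 / 4 | 5 / 5 | 1 / 8 | 3 / 9 | 6 / 6 | 2 / 4 |
| 27 | 4 / 4 | 6 / 4 | 2 / 7 | 4 / 9 | 7 / 5 | 3 / 4 |
| 28 | 5 / 2 | 7 / 3 | 3 / 7 | 5 / 9 | 1 / 6 | 4 / 4 |
| 29 | 6 / 3 | 1 / 5 | 4 / 7 | 6 / 8 | 2 / 6 | 5 / 4 |
| 30 | 7 / 1 | 2 / 5 | 5 / 7 | 7 / 6 | 3 / 6 | 6 / 4 |
| 31 | - | 3 / 6 | - | 1 / 8 | 4 / 6 | - |
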
TABLE 4 COMPARISON OF THE CLUSTERING MODELS FOR THE SETS OF LOAD CURVES OF THE GREEK POWER SYSTEM PER SEASON FOR THE TIME PERIOD 1985-2002

| Methods | J | MIA | CDI | SMI | DBI | WCBCR |
|---|---|---|---|---|---|---|
| Proposed k-means | 1 | 24 | 31 | 7 | 12 | 29 |
| Classical k-means | 0 | 0 | 0 | 0 | 0 | 0 |
| AVQ | 2 | 0 | 0 | 7 | 16 | 0 |
| Fuzzy k-means | 0 | 0 | 0 | 0 | 0 | 1 |
| UPGMA | 0 | 3 | 0 | 0 | 0 | 1 |
| UPGMC | 0 | 7 | 0 | 2 | 5 | 3 |
| WPGMA | 0 | 0 | 0 | 0 | 1 | 0 |
| WPGMC | 0 | 3 | 0 | 0 | 2 | 2 |
| CL / SL / WARD / Mono-dimensional SOM | 0 | 0 | 0 | 0 | 0 | 0 |
| Bi-dimensional SOM using proposed k-means for classification in a 2nd level | 34 | 0 | 6 | 21 | 1 | 1 |

TABLE 5 COMPARISON OF THE CLUSTERING MODELS FOR THE SETS OF LOAD CURVES OF THE GREEK POWER SYSTEM PER YEAR FOR THE TIME PERIOD 1985-2002

| Methods | J | MIA | CDI | SMI | DBI | WCBCR |
|---|---|---|---|---|---|---|
| Proposed k-means | 0 | 8 | 18 | 0 | 13 | 14 |
| Classical k-means | 0 | 0 | 0 | 0 | 0 | 0 |
| AVQ | 1 | 0 | 0 | 1 | 3 | 0 |
| UPGMA | 0 | 1 | 0 | 0 | 0 | 0 |
| UPGMC | 0 | 6 | 0 | 14 | 1 | 2 |
| WPGMA | 0 | 1 | 0 | 0 | 0 | 0 |
| WPGMC | 0 | 1 | 0 | 0 | 0 | 2 |
| Fuzzy k-means / CL / SL / WARD / Mono-dimensional SOM | 0 | 0 | 0 | 0 | 0 | 0 |
| Bi-dimensional SOM using proposed k-means for classification in a second level | 17 | 1 | 0 | 3 | 1 | 0 |

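The counts reported in Tables 4 and 5 are tallies of which method attains the best value of an adequacy measure in each season or year. A minimal sketch of such a tally is given below. It assumes that smaller indicator values are better and that tied methods are each credited; both points are assumptions, as are the function name tally_best and the placeholder period and method names in the usage lines, whose values are invented for illustration and are not taken from the paper.

```python
from collections import Counter

def tally_best(results, lower_is_better=True):
    """Count, per method, the periods in which it attains the best value.

    results maps each period (season or year) to {method: value}
    for ONE adequacy measure.
    """
    wins = Counter()
    for scores in results.values():
        best = min(scores.values()) if lower_is_better else max(scores.values())
        for method, value in scores.items():
            if value == best:  # ties credit every tied method (assumption)
                wins[method] += 1
    return wins

# Hypothetical values, for illustration only.
example = {
    "period 1": {"method A": 0.10, "method B": 0.20},
    "period 2": {"method A": 0.30, "method B": 0.30},
    "period 3": {"method A": 0.50, "method B": 0.40},
}
print(tally_best(example))
```

With these invented values each method is credited twice, because the tie in the second period counts for both.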
The main disadvantage of the classification of the load curves per year is that each cluster does not contain the same family of days throughout the time period under study. For example, if 20 clusters are selected to represent the load demand behavior of the Greek power system per year, the 20th cluster contains the workdays with the highest peak load demand of the winter for the years 1985-1992 and of the summer for the remaining years. To avoid this problem, the classification per season is proposed.
5 Conclusions
This paper presents an efficient pattern recognition methodology for the study of the load demand behavior of power systems. Unsupervised clustering methods can be applied, such as k-means, fuzzy k-means, adaptive vector quantization (AVQ), mono-dimensional and bi-dimensional self-organizing maps (SOM) and hierarchical methods. The performance of these methods is evaluated by six adequacy measures: the mean square error, the mean index adequacy, the clustering dispersion indicator, the similarity matrix indicator, the Davies-Bouldin indicator and the ratio of within cluster sum of squares to between cluster variation. Finally, the representative daily load diagrams, along with the respective populations of each typical day, are calculated. This information is valuable for electric companies, because it facilitates load forecasting and the techno-economic studies of demand side management programs. By applying the proposed methodology to the Greek power system, the classification per season is suggested, where 8 to 10 clusters are necessary in order to describe satisfactorily the daily load curves of each season (describing the year with two seasons: winter and summer). It is practically impossible to describe the load curves satisfactorily by dividing the respective days into work days and non-work days, as has been done until now. It is also concluded that, generally, the optimal clustering technique is the developed k-means, while the optimal adequacy measure is the ratio of within cluster sum of squares to between cluster variation. The proposed methodology is applicable to any power system, to either active or reactive power, for any time period (day, week, etc.) and time step (15 minutes, 1 hour, etc.), leading to reliable results.
Acknowledgements
The authors want to express their sincere gratitude to C.A. Anastasopoulos, D. Voumboulakis and P. Eustathiou from PPC for the supply of all the necessary data for the training of the model and M. Tsaroucha and S. Gialabidis for their contribution on the original version of this paper.
References:
[1] PacifiCorp buys time with storage. Energy prospects, Vol. 37, 2004, p. 5, (http://www.vrbpower.com)
[2] J. Kondoh, I. Ishii, H. Yamaguchi, A. Murata, K. Otani, K. Sakuta, N. Higuchi, S. Sekine, M. Kamimoto. Electrical energy storage systems for energy networks. Energy Conversion & Management, Vol. 41, 2000, pp. 1863-1874.
[3] Task VI of International Energy Agency, Demand Side Management Programme, Research Report No. 1, October 1998, Research Report No. 2, July 1999, Research Report No. 3, August 2000.
[4] Task VIII of International Energy Agency, Demand-Side Bidding in a Competitive Electricity Market, Research Report No. 1, Ver. 2, January 2002, with title "Market participants' views towards and experiences with Demand Side Bidding".
[5] T. Sels, C. Dragu, T. Van Craenenbroeck, R. Belmans. New Energy Storage Devices for an Improved Load Managing on Distribution Level. IEEE Power Tech Conference, Porto, Portugal, 10th-13th September, p. 6.
[6] R. Lamedica, A. Prudenzi, M. Sforna, M. Caciotta, V.O. Cencelli. A neural network based technique for short-term forecasting of anomalous load periods. IEEE Transactions on Power Systems, Vol. 11, No. 3, August 1996, pp. 1749-1756.
[7] M. Beccali, M. Cellura, V. Lo Brano, A. Marvuglia. Forecasting daily urban electric load profiles using artificial neural networks. Energy Conversion and Management, Vol. 45, 2004, pp. 2879-2900.
[8] G. Chicco, R. Napoli, F. Piglione. Comparisons among clustering techniques for electricity customer classification. IEEE Transactions on Power Systems, Vol. 21, No. 2, May 2006, pp. 933-940.
[9] G. Chicco, R. Napoli, F. Piglione. Application of clustering algorithms and self organizing maps to classify electricity customers. Presented at the IEEE Power Tech Conference, Bologna, Italy, June 23-26, 2003.
[10] D. Gerbec, S. Gasperic, F. Gubina. Determination and allocation of typical load profiles to the eligible consumers. IEEE Power Tech Conference, Bologna, Italy, June 23-26, 2003.
[11] S. Theodoridis, K. Koutroumbas. Pattern Recognition, 1st Edition, Academic Press, New York, 1999.
[12] V. Figueiredo, F. Rodrigues, Z. Vale, J. B. Gouveia. An electric energy consumer characterization framework based on data mining techniques. IEEE Transactions on Power Systems, Vol. 20, No. 2, May 2005, pp. 596-602.
[13] M. Petrescu, M. Scutariu. Load diagram characterization by means of wavelet packet transformation. 2nd Balkan Conference, Belgrade, Yugoslavia, June 2002, pp. 15-19.
[14] E. Carpaneto, G. Chicco, R. Napoli, M. Scutariu. Electricity customer classification using frequency-domain load pattern data. Electrical Power and Energy Syst., Vol. 28, No. 1, 2006, pp. 13-20.
[15] C. S. Chen, J. C. Hwang, C. W. Huang. Application of load survey systems to proper tariff design. IEEE Transactions on Power Systems, Vol. 12, No. 4, November 1997, pp. 1746-1751.
[16] S. Haykin. Neural Networks: A Comprehensive Foundation, Prentice Hall, NJ, 1994.
[17] T. Kohonen. Self-organization and Associative Memory, New York: Springer-Verlag, 1989.
[18] K.F. Thang, R.K. Aggarwal, A.J. McGrail, D.G. Esp. Analysis of power transformer dissolved gas data using the self-organizing map. IEEE Transactions on Power Delivery, Vol. 18, No. 4, October 2003, pp. 1241-1248.
[19] SOM Toolbox for MATLAB 5. Helsinki, Finland: Helsinki Univ. Technology, 2000.
[20] D.L. Davies, D. W. Bouldin. A cluster separation measure. IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 2, 1979, pp. 224-227.
[21] D. Hand, H. Mannila, P. Smyth. Principles of data mining, The M.I.T. Press, Cambridge, Massachusetts, London, England, 2001.