Change Detection of Electric Customer Behavior Based on AMR Measurements

Tao Chen, Antti Mutanen, Pertti Järventausta
Department of Electrical Engineering, Tampere University of Technology, Tampere, Finland

Hannu Koivisto
Department of Automation Science and Engineering, Tampere University of Technology, Tampere, Finland
Abstract—Smart grid technology receives strong emphasis in the development of future power systems worldwide. In Finland, the widely deployed Automatic Meter Reading (AMR) technology makes it possible to collect customers' hourly load measurements and to apply data analysis methods for customer clustering and load prediction. This paper addresses the detection of possible changes in customers' behavior. Such a change could result, for example, from a change of habitation, a change of heating solution, or the installation of solar panels or other equipment. Basic clustering methods such as K-means and Fuzzy C-means, together with regression, are used to analyze electric customer behavior. The developed method successfully detects various obvious load pattern changes for different customer types. It also gives rough time information about the week in which the change happens. The behavior change detection method can be applied to improve load modeling accuracy by considering only the most recent consumption information after the change.

Index Terms—Automatic Meter Reading (AMR), Change detection, Classification, Fuzzy C-means (FCM).
I. INTRODUCTION
Smart grids depend heavily on load modeling, state estimation and load forecasting techniques, which benefit network control and analysis as a whole. Among these techniques, load modeling plays a crucial role and provides a foundation for the other analysis methods [1]. The aim of this paper is to improve load modeling accuracy by detecting changes in customers' electricity consumption behavior and updating the customer load models to correspond to the new behavior.

In practice, customer behavior change comes from various sources. Generally, the behavioral factors identified in [2] as influencing load profiles are the main reasons for customer behavior change. They stem from two root causes: behavioral determinants, which are habit driven and relatively flexible, and physical determinants, which are driven by environmental factors and building design (e.g. the heating solution). In addition, with the advent of smart grids, the ways of operating distribution networks are changing. The integration of distributed generation (DG), such as photovoltaic (PV) panel installations, must be considered and detected by the network company because the local energy production offsets part of the customer's electricity consumption. Demand response (DR) also causes customer behavior changes, as customers respond to electricity price or network congestion.

Recently, more and more network companies emphasize automatic network control and pay closer attention to financial matters in order to reduce costs and keep operating margins low. In such a situation, network planning and operation must be carried out more carefully to keep the distribution networks within the reduced operating margins. To this end, customer class load profiles are widely used in Finland [3] to forecast short-term and long-term loads, and they are helpful in distribution network analysis. It has been shown that load profiles have a large effect on the accuracy of distribution network state estimation [4].

In 2009 the Finnish Government passed a new act, which states that at least 80% of the customers of each Distribution System Operator (DSO) must have AMR implemented by December 31, 2013. In practice, almost all customers are provided with a new AMR meter. The law requires that AMR meters feature hourly energy measurement as well as registration of quality of supply and demand response functionality. This provides a huge amount of detailed information on customers' hourly electricity consumption and makes customer behavior change detection possible [5]. The goal of our research is to detect possible changes in customers' behavior, and this article documents the research results presented in [6].

This paper reports work sponsored by the Smart Grids and Energy Markets project (Cleen Oy, Finland).

II. CUSTOMER BEHAVIOR CHANGE BETWEEN YEARS
Generally, there are two categories of customer behavior change. The first is intra-year change, or abrupt change, in which the AMR measurement data suddenly changes its magnitude or distribution at some point in time [7]. It is similar to the changes defined in many other fields and used in the literature [8]. The second is between-years change.
A between-years change cannot be detected from a single year of time series data, for example one year of AMR data. Two or more different years must be compared to judge whether a change has indeed happened. In other words, this kind of change can only be detected through yearly comparison, because a customer may simply repeat the behavior of the previous year even if some intra-year changes happen. For instance, in Fig. 1b, although a sudden intra-year change can be observed at about the 2800th hour, there is no between-years change compared with year 2009. In contrast, if the consumption of this customer appeared as shown in Fig. 1c, a between-years change would be declared even though there is no obvious intra-year change, since the customer's behavior pattern in year 2010 differs from that in year 2009.

In the following sections the term "change" refers mainly to between-years change. When load modeling and prediction are based on yearly load profiles, as in [1], [3], the detection of between-years changes is more important than the detection of intra-year changes. Thus, in this paper changes are always detected by comparing the AMR measurements of two or more different years. We further divide between-years change into load level change and load shape change, so the comparison of customer consumption behavior in different years is always based on either load level or load shape.

(a) Real AMR measurements for a customer in year 2009
(b) Real AMR measurements for the customer in year 2010 (no change compared with year 2009)
(c) Artificial load profile for the customer in year 2010 (change compared with year 2009)
Fig. 1. Intra-year change vs between-years change. Each panel plots electricity consumption (kW) against the hour of the year.

III. CLASSIFICATION METHOD
Classification is a widely used method in load modeling [9], [10]. The basic idea is that customers can be clustered into groups with similar behavior, which makes it convenient for electricity distribution companies to model and predict customer loads. Clustering techniques can be divided into several categories, such as centroid-based, distribution-based and density-based clustering. Some clustering methods have already been applied with good results, for example K-means, the Iterative Self-Organizing Data Analysis Technique (ISODATA) and the Gaussian Mixture Model (GMM) studied in [3], [5]. K-means and Fuzzy C-means are two basic algorithms for clustering input data into groups.

Intuitively, we can perform clustering-based classification for every year and observe how the classification results of individual customers change over time, namely whether they belong to different groups in different years. This does not give accurate time information for the behavior change and may miss some changes from the intra-year point of view. However, the approach still offers a good foundation for detecting between-years changes. In this paper it is called the "reclassification method" and is used as a baseline for developing the change detection method. In this section, the basic clustering methods K-means and Fuzzy C-means are introduced. The improved change detection methods based on these clustering algorithms are presented in the next section.
A. K-means
The aim of clustering is to divide a set of objects into groups such that objects in the same group are more similar to each other than to those in other groups. K-means is an easily implemented and widely used clustering algorithm that divides the input data set into K groups by similarity [11]. Suppose a data set $\{x_1, x_2, \ldots, x_N\}$ consists of N independent D-dimensional input vectors. Our goal is to partition the data set into K groups, so-called clusters. A set of D-dimensional vectors $\mu_i$, where $i = 1, \ldots, K$, is introduced to represent the centres of the K clusters. The goal is then to find an assignment of data points to clusters, as well as a set of vectors $\{\mu_i\}$, such that the sum of the squared distances of each data point to its closest vector $\mu_i$, given in (1), is minimized. In this paper, $x_j$ stands for the AMR measurements of the jth customer, given as an input vector whose elements are hourly electricity consumption values.

$$J = \sum_{j=1}^{N} \sum_{i=1}^{K} r_{ij} \, \| x_j - \mu_i \|^2 \tag{1}$$

$$r_{ij} = \begin{cases} 1 & \text{if } i = \arg\min_{k} \| x_j - \mu_k \|^2 \\ 0 & \text{otherwise.} \end{cases} \tag{2}$$
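As an illustration of (1) and (2), the following is a minimal sketch of the standard K-means iteration (Lloyd's algorithm), not the implementation used in this work. The function name `kmeans`, the random initialisation from data points and the toy data are assumptions made for the example.

```python
import numpy as np

def kmeans(X, k, n_iter=100, seed=0):
    """Lloyd's algorithm for data X of shape (N, D); returns (centroids, labels, J)."""
    rng = np.random.default_rng(seed)
    # Assumed initialisation: pick k distinct data points as starting centroids.
    centroids = X[rng.choice(len(X), size=k, replace=False)].astype(float)
    for _ in range(n_iter):
        # Squared Euclidean distance of every point to every centroid, shape (N, K).
        d2 = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)            # assignment step, eq. (2)
        new_centroids = centroids.copy()
        for i in range(k):
            members = X[labels == i]
            if len(members):                  # keep the old centroid if a cluster is empty
                new_centroids[i] = members.mean(axis=0)
        converged = np.allclose(new_centroids, centroids)
        centroids = new_centroids
        if converged:
            break
    d2 = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    labels = d2.argmin(axis=1)
    J = d2[np.arange(len(X)), labels].sum()   # objective, eq. (1)
    return centroids, labels, J

# Toy data: two groups of three "customers" with 2-dimensional load vectors.
X = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0],
              [10.0, 10.0], [10.0, 11.0], [11.0, 10.0]])
centroids, labels, J = kmeans(X, k=2)
```

In the setting of this paper each row of `X` would be one customer's preprocessed hourly load vector instead of a 2-dimensional toy point.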
B. Fuzzy C-means with membership
The basic idea of Fuzzy C-means is similar to that of K-means; it additionally provides information about the probability that a specific point belongs to a certain cluster. Objects on the boundaries between several classes are not forced to belong fully to one group but are assigned membership degrees between 0 and 1 indicating their partial membership. Accordingly, each cluster centre is the mean of all data points in the data set, weighted by their degree of belonging to the cluster [12]. In every iteration of the classical K-means procedure, each data point is assumed to be in exactly one cluster. In Fuzzy C-means this condition is relaxed, and each sample $x_j$ is assumed to have a graded, or "fuzzy", membership in every cluster:

$$P(\omega_i \mid x_j) = \frac{(1/d_{ij})^{1/(b-1)}}{\sum_{r=1}^{K} (1/d_{rj})^{1/(b-1)}}, \qquad d_{ij} = \| x_j - \mu_i \|^2, \tag{3}$$

where:
$b$ is a free parameter, chosen to be 2 here,
$\mu_i$ is the ith cluster centre vector (centroid),
$\omega_i$ denotes the ith cluster.

The cluster membership probabilities of each point are normalized as in (4), so that the memberships of a point over all clusters sum to exactly 1.

$$\sum_{i=1}^{K} P(\omega_i \mid x_j) = 1, \quad j = 1, \ldots, N. \tag{4}$$

IV. CHANGE DETECTION FROM AMR DATA
Behavior change can be divided into load level change and load shape change. In this paper, shape change detection requires consumption level normalization, and load level change detection must therefore be done separately.

A. Load level change detection
Load level change detection is done by analysing the weekly energy consumption with multivariable regression. The following linear regression model is assumed:

$$P_i = \beta_0 + \beta_1 T_i + \beta_2 D_i \tag{5}$$

where:
$P_i$ is the ith weekly average energy consumption (kW),
$T_i$ is the ith weekly average temperature (°C),
$D_i$ is the ith weekly average daytime length (hour),
$\beta_0, \beta_1, \beta_2$ are the multivariable regression parameters.

One year of AMR measurement data is used to train this regression model and obtain the parameters $\beta_0, \beta_1, \beta_2$. These regression parameters are then used to predict the consumption in another year, with the 95% confidence interval serving as the level change detection band. If the actual weekly average energy consumption falls outside the band more than 10 times, a load level change is declared.

B. Load shape change detection
1) Temperature normalization and data scaling: Before running any clustering algorithm, some data preprocessing is required to make the data of different years comparable. Here we mainly consider removing the effect of temperature and scaling the AMR data to the same level, so that shape change detection is independent of differences in consumption level. Due to the wide use of electric heating and other temperature-sensitive loads, electricity consumption is usually sensitive to outdoor temperature. This phenomenon can be observed in Fig. 2.

Fig. 2. Measured load and temperature (24 hour averages) for one customer in year 2009
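As a concrete illustration of the load level test of Section IV-A, the following minimal sketch fits model (5) by ordinary least squares and counts the weeks that fall outside a detection band. It rests on assumptions not stated in the paper: the band is approximated as the forecast plus or minus 1.96 residual standard deviations (a normal approximation of the 95% interval), and the function names and synthetic weekly data are hypothetical.

```python
import numpy as np

def fit_level_model(P, T, D):
    """Least-squares fit of P_i = b0 + b1*T_i + b2*D_i; returns (beta, residual std)."""
    A = np.column_stack([np.ones_like(T), T, D])
    beta, *_ = np.linalg.lstsq(A, P, rcond=None)
    resid = P - A @ beta
    sigma = np.sqrt(resid @ resid / (len(P) - A.shape[1]))
    return beta, sigma

def level_change(beta, sigma, P_new, T_new, D_new, z=1.96, max_exceed=10):
    """Count weeks outside the (assumed) band and apply the 'more than 10 times' rule."""
    A = np.column_stack([np.ones_like(T_new), T_new, D_new])
    forecast = A @ beta
    outside = np.abs(P_new - forecast) > z * sigma
    n_out = int(outside.sum())
    return n_out, n_out > max_exceed

# Synthetic weekly data (hypothetical): 50 weeks of temperature and daytime length.
weeks = np.arange(50)
T = 5.0 - 12.0 * np.cos(2 * np.pi * (weeks - 3) / 50)   # weekly mean temperature (deg C)
D = 12.0 - 6.0 * np.cos(2 * np.pi * weeks / 50)         # weekly mean daytime length (h)
P_train = 6.0 - 0.2 * T - 0.1 * D + 0.1 * np.sin(1.7 * weeks)
beta, sigma = fit_level_model(P_train, T, D)

# Second year: once without a change, once with a 2 kW step from week 20 onwards.
P_same = 6.0 - 0.2 * T - 0.1 * D + 0.1 * np.cos(2.3 * weeks)
P_shift = P_same + np.where(weeks >= 20, 2.0, 0.0)
n_same, changed_same = level_change(beta, sigma, P_same, T, D)
n_shift, changed_shift = level_change(beta, sigma, P_shift, T, D)
```

A proper prediction interval would use the Student's t quantile and the leverage of each new observation; the fixed-width band is used here only to keep the sketch short.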
Thus, temperature normalization of the AMR measurements should be carried out to remove the effect of temperature. This is done by assuming that the temperature-sensitive part of the load depends linearly on temperature. In [3], a linear regression model is proposed for obtaining the temperature dependency parameter α. The temperature dependency parameters are calculated with linear regression analysis separately for each of the four seasons. The percent error between the daily energy consumption and the average daily consumption on a similar day during that month is chosen as the dependent variable (regressand), and the difference between the daily average temperature and the average temperature on a similar day during that month is chosen as the explanatory variable (regressor). The significance of the relationship between the daily energy and the outdoor temperature is assessed in [3] with the correlation coefficient and Student's t-test. The temperature normalized load can then be calculated as follows:

$$P(t)_{TN} = \frac{P(t)}{1 + \alpha \, (T_{d,ave} - T_{m,ave})} \tag{6}$$

where:
$P(t)_{TN}$ is the temperature normalized power consumption at hour t,
$P(t)$ is the measured power consumption at hour t,
$T_{d,ave}$ is the daily average of the outdoor temperature,
$T_{m,ave}$ is the long-term monthly average of the outdoor temperature,
$\alpha$ is the temperature dependency parameter (%/°C).
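A short numerical sketch of (6) follows. It assumes that α is entered as a fraction per °C (so 2 %/°C becomes 0.02) and that α is negative for heating loads, since consumption rises when the temperature falls; the function name and the example values are hypothetical.

```python
import numpy as np

def temperature_normalize(p_hourly, t_daily_avg, t_month_avg, alpha):
    """Apply eq. (6): divide the measured load by 1 + alpha * (T_d,ave - T_m,ave)."""
    p = np.asarray(p_hourly, dtype=float)
    delta_t = np.asarray(t_daily_avg, dtype=float) - np.asarray(t_month_avg, dtype=float)
    return p / (1.0 + alpha * delta_t)

# One day of constant 3 kW load on a day that is 10 deg C colder than the
# long-term monthly average, with an assumed dependency of -2 % per deg C.
p = np.full(24, 3.0)
p_tn = temperature_normalize(p, t_daily_avg=-15.0, t_month_avg=-5.0, alpha=-0.02)
# The denominator is 1 + (-0.02) * (-10) = 1.2, so every hour becomes 3.0 / 1.2 = 2.5 kW.
```

The cold day inflates the measured load, and the normalization scales it back to what would be expected at the long-term average temperature.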
The temperature normalization is made according to daily average temperatures, and the temperature dependency parameter α is assumed to be the same for all hours of the same day. The temperature normalized AMR data is then used as input for the K-means clustering algorithm, which allows electricity consumption to be compared between years with different temperatures. Additionally, the MATLAB function zscore is used to standardize the data. In this way, the centroids obtained in the next subsection are at approximately the same level (i.e. standardized to zero mean and unit variance) but have different shapes.

2) Shape change detection after data preprocessing: The shape change method introduced in this paper might be called "weekly load profiling", since it determines the behavior change based on weekly information. When customers change their electricity consumption behavior but still remain within the same cluster, the cluster index alone cannot indicate a temporary or small change from the intra-year point of view. To some extent, the idea of this method is to decompose the customer behavior into several bases, which are chosen from the centroids produced by the K-means algorithm. After the bases are obtained, a coefficient is assigned to every base (i.e. cluster centroid) to measure how well a specific customer's behavior matches this base. The idea can be interpreted as follows:

$$\text{Customer behavior} = a_1 \times \text{cluster}_1 + a_2 \times \text{cluster}_2 + a_3 \times \text{cluster}_3 + \ldots + a_K \times \text{cluster}_K \tag{7}$$

The subscript K is the chosen number of clusters. For different years, different representations of the customer behavior can be built by assigning different sets of $a_i$ values. These $a_i$ values are determined in the same way as the membership values in the Fuzzy C-means algorithm [12]. By comparing the sets of $a_i$ values of different years, a behavior change can be detected by observing how the $a_i$ values change during a specific time interval. The flowchart of the whole method is shown in Fig. 3.

start
-> AMR data preprocessing (temperature normalization, standardization)
-> Cluster with K-means
-> Divide every centroid into 50 weekly segments; divide the customer's AMR data into 50 weekly segments
-> For the ith week, calculate K memberships for the customer's ith weekly AMR segment
-> Compare the K memberships of every week in different years for one specific customer
-> Sum the membership differences and compare them to the chosen threshold to define the membership change as a behavior change
-> end
Fig. 3. Flowchart of load shape change detection method
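The membership step of the flowchart can be sketched as follows: each annual series is cut into 50 weekly segments of 168 hours, and (3) is evaluated week by week against the corresponding weekly segments of the centroids. This is a minimal sketch under stated assumptions: the series are taken to start on a Monday and to be already preprocessed, a small constant `eps` (not part of (3)) guards against division by zero, and the two sinusoidal "centroids" are purely hypothetical stand-ins for K-means centroids.

```python
import numpy as np

HOURS_PER_WEEK = 168   # 7 days x 24 hours
N_WEEKS = 50

def weekly_segments(series):
    """Split an hourly series (assumed to start on a Monday) into 50 weekly segments."""
    series = np.asarray(series, dtype=float)
    return series[:N_WEEKS * HOURS_PER_WEEK].reshape(N_WEEKS, HOURS_PER_WEEK)

def weekly_memberships(customer, centroids, b=2.0, eps=1e-12):
    """Return an array of shape (50, K): memberships from eq. (3), one row per week."""
    cust = weekly_segments(customer)                            # (50, 168)
    cent = np.stack([weekly_segments(c) for c in centroids])    # (K, 50, 168)
    d = ((cent - cust[None, :, :]) ** 2).sum(axis=2).T          # (50, K) squared distances
    w = (1.0 / (d + eps)) ** (1.0 / (b - 1.0))
    return w / w.sum(axis=1, keepdims=True)                     # rows sum to 1, eq. (4)

# Hypothetical example with K = 2 centroids and a customer that follows the
# first centroid for 25 weeks and the second one afterwards.
hours = np.arange(8760)
profile_a = np.sin(2 * np.pi * hours / 24)
profile_b = np.cos(2 * np.pi * hours / 24)
customer = np.where(hours < 25 * HOURS_PER_WEEK, profile_a, profile_b)
m = weekly_memberships(customer, [profile_a, profile_b])
```

Here `m[w, i]` plays the role of the coefficient $a_i$ in (7) for week `w`; with K = 30 as in the paper, each week would be described by 30 such values.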
The first step is to cluster the hourly AMR data of all customers with non-empty measurements over one complete year to obtain K centroids using the conventional K-means algorithm. Before clustering, the AMR data of every customer is preprocessed by temperature and load level normalization to make sure the measurements are comparable. We choose K = 30 as the number of clusters here.

The second step is to produce weekly load profiles: each of the 30 centroids is divided into 50 weekly segments. Every segment is 168-dimensional (i.e. 7 days × 24 hours). Thus 50 complete weekly load profiles are obtained from one complete year. Note that when an annual load profile is divided into weekly segments, every weekly segment should begin on a Monday.

The third step is to divide the customer's annual AMR measurements of different years into 50 weekly AMR data segments. Before this, the customer's annual AMR measurements go through data preprocessing that includes temperature normalization and load level normalization. Then (3) from the Fuzzy C-means (FCM) algorithm is used to obtain memberships for every weekly AMR data segment of the customer in the different years. Specifically, the memberships are obtained by comparing every weekly AMR data segment of each year to the 30 corresponding weekly load profiles calculated from the cluster centres. For each week, the customer's weekly consumption behavior (i.e. the weekly AMR data segment) is thus represented by 30 memberships.

Given these memberships, the intuitive way to detect a change is to calculate the absolute difference of every cluster membership between years in every specific week. A threshold then determines what degree of membership change is regarded as a customer behavior change. Here the 30 absolute membership differences are summed for every week, and the threshold is set to a constant 0.2. If the sum of membership differences exceeds the threshold in five or more consecutive weeks, the customer is declared to have a non-temporary behavior change.

In summary, based on these memberships the change detection of high-dimensional AMR measurements is transformed into the change detection of low-dimensional membership curves. The benefit of this transformation is not only dimensionality reduction; it also helps to reduce the effect of random variance, measurement outliers and temporary abnormal behavior, all of which are extremely harmful for analyzing customer behavior change.

V. TEST CASES
The first test case is an artificial one that gives an obvious result. It is followed by two realistic test cases with AMR data from two Finnish Distribution System Operators (DSOs).

A. Case 1
This artificial test case combines the consumption behavior of one customer with that of another customer in the second year to show an easily observed change, as shown in Fig. 4. Totally different consumption behavior patterns can be observed before about the 3500th hour, even though the consumption level is similar.
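The decision rule of Section IV-B, which produces the membership difference curve used in this artificial case, can be sketched as follows. The sketch takes the weekly memberships of two years as given, sums the absolute differences per week and looks for a run of at least five consecutive weeks above the threshold 0.2. The function name and the three-cluster membership values are hypothetical and chosen only to make the behavior easy to follow.

```python
def detect_shape_change(m_prev, m_curr, threshold=0.2, min_run=5):
    """m_prev, m_curr: per-week membership lists (weeks x K) for two years.

    Returns (changed, first_week_of_run, weekly_sums); weeks are 0-based.
    """
    diffs = [sum(abs(a - b) for a, b in zip(week_prev, week_curr))
             for week_prev, week_curr in zip(m_prev, m_curr)]
    run_start, run_len = None, 0
    for week, d in enumerate(diffs):
        if d > threshold:
            if run_len == 0:
                run_start = week
            run_len += 1
            if run_len >= min_run:          # non-temporary change found
                return True, run_start, diffs
        else:
            run_len = 0                     # run broken: treat as temporary
    return False, None, diffs

# Year 1: the customer matches cluster 1 all year.
prev = [[0.8, 0.1, 0.1] for _ in range(50)]
# Year 2: the customer switches to cluster 2 from week 20 onwards.
curr = [[0.8, 0.1, 0.1] if w < 20 else [0.1, 0.8, 0.1] for w in range(50)]
# Alternative year 2: only a three-week deviation, which should be ignored.
blip = [[0.1, 0.8, 0.1] if 10 <= w < 13 else [0.8, 0.1, 0.1] for w in range(50)]

changed, start, diffs = detect_shape_change(prev, curr)
changed_blip, _, _ = detect_shape_change(prev, blip)
```

Requiring a run of consecutive weeks is what filters out the three-week deviation in `blip`, in line with the aim of ignoring temporary abnormal behavior.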
(a) AMR measurements in the first year (left) and second year (right). Both panels plot electricity consumption (kW) against hour and show the AMR data (6.2010-6.2011 and 6.2011-6.2012) together with the mean of weekly consumption.
(b) Load shape (left) and load level (right) change detection. The left panel plots the membership difference and the threshold to detect shape change against week; the right panel plots the average consumption (kW) against week, showing the forecast (6.2011-6.2012), the weekly mean (6.2011-6.2012) and the 95% confidence interval.
Fig. 4. Change detection for an artificial customer in two different years

B. Case 2
In this case, each customer is labelled with a customer class number that shows to which traditional class the customer belongs (38 classes in total). The original data set contains 3584 customers measured through two complete years, 2009 and 2010. After removing the customers with empty consumption, 3577 customers are left for the weekly load profiling method. The results are compared with the reclassification method, which observes how the classification results of individual customers change over time, namely whether they still belong to the same groups as in previous years. The results are listed in Table I.

TABLE I
NUMBER OF CUSTOMERS WHO HAVE LOAD SHAPE CHANGE
Change detected                   #Customers   Percentage
Only by reclassification          346          9.7%
Only by weekly load profiling     1146         32.0%
By both methods                   603          16.9%
No change                         1482         41.4%

It can be observed that even when reclassification of the same customer in different years indicates no change, changes in certain weekly time intervals are still detected by the weekly load profiling method. That is why the behavior changes of 1146 customers are detected only by weekly load profiling and not by reclassification.

C. Case 3
This case includes 7398 non-empty low voltage customers in a small region, measured from June 2010 to June 2012. Fig. 5 shows the behavior change detection for customer No. 4033 between the two years. A shape change is detected for this customer, as shown in Fig. 5c. The results of checking all 7398 customers with the load level change detection method and the load shape change detection method are shown in Table II.

TABLE II
NUMBER OF CUSTOMERS WITH BEHAVIOR CHANGE
Change type                #Customers   Percentage
Only load level change     1536         20.8%
Only load shape change     906          12.2%
Both changes               1100         14.9%
No change                  3856         52.1%

(a) AMR measurements in 6/2010-7/2011 (left) and 7/2011-6/2012 (right). Both panels plot electricity consumption (kW) against hour and show the AMR data together with the mean of weekly consumption.
(b) Memberships in 6/2010-7/2011 (left) and 7/2011-6/2012 (right), plotted against week.
(c) Load shape (left) and load level (right) change detection. The left panel plots the membership difference and the threshold to detect shape change against week; the right panel plots the average consumption (kW) against week, showing the forecast (6.2011-6.2012), the weekly mean (6.2011-6.2012) and the 95% confidence interval.
Fig. 5. Change detection for customer No. 4033 from June 2010 to June 2012

VI. DISCUSSION ON APPLICATION
The change detection method studied in this paper depends heavily on AMR measurements, partly because many distribution networks already have vast numbers of AMR meters. Originally, AMR meters were installed primarily for recording electricity consumption at the customer end, but their remote reading capabilities can also be used in state estimation and online monitoring [4]. However, it is not economically viable (or technically possible) to read every single AMR meter in real time, so load profiles are still needed. Fortunately, now that AMR measurements are available, it is easy to improve the load profiles. The first step is to use clustering methods to create clustering-based customer class load profiles or even individual load profiles. The second step is to make the load profiles more dynamic so that they can adapt to changing customer behavior. This is where change detection methods are needed.

It is in our interest to forecast electricity consumption using load profiles that correspond to the present electricity consumption. If there is a significant change in electricity consumption, the pre-change measurement data should be discarded and only the post-change data should be used to model the new behavior. Thus change detection helps to create more accurate load profiles. With more accurate load profiles and forecasts, DSOs can use their active resources more effectively. For example, demand response based congestion management can be targeted at the correct location, during the right hours and in the right magnitude.

VII. CONCLUSIONS
Customer behavior change detection is not an easy problem, since most customer behavior is quite irregular and accompanied by a number of random variations. The weekly load profiling method proposed in this paper builds on customer classification to detect load shape change. For load level change detection, a multivariable regression model is formed to detect differences in consumption level. Both depend heavily on our own definition of customer change, namely that the customer behavior differs between years. Since customer behavior is not stable, as emphasized above, the weekly profiling method developed in this paper can only indicate the week in which the behavior change happens compared with the previous year. It is hard to give any more detailed daily or hourly information due to the limits of the clustering methods. The method works well for customers with obvious changes but may miss some non-obvious changes of low-consumption customers. In the future, other methods that treat AMR data as time series with intrinsic patterns [13], instead of as high-dimensional vectors, should be developed to detect more subtle changes in electric customer behavior.

REFERENCES
[1] A. Seppälä, "Load research and load estimation in electricity distribution," Ph.D. dissertation, Dept. Electrical Eng., Helsinki Univ. Technol., Espoo, Finland, 1996.
[2] R. Yao and K. Steemers, "A Method of Formulating Energy Load Profile for Domestic Buildings in the UK," Energy and Buildings, New York, vol. 37, pp. 663-671, 2005.
[3] A. Mutanen, M. Ruska, S. Repo, and P. Järventausta, "Customer classification and load profiling method for distribution systems," IEEE Trans. Power Del., vol. 26, no. 3, pp. 1755-1763, Jul. 2011.
[4] A. Mutanen, S. Repo, and P. Järventausta, "AMR in Distribution Network State Estimation," Proc. 2008 The 8th Nordic Electricity Distribution and Asset Management Conf., Bergen, Norway.
[5] B. Stephen, A. Mutanen, S. Galloway, G. Burt, and P. Järventausta, "Enhanced Load Profiling for Residential Network Customers," IEEE Trans. Power Del., vol. 29, pp. 88-96, Feb. 2014.
[6] T. Chen, "Customer Behavior Change Detection Based on AMR Measurements," M.Sc. thesis, Dept. Electrical Eng., Tampere University of Technology, Finland, 2014.
[7] B. Stephen, F. R. Isleifsson, S. Galloway, G. M. Burt, and H. W. Bindner, "Online AMR Domestic Load Profile Characteristic Change Monitor to Support Ancillary Demand Services," IEEE Trans. Smart Grid, vol. 5, pp. 888-895, Mar. 2014.
[8] R. P. Adams and D. J. MacKay, "Bayesian online changepoint detection," Tech. Rep., University of Cambridge, Cambridge, UK, 2007. [Online]. Available: http://hips.seas.harvard.edu/content/bayesian-online-changepoint-detection
[9] V. Figueiredo, F. Rodrigues, Z. Vale, and J. B. Gouveia, "An Electric Energy Characterization Framework based on Data Mining Techniques," IEEE Trans. Power Syst., vol. 20, no. 2, pp. 596-602, May 2005.
[10] G. Chicco, R. Napoli, and F. Piglione, "Comparison among clustering techniques for electricity customer classification," IEEE Trans. Power Syst., vol. 21, no. 2, pp. 933-940, May 2006.
[11] C. M. Bishop, Pattern Recognition and Machine Learning, U.S., Springer, 2006, p. 424.
[12] R. O. Duda, P. E. Hart, and D. G. Stork, Pattern Classification, 2nd ed., UK, Wiley, 2000, p. 526.
[13] M. Espinoza, C. Joye, R. Belmans, and B. De Moor, "Short-Term Load Forecasting, Profile Identification, and Customer Segmentation: A Methodology Based on Periodic Time Series," IEEE Trans. Power Syst., vol. 20, no. 3, pp. 1622-1630, Aug. 2005.