Band Depth Clustering for Nonstationary Time Series

Technometrics

ISSN: 0040-1706 (Print) 1537-2723 (Online) Journal homepage: http://www.tandfonline.com/loi/utch20

Band Depth Clustering for Nonstationary Time Series and Wind Speed Behavior

Laura L. Tupper, David S. Matteson, C. Lindsay Anderson & Luckny Zephyr

To cite this article: Laura L. Tupper, David S. Matteson, C. Lindsay Anderson & Luckny Zephyr (2017): Band Depth Clustering for Nonstationary Time Series and Wind Speed Behavior, Technometrics, DOI: 10.1080/00401706.2017.1345700

To link to this article: http://dx.doi.org/10.1080/00401706.2017.1345700


Accepted author version posted online: 28 Jun 2017.



ACCEPTED MANUSCRIPT

Downloaded by [Cornell University Library] at 07:39 14 September 2017

Band Depth Clustering for Nonstationary Time Series and Wind Speed Behavior

Laura L. Tupper∗
Department of Mathematics and Statistics, Williams College

David S. Matteson
Department of Statistical Science, Cornell University

C. Lindsay Anderson
Department of Biological and Environmental Engineering, Cornell University

Luckny Zephyr
Department of Biological and Environmental Engineering, Cornell University

June 7, 2017

Abstract

We explore the behavior of wind speed over time, using a subset of the Eastern Wind Dataset published by the National Renewable Energy Laboratory. This dataset gives modeled wind speeds over three years at hundreds of potential wind farm sites. Wind speed analysis is necessary to the integration of wind energy into the power grid; short-term variability in wind speed affects decisions about usage of other power sources, so that the shape of the wind speed time series becomes as important as the overall level. To assess differences in intra-day time series, we propose a functional distance measure, the band distance, which extends the band depth of Lopez-Pintado and Romo (2009). This measure emphasizes the shape of time series or functional observations relative to other members of a dataset, and allows clustering of observations without reliance on pointwise Euclidean distance. We show a method for adjusting for seasonal effects in wind speed, and use these standardizations as input for the band distance. We demonstrate the utility of the new method in simulation studies and an application to the MOST power grid algorithm, where the band distance improves reliability over standard methods at a comparable cost.

Keywords: Depth statistics; Distance metrics; Cluster analysis; Time series analysis; Wind power.

∗ The authors gratefully acknowledge the SOPF working group at Cornell University, led by Ray Zimmerman, Timothy Mount, and Carlos Murillo-Sanchez; Colin Ponce, Department of Computer Science, Cornell University; and the editor, associate editor, and anonymous reviewers whose comments have greatly improved the content and presentation of the paper. Support was provided from Cornell University Institute of Biotechnology and the New York State Division of Science, Technology and Innovation (NYSTAR), a Xerox PARC Faculty Research Award, NSF Grant DMS-1455172, and the Consortium for Electric Reliability Technology Solutions and the Office of Electricity Delivery and Energy Reliability, Transmission Reliability Program of the U.S. Department of Energy.

1 Introduction

A key concern in power engineering is the characterization of the behavior of wind speed, or of power output from wind generators, over time. Pinson (2013) gives an overview of wind markets, and describes the necessity of obtaining probabilistic wind forecasts, both to inform energy contracts and to guide system-wide decisions about power generation. As the grid derives a greater proportion of its energy from wind and other renewable, but variable, sources, probabilistic forecasting becomes more important. Characterization of wind patterns is also essential for making decisions about future wind farm locations; for example, Goddard et al. (2015) analyze past wind measurements against projections from regional models to assess where climate change may affect wind generation.

Many techniques have been proposed for forecasting wind behavior on the scale of minutes or hours ahead. An overview of several model-based methods, ranging from simple persistence to more sophisticated time-series and spatio-temporal techniques, appears in Zhu and Genton (2012). Pinson (2013) summarizes forecasting approaches as yielding output of three types: point predictions of a single value for a particular time and location; predictive densities, which give CDFs in place of a single estimate; and trajectories, which give time-series or spatio-temporal curves for the entire window of interest.

Our focus in this paper is on producing trajectories: describing the set of possible wind speed curves using only a small number of representative scenarios. Such representatives can be analyzed directly as in Goddard et al. (2015), and are often used in stochastic optimization problems, such as training power systems algorithms that incorporate wind generation (Pinson, 2013). One such algorithm is the MOST (Matpower Optimal Scheduling Tool) system described by Zimmerman et al. (2011), which aims to make optimal decisions about activating generators and dispatching power. The number of representatives is generally constrained by the computational complexity of the algorithm using the data, so we treat this number as fixed, with further discussion in Section 2.

Wind speed data for hundreds of potential wind farm locations in the Eastern United States are produced by NREL using the MASS v.6.8 model. The data are available through the EWITS database at http://www.nrel.gov/electricity/transmission/eastern_wind_dataset.html, and a description of the dataset’s preparation appears in Brower (2010). Each site has measurements of wind speed at a height of 80m or 100m, depending on the hub height of the appropriate turbine for the site, at ten-minute time increments for three years. We can thus consider the data as either a high-frequency time series or functional data. In this analysis, we examine wind speeds from a subset of these locations in units of days, each day being a single twenty-four-hour time series. We choose this time frame in the context of making day-ahead decisions about unit commitment (generator activation); but the methods described here can be applied to time series of different length, or different resolution on the time axis. For example, Pourhabib et al. (2015) perform forecasting with a one-hour time resolution and a six-hour window, citing six hours as a typical cutoff for using data-driven approaches over meteorologically-based models. Pourhabib et al. (2015) handle the nonstationarity of wind speed time series by defining “epochs” such as “6pm to 12am, all days in January,” while we take a model-free approach.

An effective set of representative days must cover the range of wind behaviors, with information about the relative probabilities of seeing each type of day. Clustering the data, then, is a natural way to capture recurrent behaviors: cluster centers provide a reasonable representative of each behavior type, while cluster sizes indicate which types are most likely. But a simple breakdown between high-wind and low-wind scenarios, obtained by clustering on average wind level, is largely uninformative. For example, wind power often cannot be stored, and backup generators have constraints on activation speed and increasing output; so the shape of the wind speed curve affects optimal decision-making. This shape can vary widely, even among days with similar average levels, as shown in Figure 1, left panel.

Figure 1: Left, three sample days from the dataset. The mean level for each of these days is approximately 11 m/s, despite their different shapes. Right, the representatives obtained by applying k-means with six clusters to the dataset. Mean behavior is dominant. Both panels plot wind speed (m/s) against time of day (h).
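The point made with the left panel of Figure 1 can be illustrated with a minimal Python sketch. It builds three synthetic daily curves at ten-minute resolution that share a mean of 11 m/s but differ in shape. The curve shapes, the variable names, and the use of NumPy are assumptions made for illustration; the curves are not drawn from the EWITS data and the sketch is not the authors' code.

```python
import numpy as np

# 144 ten-minute readings per day, matching the dataset's time resolution.
t = np.linspace(0.0, 1.0, 144)

# Three made-up daily curves (hypothetical, NOT real EWITS data),
# each constructed to have a mean wind speed of 11 m/s.
rising = 11.0 + 6.0 * (t - 0.5)                     # steady increase over the day
falling = 11.0 - 6.0 * (t - 0.5)                    # steady decrease over the day
midday_lull = 11.0 + 3.0 * np.cos(2.0 * np.pi * t)  # windy at night, calm at midday
midday_lull += 11.0 - midday_lull.mean()            # re-center exactly at 11 m/s

days = np.vstack([rising, falling, midday_lull])

# Summary used when clustering on average wind level: one number per day.
daily_means = days.mean(axis=1)

# Largest pointwise gap between each pair of days, in m/s.
max_gap = np.abs(days[:, None, :] - days[None, :, :]).max(axis=2)

print("daily means (m/s):", np.round(daily_means, 3))
print("largest pointwise gaps (m/s):")
print(np.round(max_gap, 2))
```

Under these assumptions the three daily means coincide, so any method that clusters on average level alone would place the days together, even though each pair of curves differs by several m/s at some time of day.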

Methods for calculating the similarity or distance between two time series have been reviewed in several sources, for example Liao (2005). Many of these techniques, such as the Time Warp Edit Distance of Marteau (2009) or Dynamic Time Warping and its extensions described in Müller (2007), are designed to allow shifting or stretching behavior along the time axis. If we do not wish to allow time warping, perhaps because we are basing time-sensitive decisions on our results, we may approach the time series as high-dimensional data, with each time point as a dimension. The dimension of a discrete time series is then equal to its length: not as extreme as, for example, some text-matching datasets, but still high enough to require specialized methodology.

Much work has been done on the choice of distance functions in high-dimensional spaces. A typical and intuitive choice is to extend Euclidean spatial distance to higher dimensions using the $L_p$ norm: for two n-dimensional observations x and y, $L_p(x, y) = \left( \sum_{i=1}^{n} |x_i - y_i|^p \right)^{1/p}$. For example, p = 2 leads to the familiar RMSE (root mean squared error), up to a factor of $n^{-1/2}$. This approach, however, is sensitive to small differences in level between observations, emphasizing the mean behavior. For example, applying k-means clustering to the dataset yields remarkably uninformative groups, and has the additional disadvantage that the cluster centers, as pointwise means of all cluster members, are unrealistically smooth (see Figure 1, right panel). Indeed, Kusiak and Li (2010) generate clusters based on various parameters using k-means to create predictions of short-term power generation, and find that performing this clustering with wind speed does not lead to better models.

The $L_p$ norm is also dominated by observations’ behavior on dimensions, or at times, where differences are large; if there is heteroskedasticity across times, those times with higher variation will tend to contribute the most to the distance calculation. Kazor and Hering (2015) report this behavior in their work on identifying wind regimes. More generally, Beyer et al. (1999) describe the problem of loss of contrast as the number of dimensions grows: that is, the distance from an observation to its nearest neighbor becomes, relatively, about the same as the distance to its farthest neighbor. Aggarwal et al. (2001) demonstrate that the Euclidean norm is particularly subject to this problem. In a similar vein, the $L_p$ norm relies on absolute distance between elements of x and y, and may thus also be sensitive to skew in the distribution of observations; and, more broadly, it does not consider the observations in the context of the rest of the dataset. While we might attempt to adjust for heteroskedasticity and skew with transformations or dimensional weighting, these approaches would require expert choices of functions and parameters. To emphasize differences in shape, or to define the similarity of pairs of observations in a way that takes the rest of the dataset into account, a different approach is needed.

Note that these methods could be of use in other applications where the shape of time series or functional data is a concern, such as tracking meteorological data, growth curves, or the number of users of a system over time. Feng and Ryan (2013) explore selecting scenarios for the time series of future energy demand and fuel cost, in the context of generation expansion planning (GEP); while they use a multistage selection method instead of single-stage clustering, their method still relies on finding the pairwise distances between scenarios.

We turn to the methods of depth statistics, which express the centrality of observations relative to a dataset, without reference to external scales or parameters. In the one-dimensional case, the depth statistic corresponds to the median and quantiles of a dataset, with the median being the deepest point. Mosler (2013) provides an excellent overview of depth statistics and their extension to different types of data. For example, Liu (1990) gives a version, the simplicial depth, for the centrality of points in