Utilities Policy 13 (2005) 27–39
www.elsevier.com/locate/jup

Important variables in explaining real-time peak price in the independent power market of Ontario

Ismael E. Arciniegas Rueda a, Achla Marathe b,*

a Pacific Economics Group, LLC, 22 East Mifflin Street, Madison, WI 53703, USA
b Los Alamos National Laboratory, Computer and Computational Science (CCS-3), MS B265, Los Alamos, NM 87545, USA

Received 27 October 2003; received in revised form 23 March 2004; accepted 10 April 2004

Abstract

This paper uses a support vector machine (SVM) based learning algorithm to select the important variables that help explain the real-time peak electricity price in the Ontario market. The Ontario market was opened to competition only in May 2002. Due to the limited number of observations available, finding a set of variables that can explain the independent power market of Ontario (IMO) real-time peak price is a significant challenge for traders and analysts. Kernel regressions of the explanatory variables on the IMO real-time average peak price show that non-linear dependencies exist between the explanatory variables and the IMO price. This non-linear relationship, combined with the low variable-observation ratio, rules out conventional statistical analysis. Hence, we use an alternative machine learning technique to find the important explanatory variables for the IMO real-time average peak price. Results based on SVM sensitivity analysis show that the IMO's predispatch average peak price, the actual import peak volume, the peak load of the Ontario market and the net available supply after accounting for load (energy excess) are some of the most important variables in explaining the real-time average peak price in the Ontario electricity market.

© 2004 Elsevier Ltd. All rights reserved.

Keywords: Support vector machines; Real-time peak price; Ontario electricity market; Sensitivity analysis

1. Introduction

Ontario's electrical power system is one of the largest in North America, serving the power needs of more than 12 million people. In line with the liberalization of electricity markets in North America, Ontario's government undertook a program to deregulate its electricity industry in May 2002. Under the new market design, energy prices in the Ontario market are allowed to reflect actual supply and demand conditions. The competitive prices are expected to provide appropriate signals for supply additions and voluntary demand-side responses. Ontario has several kinds of electricity markets: (i) Financial markets – these markets are meant for hedging price and financial risk. They

* Corresponding author. Tel.: +1-505-667-9034; fax: +1-505-667-1126. E-mail addresses: [email protected] (I.E. Arciniegas Rueda), [email protected] (A. Marathe).
0957-1787/$ - see front matter © 2004 Elsevier Ltd. All rights reserved. doi:10.1016/j.jup.2004.04.006

do not affect the actual delivery and use of electricity. (ii) Real-time markets – the real-time markets for energy and operating reserve form the core of the independent power market of Ontario (IMO) administered markets. They do affect the actual delivery and use of electricity. (iii) Procurement markets – these markets enable the IMO to ensure the reliability of Ontario's power system by acquiring services such as black-start capability.1 (iv) Physical forward markets – these markets are also used for hedging risk. The IMO market is an important part of the northeastern power markets and has interties with the NY, PJM, and ECAR regions (e.g., Cinergy). Traders play across these regional markets to take advantage of arbitrage opportunities. However, there are important differences in design and maturity between the

1 Black-start capability is a generator's ability to help restore the province's power system without relying on an external supply of electricity.


IMO and the other wholesale power markets of the surrounding regions. For instance, the PJM and NY markets have a two-settlement system (day-ahead and real-time), whereas Ontario has a single-settlement system. Differences in design can have important effects on the efficiency of a market. For example, the study in Arciniegas Rueda et al. (2003) shows that markets with two-settlement systems are more efficient than markets with a single-settlement system.

In this paper, the goal is to find the most important variables, from a list of potential explanatory variables, for explaining the IMO real-time average peak price. Our analysis focuses on weekdays; weekends are not usually seen as interesting periods by traders.

In the newly liberalized market, the IMO's average peak price2 is an important variable for the market participants. The real-time average peak price helps generators, loads, distributors, retailers and financial traders decide how to participate in the IMO. There are several financial products at the IMO, traded in the over-the-counter (OTC) market, that settle against the average peak price. The peak price is also useful in determining whether traders should play only in the financial market and ride the real-time prices; whether there are speculative positions that can be taken in the OTC market; whether cash traders could take advantage of arbitrage opportunities across neighboring markets, etc. In NY, PJM and Cinergy, there are products traded on the Intercontinental Exchange (ICE) that settle against the average spot peak prices. The competitive real-time average peak price also drives the value of forward contracts, options, generation assets, transaction decisions, etc. A greater understanding of the real-time price would result in more informed bids and help companies and traders manage risk more effectively.
Better models for forecasting the average real-time price can help refine transaction decisions and lead to more accurate pricing of forward and financial contracts, better risk management and higher market efficiency. Traditional methods of variable selection, based on the time series approach, require several years of historical data. Given that the IMO market opened only in May 2002, meaningful time series analysis is a significant challenge in this market. During the data period considered here, the IMO was also characterized by regulatory uncertainty caused by a price cap of 4.3 ¢/kWh imposed in November 2002 in the retail market. Classical algorithms based on linear regression techniques assume a linear relationship between the dependent and independent variables. However, the kernel regressions of the explanatory variables on the IMO price

2 In this paper, the peak time period is defined from 8 AM to 9 PM inclusive.

show that the relationship between these variables is often non-linear. Classical statistical methods for variable selection do not work well for problems characterized by non-linearity and a limited number of observations (Bowerman and O'Connell, 2001).3 Another major criticism of the classical variable selection methods is that the selected set of variables may work quite well within sample but perform poorly out of sample. These methods do not have a cross-validation step, which means that the selected variables could have low predictive power. The classical variable selection methods are usually based on regression models that try to minimize some measure of within-sample error using all the available observations.

To overcome the problems stated above, this work uses an algorithm based on support vector machines (SVM) sensitivity analysis (Kewley et al., 2000) to select the most significant variables. SVM is a flexible algorithm that can "learn" in relatively small data sets (low variable-observation ratio)4 and recognizes the existence of inherent non-linear relations among the variables. The term "learn" means that the knowledge acquired from the estimation set can be generalized to yet unseen observations (Herbrich et al., 1999). This work differs from past empirical research in the economics literature in at least two respects. First, a data-driven analysis is done to determine the relevance of a large number of potential variables in explaining the IMO real-time average price. Second, our approach evaluates a variable's relevance in explaining the peak price by comparing its sensitivity with the sensitivity of a random variable. The data used in this research come from the NY-ISO, PJM-ISO, IMO and Platts websites.5

2. Description of variables

The dependent variable in this study is the daily average peak price in the IMO power market. It is measured as the average price over the peak period of 8 AM to 9 PM on weekdays. Understanding the average peak price is useful for cash and term traders in the IMO, as there are several financial products that settle against the average peak price in Ontario. It also helps traders take advantage of the arbitrage opportunities across neighboring markets. Schedulers can use this knowledge in designing better bidding strategies at the IMO. Domain knowledge and theory (Stoft, 2002) suggest several variables that could explain the real-time prices.6 This paper uses a machine learning based algorithm to identify the most significant variables in explaining the IMO real-time peak price. In addition, it attempts to rank all suggested independent variables based on their explanatory power.

We analyze 36 potential explanatory variables that characterize the demand and supply conditions in the IMO. Weekday data were collected from May 1, 2002 to May 21, 2003; in total, our raw data have 270 observations. We do not use any a priori knowledge to bias the selection of variables; instead, our approach is entirely data driven.7 (See also Table 1.)

3 Examples of classical algorithms include backward elimination, forward selection, leaps and bounds, etc. See Christensen (1990) and Venables and Ripley (1994) for details on these algorithms.
4 As a rule of thumb, many experts suggest having at least 30 observations per variable for statistical inference purposes. Our raw data set has 270 observations and 36 potential explanatory variables, giving a variable-observation ratio of 7.5.
5 The web sites are: www.nyiso.com, www.pjm.com, www.theimo.com, www.platts.com.

1. Dawn index – this measures the daily gas price for delivery in Ontario (US$/MBTU) as reported by Platt's Gas Daily. Since gas-fired plants are the marginal plants and set the price in periods of high demand, this variable could be important.
2. Ontario peak load (MW) – this is measured as the average actual load during the on-peak period. Higher demand, ceteris paribus, leads to higher prices.
3. NY zone A and M real-time average peak price (US$/MWh) – New York is divided into 11 zones. These zones correspond to 10 major interties that have the potential for congestion. Each zone has an associated zonal price. NY zones A and M are associated with the interfaces with Ontario. Power flows between Ontario and NY depend on the size of the price spreads between these regional markets. The flows continue until no more arbitrage opportunities exist between the two markets. Real-time NY zone A and M prices are obtained from the NY-ISO (Independent System Operator) web site.
4. Forecast supply energy (MWh) – this is the forecasted amount of energy available from the generation sources in Ontario, averaged over the peak period, as given in the SSR report of the IMO.
It is an important variable since it sets up expectations about the amount of energy available to the IMO system. A high forecast of energy supply is associated with low prices.

6 Domain knowledge was obtained after talking extensively with traders in the IMO market.
7 Details are shown for only 27 variables because some of the variables are self-explanatory, such as summer, spring and fall. Also, some variables are represented jointly, such as NY zone A and M prices, while others are shown in summed form; for example, outages total is the sum of outages east and outages west. All variables here correspond to averages during the on-peak period unless specified otherwise.

Table 1
Potential explanatory variables

No. | Variable
1 | Dawn index
2 | Ontario peak load
3 | NY zone A peak price
4 | NY zone M peak price
5 | Forecast supply energy
6 | Capacity
7 | Intermittent energy
8 | Self-scheduling energy
9 | Energy limited energy
10 | Energy limited capacity
11 | Intermittent capacity
12 | Self-scheduled capacity
13 | Outages east
14 | Outages west
15 | Outages total
16 | Energy excess
17 | Capacity excess
18 | Summer
19 | Ontario temperature
20 | Heat index
21 | Humidity
22 | Sun exposure
23 | Wind speed
24 | Ontario air quality
25 | Import actual peak
26 | Import scheduled peak
27 | TTC out peak
28 | TTC in peak
29 | Cinergy price
30 | Predispatch price
31 | Forecast demand
32 | Dispatchable load
33 | Fall
34 | Spring
35 | Operational reserves
36 | Wind chill index

5. Capacity – this is the net amount of generation capacity in service in Ontario, averaged over the peak period, as reported in the SSR.
6. Inflexible generation energy (MWh) – inflexible generators are non-dispatchable and therefore are price takers. Inflexible generation is part of the base supply curve and therefore cannot set the marginal price. It includes all self-scheduled and intermittent energy. The data come from the SSR report of the IMO. The higher the portion of inflexible generation in the supply stack, the lower the expected price.
7. Energy limited energy (MWh) – this measures the energy available from energy limited facilities, averaged over the peak period. An energy limited facility is a generation resource that is unable to supply energy equal to its capacity for each hour of the day (e.g., a hydro-electric facility with limited water in the forebay does not produce energy at its rated output for each of the 24 hours in a day). The data source is the SSR report.


8. Energy limited capacity (MW) – this measures the nominal capacity of those facilities that are energy limited. On any day, the list of facilities that may be energy limited can change.
9. Inflexible generation capacity (MW) – this is the sum of self-scheduled capacity and intermittent capacity.
10. Total outages (MW) – this is the sum of total MW of outages across all fuel sources. Since outages affect the supply function, they have the potential to affect the IMO price.
11. Energy excess (MWh) – this is a measure of energy adequacy. It is calculated as the average, over the on-peak period, of the sum of generating capacity in service, estimated imports, dispatchable load, and energy limited resource energy, minus demand forecast, outages, and capacity of energy limited resources. If energy excess is less than 1, there is a shortage of energy. Data are provided by the SSR report.
12. Capacity excess (MW) – this is a measure of capacity adequacy. It is calculated as the average, over the on-peak period, of the sum of generating capacity in service, estimated imports, and dispatchable load, minus demand forecast, outages, generation reserve holdback and intra-hour margin. If capacity excess is less than 1, there is a shortage. Data are provided by the SSR report.
13. Temperature (°C) – this variable affects price through its effect on the demand for power. Extremely low and high temperatures lead to increased load. In addition, temperature can affect congestion, since transmission lines have thermal limits.
14. Heat index – this is a measure of how hot it really feels. Demand may be more sensitive to changes in the heat index than to temperature. A high heat index is commonly associated with high load and high prices.
15. Humidity – humidity affects price through its impact on demand. It may also have an impact on supply: more moisture in the air can make the combustion process in power plants less efficient, lowering their output.
16. Sun exposure – the level of cloudiness affects the production of solar power and hence the heat index, temperature and humidity.
17. Wind speed – like other weather-related variables, wind speed affects demand and generators' efficiency.
18. Air quality – just like sun exposure, air quality can affect demand and generators' efficiency.
19. IMO import actual peak (MW) – this is the actual value of imports in the peak period. Imports are called upon when the available supply is not enough to meet demand.

20. IMO import scheduled peak (MW) – this is the projected value of imports from other control areas.
21. Available transfer capability (ATC) in/out of Ontario – ATC is a measure of the transfer capability remaining in the physical transmission network for further commercial activity. The higher the value of ATC, the lower the physical restrictions and therefore the lower the likelihood of congestion. Congestion affects prices since it leads to non-economic dispatch of generators.
22. Cinergy price – this is the price of electricity at the Cinergy hub (US$/MWh). It is an approximation of the price in the ECAR region. High prices in the ECAR region may increase exports out of Ontario to ECAR, resulting in higher prices in Ontario.
23. Predispatch price (US$/MWh) – this is the projection of the IMO peak price at 8 AM. The predispatch price is not the settlement price. It is calculated from an algorithm that uses projections of price fundamentals as inputs and maximizes the expected economic gains of the market participants. The predispatch price is calculated by the IMO and updated hourly.
24. Forecast demand (MW) – this is the value of demand forecasted by the IMO. It can affect the real-time price through its effect on market expectations.
25. Dispatchable load (MW) – this represents the elastic part of the demand curve and is sensitive to price. The more elastic the demand curve, the more sensitive demand is to changes in price.
26. Operational reserves (MW) – this variable can affect prices because operational reserves (spinning plus non-spinning) are used to smooth price spikes caused by sudden trips of generation plants.
27. Wind chill index – this is defined by the National Weather Service as $35.74 + 0.6215T - 35.75V^{0.16} + 0.4275TV^{0.16}$, where V is the wind velocity in miles per hour and T is the temperature in degrees Fahrenheit.
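The wind chill definition above is simple to evaluate directly. A minimal sketch (the function name is ours, not from the paper) implementing the NWS formula:

```python
def wind_chill_index(temp_f, wind_mph):
    """NWS wind chill; temp_f in degrees Fahrenheit, wind_mph in miles per hour."""
    v = wind_mph ** 0.16
    return 35.74 + 0.6215 * temp_f - 35.75 * v + 0.4275 * temp_f * v

# 20 F air with a 15 mph wind feels like roughly 6 F.
print(round(wind_chill_index(20.0, 15.0), 1))
```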

2.1. Objective selection of variables

In this step, we identify pairs of variables that are highly correlated or contain the same information. Two variables are said to contain the same information if the absolute value of their correlation is greater than or equal to 0.95 (Wessel et al., 1998). Table 2 identifies all pairs of variables with a correlation of 0.70 or higher. The first six pairs have correlations of 0.95 or higher; from each of these six pairs, only one variable was retained for the analysis. The variables that were dropped are: wind chill, average import scheduled peak, average TTC in peak, dispatchable load and forecast demand. Table 3 provides the summary statistics of all the variables used.

Variables that were a linear combination of other variables were also dropped since the information

they contain can be retrieved from the other variables. "Outages east" and "outages west" were dropped from the analysis because "outages total" represents their sum.

Table 2
Identification of highly correlated variables

Variable 1 | Variable 2 | Correlation
Ontario temperature | Wind chill index | 0.98
Average import actual peak | Average import scheduled peak | 1.00
Average TTC out peak | Average TTC in peak | 0.97
Forecast demand | Ontario peak load | 0.97
Ontario peak load | Dispatchable load | 0.99
Dispatchable load | Forecast demand | 0.98
Capacity excess | Energy excess | 0.90
NY zone A peak price | NY zone M peak price | 0.88
Ontario peak load | Forecast supply energy | 0.88
Ontario temperature | Heat index | 0.86
Outages total | Forecast supply energy | 0.82
Predispatch price | IMO average peak price | 0.78
Outages total | Dispatchable load | 0.74
Heat index | Ontario peak load | 0.74
Heat index | Dispatchable load | 0.74
Dawn index | Operational reserves | 0.73
Outages total | Ontario peak load | 0.72
Heat index | Forecast supply energy | 0.70

Table 3
Summary statistics of the variables

Variable | Minimum | Mean | Maximum | Standard deviation
Ontario peak load | 15,390 | 19,820 | 24,060 | 1728.42
Dispatchable load | 15,840 | 20,190 | 24,390 | 1774.70
IMO import actual peak | 3238 | 982 | 1068 | 773.79
Ontario temperature | 12.50 | 56.59 | 95.50 | 22.09
Heat index | 64 | 114.8 | 274 | 48.93
Energy limited energy | 2801 | 4230 | 6307 | 567.88
Energy excess | 2083 | 1254 | 5263 | 1087.35
Outages total | 3935 | 8592 | 12,590 | 1863.49
Dawn index | 2.68 | 4.70 | 21.93 | 2.08
NY zone A peak | 12.22 | 46.71 | 258.20 | 22.51
NY zone M peak | 45.98 | 48.12 | 285.70 | 26.82
Forecast supply energy | 16,720 | 21,090 | 25,160 | 1736.17
Capacity | 28,210 | 31,390 | 31,660 | 374.99
Inflexible generation energy | 701 | 1258 | 1542 | 99.38
Energy limited capacity | 5021 | 6606 | 7676 | 363.35
Humidity | 40 | 68.47 | 92 | 9.73
Sun exposure | 0 | 36.91 | 100 | 28.65
Average wind | 3 | 10.13 | 23 | 3.83
Air quality | 3 | 19.8 | 60 | 8.66
Average TTC out peak | 0 | 5345 | 6161 | 1047.95
Cinergy | 14.02 | 35.18 | 128.50 | 15.00
Predispatch | 27.97 | 111 | 1080 | 142.14
Operational reserves | 65 | 101.7 | 190 | 51.47
Capacity excess | 1122 | 2103 | 5933 | 1160.39
IMO average peak | 25.81 | 72.18 | 259.60 | 38.17
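A correlation screen like the one behind Table 2 can be reproduced in a few lines of pandas. This is only a sketch with made-up column names and synthetic data, not the authors' code; it drops one member of every pair whose absolute correlation reaches the 0.95 threshold used above:

```python
import numpy as np
import pandas as pd

def drop_redundant(df: pd.DataFrame, threshold: float = 0.95) -> pd.DataFrame:
    """Drop one variable from every pair whose absolute correlation
    meets the threshold, keeping the first-listed column."""
    corr = df.corr().abs()
    # Keep the strict upper triangle so each pair is examined once.
    upper = corr.where(np.triu(np.ones(corr.shape, dtype=bool), k=1))
    to_drop = [col for col in upper.columns if (upper[col] >= threshold).any()]
    return df.drop(columns=to_drop)

# Toy illustration with a near-duplicate column (hypothetical data):
rng = np.random.default_rng(0)
temp = rng.normal(15, 8, 200)
df = pd.DataFrame({
    "ontario_temperature": temp,
    "wind_chill_index": temp * 1.1 - 2 + rng.normal(0, 0.1, 200),  # corr ~ 1.0
    "dawn_index": rng.normal(4.7, 2.0, 200),
})
print(drop_redundant(df).columns.tolist())
```

The near-duplicate wind chill column is removed, mirroring the paper's choice to keep temperature and drop wind chill.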

2.2. Exploratory analysis of explanatory variables

In this section, we analyze each variable in detail. We first identify and remove outliers from each variable, since unjustified outliers can significantly distort the explanatory power of the variables. Some outliers may be justified, but they should be checked for transcription errors. We use the method of spreads to identify observations that appear to differ from the bulk of the data. Each of these observations is then checked against the fundamentals to see whether it is justified. If the outliers result from changes in the fundamentals, e.g. an unusually high price associated with an unusually hot day, they are included in the analysis; otherwise they are dropped (Hoaglin et al., 1983).8

After the outlier analysis, we characterize each variable's distribution and identify possible transformations that may increase the variable's symmetry.9 Table 4 shows the transformations performed on each variable. Each variable is also tested for non-stationarity using the Augmented Dickey–Fuller (ADF) unit root test; details on stationarity tests applied to power price data can be found in Arciniegas Rueda et al. (2003). Table 5 shows the ADF test results.

8 The following variables contained outliers: IMO peak price, predispatch price, NY zone A peak price, Cinergy, inflexible generation energy, energy excess, heat index, air quality, capacity and energy limited capacity. Our original data had 270 observations; altogether, 70 observations were identified as outliers across variables and removed from the analysis.
9 For purposes of statistical analysis, symmetry is a desirable property of a variable's distribution (Hoaglin et al., 1983).

Non-parametric methods are used to explore the association of each explanatory variable with the seasonally adjusted IMO price; see Allen and Fildes (2001), DiNardo and Tobias (2001), Hoaglin et al. (1983) and Banerjee et al. (1997). The dependent variable is seasonally adjusted so that any association found between two variables cannot be attributed to a shared seasonal component (Chatfield, 1996). For every explanatory variable selected, a kernel regression was performed to determine whether the association between that variable and the real-time price is linear or non-linear. Our results show that almost all explanatory variables have a non-linear relationship with the real-time IMO price. It is difficult to judge from the kernel regression the specific form of some of the non-linearities. For instance, Figs. 1 and 2 show the kernel regressions of the IMO peak price versus two other variables, i.e. inflexible generation energy and the dawn

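The outlier screening of Section 2.2 relies on resistant fences built from the spread of the data, in the spirit of Hoaglin et al. (1983). A minimal sketch, with made-up prices, flags points lying outside the fourth-spread (interquartile) fences:

```python
import numpy as np

def spread_fences(x, k=1.5):
    """Flag points outside the fourth-spread (interquartile) fences."""
    x = np.asarray(x, dtype=float)
    q1, q3 = np.percentile(x, [25, 75])
    lo, hi = q1 - k * (q3 - q1), q3 + k * (q3 - q1)
    return (x < lo) | (x > hi)

# Made-up daily peak prices with one spike-like observation.
prices = np.array([30.0, 32.5, 28.4, 31.1, 29.8, 33.0, 30.6, 259.6])
print(prices[spread_fences(prices)])
```

Flagged observations would then be checked against the fundamentals, as the paper describes, before being dropped.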

Table 4
Summary of transformations applied to variables

No. | Variable | Transformation
1 | Average peak price | Square root
2 | Dawn index | Logarithmic
3 | Ontario peak load | None
4 | NY A peak price average | Square root
5 | NY M peak price average | None
6 | Forecast supply energy | None
7 | Capacity | None
8 | Inflexible generation energy a | None
9 | Energy limited energy | None
10 | Energy limited capacity | None
11 | Outages total | None
12 | Energy excess | None
13 | Capacity excess | None
14 | Summer | None
15 | Ontario temperature | None
16 | Heat index | Logarithmic
17 | Humidity | None
18 | Sun exposure | Square root
19 | Average wind speed | Square root
20 | Ontario air quality | Square root
21 | Average import actual peak | None
22 | Average TTC out peak | Cubic
23 | Cinergy price | Logarithmic
24 | Predispatch price | Square root
25 | Fall | None
26 | Spring | None
27 | Operational reserves | None
28 | Inflexible generation capacity b | None

a This is the sum of "intermittent generation" and "self-generation".
b This is the sum of "intermittent capacity" and "self-capacity".
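The transformations in Table 4 are chosen to improve symmetry. A quick way to compare candidate transformations is to look at sample skewness before and after; the sketch below uses synthetic right-skewed data, not the paper's series:

```python
import numpy as np
from scipy.stats import skew

rng = np.random.default_rng(1)
price = rng.lognormal(mean=3.5, sigma=0.6, size=250)  # right-skewed, spike-prone

# A successful transformation pulls skewness toward zero.
for name, f in [("none", lambda x: x), ("square root", np.sqrt), ("logarithmic", np.log)]:
    print(f"{name:12s} skewness = {skew(f(price)):+.2f}")
```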

index, which are believed to be important in explaining the prices. Both graphs show a non-linear association with the IMO average peak price. The correlations of these variables with the IMO peak price are 0.11 and 0.16, respectively.10
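The kernel regressions reported here can be approximated with a simple Nadaraya–Watson estimator. The following sketch (synthetic data; the bandwidth and the functional form are our assumptions) recovers a non-linear load–price association:

```python
import numpy as np

def kernel_regression(x_train, y_train, x_eval, bandwidth):
    """Nadaraya-Watson estimator with a Gaussian kernel."""
    x_train = np.asarray(x_train, dtype=float)
    y_train = np.asarray(y_train, dtype=float)
    fitted = np.empty(len(x_eval))
    for j, x0 in enumerate(x_eval):
        weights = np.exp(-0.5 * ((x_train - x0) / bandwidth) ** 2)
        fitted[j] = np.sum(weights * y_train) / np.sum(weights)
    return fitted

# Hypothetical non-linear relation: price rises sharply at high load.
rng = np.random.default_rng(2)
load = rng.uniform(15000, 24000, 250)
price = 30 + 5e-7 * (load - 15000) ** 2 + rng.normal(0, 5, 250)

grid = np.linspace(16000, 23000, 8)
print(np.round(kernel_regression(load, price, grid, bandwidth=800), 1))
```

The fitted curve bends upward at high load, the kind of shape a linear specification would miss.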

Table 5
Summary of stationarity testing – Augmented Dickey–Fuller (ADF) test results. The null hypothesis is that each series has a unit root. Phillips–Perron tests yielded the same results.

No. | Variable a | P-value
1 | Average peak price | 0.000
2 | Dawn index | 0.094
3 | Ontario peak load | 0.000
4 | NY A peak price average | 0.000
5 | NY M peak price average | 0.000
6 | Forecast supply energy | 0.015
7 | Capacity | 0.000
8 | Inflexible generation energy | 0.027
9 | Energy limited energy | 0.037
10 | Energy limited capacity | 0.001
11 | Outages total | 0.006
12 | Energy excess | 0.000
13 | Capacity excess | 0.000
14 | Ontario temperature | 0.005
15 | Heat index | 0.000
16 | Humidity | 0.000
17 | Sun exposure | 0.000
18 | Average wind speed | 0.000
19 | Ontario air quality | 0.000
20 | Average import actual peak | 0.000
21 | Average TTC out peak | 0.400 b
22 | Cinergy price | 0.001
23 | Predispatch price | 0.000
24 | Operational reserves | 0.600 b
25 | Inflexible generation energy | 0.300 b

a Fall, Spring and Summer are dummy variables; therefore no ADF test is done for them.
b Variable is integrated of order one.

3. Statistical machine learning for variable selection

The following discussion motivates why a statistical machine learning algorithm is more appropriate for variable selection than conventional statistical methods, given the characteristics of the problem: non-linearity and a low variable-observation ratio. It starts with a brief introduction to the concept of statistical machine learning and the SVM with sensitivity analysis algorithm (SVMSA). Next, it describes why the SVM algorithm is a better tool for selecting variables than the classical statistical models.

10 The IMO peak price in the graphs is seasonally adjusted and transformed by a square root to improve symmetry. The kernel regression results for the other variables are available from the authors upon request.

Fig. 1. Kernel regression of IMO average peak price and inflexible generation energy.
Fig. 2. Kernel regression of IMO average peak price and dawn index.

3.1. Statistical machine learning

Let a vector of inputs x ∈ R^n be drawn by a data generation process G from a fixed but unknown probability distribution F(x). Assume that the input vector x undergoes an economic process S, which returns a real value y (the IMO price) for every input x according to a conditional distribution function F(y|x), which is fixed but unknown. Suppose one has an estimation (training) set, a collection of pairs (x, y). Also assume there is an algorithm (learning machine) capable of implementing a set of functions f(x; a), a ∈ Λ, where a need not be a vector of parameters; it can be any abstract coefficient. Λ can be any set of functions and is defined a priori. The objective of the learning machine is to choose, from the given set of functions f(x; a), a ∈ Λ, a regression function that best approximates the response of S for any input x (see Fig. 3). The selection of the regression function by the algorithm is based on the collection of pairs (x, y) in the estimation set (Vapnik, 1995).

Fig. 3. Statistical machine learning algorithm.

The ideal machine would select a regression function f(x; a0), a0 ∈ Λ (over the class of functions f(x; a), a ∈ Λ) that minimizes the expected risk function R(a):

$$R(a) = \int L(y, f(x; a))\, dF(x, y) \qquad (1)$$

where L(·) is the discrepancy between the observed response y and the response given by the learning machine. Given that F(x, y) is unknown and the only information available is in the estimation set, R(a) in Eq. (1) is replaced by the empirical risk function R_emp(a), constructed from the estimation set (the empirical risk minimization inductive principle, ERM):

$$R_{emp}(a) = \frac{1}{n}\sum_{i=1}^{n} L(y_i, f(x_i; a)) \qquad (2)$$

Solely minimizing Eq. (2) can lead to overfitting, which means that the selected regression function fits the estimation set well but has low predictive power on out-of-sample observations. It has been proved that R(a) has an upper bound, holding with probability (1 − δ), given by the sum of two terms (Vapnik and Chervonenkis, 1971):

$$R(a) \le R_{emp}(a) + [\text{capacity term}] \qquad (3)$$

where the capacity term characterizes the complexity of the set of functions f(x; a), a ∈ Λ, implemented by the algorithm, in relation to the number of observations in the estimation set. For details on the concept of complexity, see Vapnik (1995). From Eq. (3), one can see that applying ERM over a set of functions with high complexity may lead to a low R_emp(a) but a high bound on R(a). The goal of the machine learning algorithm is to find the regression function that minimizes the upper bound of R(a), given by the right-hand side of Eq. (3), using the structural risk minimization inductive principle (SRM) (Vapnik, 1995). SRM consists of a trade-off between the quality of the approximation computed from the given estimation set (R_emp) and the capacity term of the approximating function. SRM defines nested subsets of functions Q_k = {f(x; a), a ∈ Λ_k} with capacity terms h_k, where Q_1 ⊂ Q_2 ⊂ … ⊂ Q_n and h_1 ≤ h_2 ≤ … ≤ h_n, and applies ERM to select the function f(x; a_k) that minimizes R_emp(a) within each subset Q_k. Next, from the selected set of functions, SRM chooses the regression function associated with the lowest capacity term.

3.2. Support vector machines

SVM is commonly associated with the idea of SRM. It is a machine learning algorithm that fixes at ε (defined a priori) the maximum allowed value of R_emp(a), and searches within each subset Q_k for the function with the lowest capacity that satisfies R_emp(a) ≤ ε. SVM for regression is characterized by the following:

1. SVM implements a nested subset of functions Q_k of the form:

$$f(x) = \sum_{i=1}^{n} \alpha_i k(x_i, x) + b \qquad (4)$$

where x_i is a vector of inputs from the estimation set, α_i is a constant associated with x_i, k(x_i, x) are kernel functions defined a priori, n is the number of

34

I.E. Arciniegas Rueda, A. Marathe / Utilities Policy 13 (2005) 27e39

observations, and b is the bias. The power of SVM comes from the kernel that allows the transformation of a non-linear problem to a linear one by means of a non-linear mapping of the input space into a higher feature space. 2. As shown in Eq. (2), the Remp(a) depends on the loss function L(). SVM for regression problems uses L() the 3-insensitive loss function given by: ( ) 3 if Ky  fðx;aÞK%3 Ky  fðx;aÞK3 ¼ : Ky  fðx  aÞK otherwise The above equation states that if the difference between the actual value of y and the predicted value by the machine is less than 3, then the loss is set equal to 3, otherwise it is equal to the actual difference. 3. SVM implements the SRM by solving the following quadratic optimization problem: maximize :

 12

n X

ðai  a)j Þðaj  a)j Þkðxi ;xj Þ

i;j¼1

3

n X 

n  X   ai Ca)i C yi ai  a)i

i¼1

subject to :

n X

i¼1

ðai  a)i Þ ¼ 0

i¼1

ai ;a)i ˛½0;C

ð5Þ

where a)i and ai, are the coefficients for i ¼ 1.n that maximize Eq. (5). 3 is the predetermined maximum allowed deviation between the actual value of y and the estimated one. C is a predetermined trade-off term that says how much deviations (x) from 3 are tolerated to reduce the complexity of the approximating functions. The first term in Eq. (5) stands for the complexity of the approximating functions given in Eq. (4). The remaining terms stand for the quality of the approximation. The solution of Eq. (5) requires the following a priori knowledge: the kernel function and its associated parameters, the trade-off constant C, and the value of 3. Later, we will briefly discuss one of the available algorithms, i.e. pattern search to find the optimal predetermined parameters. The solution of Eq. (5) is a vector of coefficients a)0 and a0 which provide the regression function that satisfies Remp ðaÞ%3Cx with the lowest complexity within the predetermined subset of functions defined by Eq. (4). 4. The quadratic optimization problem Eq. (5) is solved not in the original input space but in a 2n dimensional real Euclidian space F. The dimension of this space depends only on the number of observations and not on the number of variables.
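As a concrete illustration of item 2, the ε-insensitive loss can be written directly. This is a generic numpy sketch (the function name and values are illustrative), not the SVMTorch code used in the paper:

```python
import numpy as np

def eps_insensitive_loss(y, y_hat, eps):
    """Vapnik's epsilon-insensitive loss: zero inside the eps-tube,
    and the excess deviation |y - y_hat| - eps outside it."""
    d = np.abs(np.asarray(y, float) - np.asarray(y_hat, float))
    return np.maximum(0.0, d - eps)

# Deviations of 0.05, 0.50 and 2.00 against a tube of width eps = 0.1:
# the first is inside the tube (zero loss), the others cost 0.4 and 1.9.
print(eps_insensitive_loss([1.0, 2.0, 3.0], [1.05, 2.5, 1.0], eps=0.1))
```

Deviations inside the tube are free, which is what lets many training points drop out of the solution and leaves only the support vectors with nonzero α.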

As in classical statistics, SVM requires some prior knowledge (the kernel parameters, C, and ε). In classical statistics, the prior knowledge comes from assumptions about F(x; y). In practice, the prior knowledge required by SVM is less restrictive than the assumptions required by traditional statistics (Herbrich et al., 1999). SVM is a powerful non-linear algorithm for the regression analysis of the IMO real-time price because there is evidence of non-normality in the data; SVM allows different distributions for the additive error η in the regression function. Also, there is evidence of non-linear associations between the explanatory variables and the dependent variable. For a similar discussion of the application of SVM to currency crises, see the companion paper (Arciniegas Rueda et al., 2004).

3.3. Machine learning sensitivity analysis

In a typical linear regression model estimated by least squares, the sensitivity of each variable x_i is given by its coefficient, which expresses how much the predicted output would change if x_i changed by one unit, ceteris paribus (all other variables held constant). Likewise, in machine learning models, which are algorithms rather than explicit equations, the sensitivity of each variable x_i can be defined as the absolute maximum change in the SVM's output prediction (maximum minus minimum value) when x_i is varied over its allowable range and all other descriptive variables are kept constant at their mean/median value. This measurement will be referred to as Δŷ/Δx_i. In classical regression analysis, the variable's sensitivity, β̂_i, is usually computed using all the observations in the sample; given some prior knowledge (e.g., normality and homoscedasticity), t-statistics are then computed to determine whether x_i is relevant or not.

In statistical machine learning, the variable's sensitivity Δŷ/Δx_i is not computed over the complete sample; instead, it is computed over a split of the sample called the estimation set (training set), while the remaining sample, called the validation set, is used to measure the predictive power of the statistical machine learning regression. As opposed to a t-test on β, in machine learning a different approach has to be used to decide whether a variable x_i is relevant. Under machine learning, one is interested in keeping the variables that display high sensitivity. In sensitivity analysis, a variable's significance is decided by adding a random variable RV to the original data set; a random variable's effect on the output is expected to be insignificant. A variable x_i is called significant if its sensitivity value Δŷ/Δx_i is higher than the sensitivity value Δŷ/ΔRV of the random variable. This comparison could be done with a t-test as in classical regression; however, it was found


that the random variables in our data set were often skewed. Due to this skewness of Δŷ/Δx_i and Δŷ/ΔRV, standard statistical tests could not be performed. Here, a variable x_i is determined insignificant if its sensitivity measurement is below the sensitivity measurement of the random variable that was added to the original data set. Values of Δŷ/Δx_i and Δŷ/ΔRV are computed for several estimation sets randomly drawn from the data set. This criterion is expressed by

   mean(Δŷ/Δx_i) − c σ(Δŷ/Δx_i) > mean(Δŷ/ΔRV) + c σ(Δŷ/ΔRV)              (6)

where c is a scalar, σ stands for the standard deviation, mean(Δŷ/Δx_i) is the average sensitivity of the variable x_i, and mean(Δŷ/ΔRV) is the average sensitivity of the random variable. The variable selection process described above is implemented in an iterative mode. In each iteration, all variables with sensitivity below the sensitivity of the random variable are dropped, and the remaining set of variables is used for the next iteration. The process is halted when no more variables can be dropped. The predictive power of the remaining set of variables in each iteration is computed for monitoring purposes (Arciniegas Rueda et al., 2004).

3.4. SVM regression implementation of IMO price

Given that the variables in the data are measured in different units, the variables are scaled by variance (Mahalanobis scaling). Several types of kernels (e.g., linear, polynomial, and radial basis functions (RBF)) could be used in Eq. (4). The kernel used here implements an RBF with parameter σ, as shown in Eq. (7); it is the most commonly used kernel in the SVM literature. SVMTorch is used as a subroutine to solve problem (5); see Collobert and Bengio (2001).

   k(x_i, x_j) = exp(−‖x_i − x_j‖² / (2σ²))                                (7)

The solution of (5) requires the following a priori knowledge: the kernel parameter σ, the trade-off constant C, and the value of ε. These parameters are computed using a pattern search (PS) algorithm developed by Momma and Bennett at the DDASSL group (Momma and Bennett, 2001). PS is a direct search method that, instead of using derivatives, uses direct function evaluations to find the parameters that maximize the predictive power of the SVM. The algorithm starts by defining a wide 3-D search space for the parameters (C, σ, ε). Next, it randomly chooses a point (C_1, σ_1, ε_1) in the predefined search space. Then, it solves (5) with leave-one-out (LOO) cross-validation: for every point x_i in the estimation set, (5) is solved with SVMTorch leaving out the ith observation, and the resulting SVM is used to predict the value y_i corresponding to the left-out sample x_i. The predictive capability of a given set of parameters (C_1, σ_1, ε_1) is evaluated using Q², defined as:

   Q² = Σ_{i=1}^{n} (ŷ_i − y_i)² / Σ_{i=1}^{n} (ȳ − y_i)²                   (8)

where ŷ_i is the predicted value for the ith left-out sample, y_i is its actual value, n is the number of observations in the estimation set, and ȳ is the average value of y in the estimation set. The computed value Q²_1 and the 3-D point (C_1, σ_1, ε_1) are the starting values of the PS algorithm. Next, the algorithm samples points (C_j, σ_j, ε_j) at a given radius Δ from the starting point (C_1, σ_1, ε_1). For each sampled point, a Q²_j is computed following the same procedure used for Q²_1. If one of the sampled points has a smaller Q², it defines a new search center, and the process iterates. If all the points at radius Δ from the current center have a larger Q², the radius Δ is halved. The search continues until Δ is small enough to ensure convergence (see Fig. 4). The final outcome of the PS method is the set of parameters (C*, σ*, ε*) that is optimal in the minimum-Q² sense. Finally, with this best set of parameters, SVMTorch (Collobert and Bengio, 2001) solves Eq. (5) for the whole estimation set, providing the best regression function.

Fig. 4. Pattern search.

3.5. SVM sensitivity analysis implementation

The SVM sensitivity analysis is implemented in six stages. The first stage extends the original data set with a random variable RV to gauge the sensitivities of the potential explanatory variables. The random variable can come from any distribution. In this paper, a normal

distribution was used for the random variable. The last five stages are implemented in an iterative mode. The second stage randomly splits the extended data set into 100 estimation and validation sets; each split consists of an estimation set of 180 observations and a validation set of 20 observations.11 The third stage estimates an SVM regression of the IMO real-time average peak price on the estimation set; the regression is used to predict the price for each observation in the validation set. The fourth stage performs the sensitivity analysis. Stages three and four are carried out for each split. At the beginning of the fifth stage, an aggregated measure of the predictive power of the current subset of variables is estimated, for monitoring purposes. The predictive power is estimated using either Q² (Eq. (8)) or the root mean squared error (RMSE):

   RMSE = sqrt( Σ_{i=1}^{N} (y_i − ŷ_i)² / N )                              (9)

where N = 2000 is the number of observations in the validation set (20) times the number of random splits (100); y_i is the actual value and ŷ_i is the predicted value for the ith sample. The fifth stage continues with the computation of an aggregated (i.e., mean) sensitivity measurement for each explanatory variable, including the random variable; this measurement is calculated from the results of the sensitivity analyses in stage four. In the sixth stage, variables with aggregated sensitivity below the aggregated sensitivity of the RV are dropped. The remaining variables define a new set of explanatory variables for the next iteration (stage two). The iterative process ends either when no more variables can be dropped or when there is a decrease in the predictive power of the reduced subset. Sixty-six extended data sets are built by adding to the original data set a random variable drawn from a standard normal distribution, with a different seed used for each extended data set.

Therefore, the SVM sensitivity analysis (SVMSA) is performed 66 times. The SVMSA algorithm is implemented in StripMiner, a script-based shell scientific program that manages and integrates the execution of several machine learning and statistical methods, such as artificial neural networks (ANN), support vector machines (SVM), genetic algorithms (GA), and local learning (LL), as well as a statistical method called partial least squares regression (PLS).12

11 As a rule of thumb, the number of observations in the validation set should be at least 10% of the available observations.
12 StripMiner is written in tight classic C code (<15,000 lines) and runs on all platforms. Mark J. Embrechts, Fabio Arciniegas, and Muhsin Ozdemir at the DDASSL group developed StripMiner.
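The RBF kernel of Eq. (7) and the two predictive-power measures, Q² (Eq. (8)) and RMSE (Eq. (9)), are small enough to write out directly. This is a generic numpy sketch with illustrative function names, not the SVMTorch/StripMiner code the authors used:

```python
import numpy as np

def rbf_kernel(xi, xj, sigma):
    """Eq. (7): k(xi, xj) = exp(-||xi - xj||^2 / (2 sigma^2))."""
    diff = np.asarray(xi, float) - np.asarray(xj, float)
    return np.exp(-np.dot(diff, diff) / (2.0 * sigma ** 2))

def q2(y_true, y_pred):
    """Eq. (8): squared prediction error relative to the spread of y
    around the estimation-set mean; lower values are better."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    return np.sum((y_pred - y_true) ** 2) / np.sum((y_true.mean() - y_true) ** 2)

def rmse(y_true, y_pred):
    """Eq. (9): root mean squared error over the pooled validation sets."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    return np.sqrt(np.mean((y_true - y_pred) ** 2))

print(rbf_kernel([1.0, 0.0], [1.0, 0.0], sigma=0.5))  # identical points -> 1.0
print(q2([1.0, 2.0, 3.0], [2.0, 2.0, 2.0]))  # no better than the mean -> 1.0
```

Note that Q² = 1 corresponds to a model that predicts no better than the estimation-set mean, which is why the pattern search drives Q² downward.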

3.6. SVM sensitivity analysis results for IMO price

In general, each SVMSA consisted of the estimation of 500 SVM regressions distributed over five iterative cycles; therefore, 33,000 SVM regressions were computed for this paper (66 extended data sets times 500). Table 6 lists the variables that were found to be relevant in at least one of the 66 runs, ranked by the total number of hits, where the number of hits represents how many times a variable was found to be relevant. Table 6 also reports the relative explanatory power (REP) of each variable, based on the number of times it was found significant in explaining the IMO real-time average peak price. A variable with an REP of 100% (e.g., energy excess) was always found significant; a variable with an REP of 80% (e.g., inflexible generation energy) was not found significant 20% of the time. From Table 6, it can be seen that every variable was found significant at least twice. This is not surprising, since by the law of chance one could expect the random measurement of sensitivity to be high enough at least once for each variable for that variable to be selected. Fig. 5 shows the explanatory power of each variable following the rankings in Table 6. After 66 runs, a set of 66 variable subsets is obtained; therefore, a criterion is needed to select a unique variable subset. Fig. 5 shows a smooth variation in the number of total hits from "IMO predispatch average price" (variable No. 1) to "inflexible generation energy" (variable No. 12), and a clear gap of 16 percentage points in explanatory power between "inflexible generation energy" and "summer". This suggests that "inflexible generation energy" is a good cut-off point. As shown in Fig. 5, all variables ranked at or above this cut-off were found significant at least 80% of the time. Therefore, all variables above and including "inflexible generation energy" are selected as the set of relevant explanatory variables.
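The drop-everything-below-the-random-variable procedure can be caricatured in a few lines. Everything here is a stand-in: the sensitivity probe is a simple sweep of one variable with the others at their medians, and a least-squares fit replaces the SVM regression; the data and names are invented for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)

def sensitivity(predict, X, i, grid=21):
    """Max minus min of the model output while variable i sweeps its
    observed range and every other variable sits at its median."""
    probes = np.tile(np.median(X, axis=0), (grid, 1))
    probes[:, i] = np.linspace(X[:, i].min(), X[:, i].max(), grid)
    out = predict(probes)
    return out.max() - out.min()

# Toy data: y really depends on x0 and x1 only; "noise" is irrelevant.
X = rng.normal(size=(200, 3))
y = 3.0 * X[:, 0] + 1.0 * X[:, 1] + 0.1 * rng.normal(size=200)
names = ["x0", "x1", "noise"]

# Extend the data with a random variable RV, then repeatedly drop every
# variable whose sensitivity falls below that of RV.
while True:
    X_ext = np.column_stack([X, rng.normal(size=len(y))])
    beta, *_ = np.linalg.lstsq(X_ext, y, rcond=None)  # stand-in for the SVM fit
    predict = lambda Z: Z @ beta
    sens = [sensitivity(predict, X_ext, i) for i in range(X_ext.shape[1])]
    keep = [j for j in range(len(names)) if sens[j] > sens[-1]]
    if len(keep) == len(names):
        break  # nothing dropped: halt
    X, names = X[:, keep], [names[j] for j in keep]

print(names)  # x0 and x1 survive; "noise" is usually eliminated
```

The paper's version differs in the important details: the model is an SVM, sensitivities are averaged over 100 random splits, and the comparison against RV uses the mean-plus/minus-c-standard-deviations criterion of Eq. (6) rather than a raw comparison.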


Table 6
SVM sensitivity analysis results

Ranking  Variable                        Total hits  Explanatory power (%)
1        IMO predispatch average price   66          100
2        IMO average import actual       66          100
3        Energy excess                   66          100
4        Ontario peak load               66          100
5        NY zone M peak price            65          98
6        Outages total                   65          98
7        NY zone A peak price            64          97
8        Cinergy average peak price      64          97
9        Capacity excess                 62          94
10       Ontario temperature             60          91
11       Dawn index                      54          82
12       Inflexible generation energy    53          80
13       Summer                          42          64
14       Spring                          32          48
15       Average wind speed              23          35
16       Humidity                        14          21
17       Ontario air quality             12          18
18       Sun exposure                    13          20
19       Fall                            11          17
20       IMO average TTC out peak        10          15
21       Energy limited energy           11          17
22       Energy limited capacity         8           12
23       Heat index                      7           11
24       Operational reserves            2           3
25       Forecast supply energy          2           3
26       Forecast demand                 2           3

significant in explaining price, the most important ones are: predispatch price, imports, energy excess, load, outages, and real-time prices in the surrounding power markets. In the real world, these are also the variables actively tracked by traders and domain experts during their decision analysis, which shows that the variables selected by the SVMSA method are relevant and useful. Our results show that the predispatch price is as important as imports, energy excess, and load; this suggests that the price projected by the IMO is not the single most important variable in predicting the real-time price, as is sometimes perceived by analysts. Imports, peak load, and energy excess are equally important variables. All prices in surrounding regions are also important, because traders take advantage of arbitrage opportunities across regions. The significance of outages and imports is not surprising, since they are important determinants of the supply curve. The expected load, an important determinant of demand, was also found to be significant. Hence, the most important variables identified by SVMSA are in line with theory (Stoft, 2002) and practice. The Dawn index, a variable tracked closely by traders, was found to be important but not as important as the ones mentioned above; this could be because most power plants have long-term gas contracts, so gas prices have only an indirect effect on the real-time IMO price. Weather variables are also actively observed by traders; however, only temperature was found to be significant. Inflexible generation energy is significant in explaining the real-time peak price because it is part of the base supply curve.

Fig. 5. Relative explanatory power of the independent variables.

4. Results from linear regression methods

We compared the performance of the SVM algorithm with some conventional statistical linear regression techniques for variable selection. First, we use the S-plus leaps function to find the best subset of the explanatory variables. The leaps function takes the set of potential explanatory variables and, based on the adjusted R2 or Mallow's Cp statistic, finds the subset of independent variables that provides the best possible model; the selected model has a large adjusted R2 value and a small Mallow's Cp value. For more on leaps, see Furnival and Wilson (1974). We ran the leaps function with both criteria; the results from the adjusted R2 method are shown in Fig. 6, and the results from the Mallow's Cp statistic in Fig. 7.

Fig. 6. Adjusted R2.

Fig. 7. Mallow's Cp statistic value.

Our analysis shows that the adjusted R2 method found Ontario peak load, IMO import actual peak, NY zone A peak price, forecast supply energy, air quality, and predispatch price to be the most important explanatory variables, whereas the Mallow's Cp criterion found Ontario peak load, IMO import actual peak, temperature, Dawn index, NY zone A peak price, forecast energy, and predispatch price to be important. The variables selected by leaps with the adjusted R2 criterion raise doubts, as this method found air quality to be more important than outages, and it found forecasted variables, such as forecast supply energy, to be more important than real-time variables such as energy excess. The results of leaps using the Mallow's Cp statistic also look suspicious. Even though this method found predispatch price,

imports, load, and NY zone A to be important, it did not find outages, energy excess, and the Cinergy price to be crucial. As mentioned earlier, outages and energy excess are important real-time variables, and the Cinergy price is important since it determines the arbitrage opportunities between the Ontario and ECAR regions. We also used the Akaike Information Criterion (AIC) and the Schwartz Bayesian Criterion (SBC) to select variables; the best-fitting model has the lowest AIC and SBC values among all competing models.13 A linear regression of the real-time IMO peak price on the 28 explanatory variables was performed. Based on the AIC and SBC, the best model selected five variables: Ontario peak load, IMO import actual peak, NY zone A peak average, forecast supply energy, and the predispatch price. The results of the AIC- and SBC-based models also raise concerns, since they do not include critical real-time variables such as energy excess, outages, and the Cinergy price.
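The two criteria, as defined in footnote 13, can be computed directly from a least-squares fit. This is a generic sketch with invented illustrative data, not the authors' 28-variable regression; n counts all estimated coefficients, including the intercept column:

```python
import numpy as np

def aic_sbc(y, X):
    """Footnote 13: AIC = T ln(RSS) + 2n and SBC = T ln(RSS) + n ln(T),
    with T observations and n estimated coefficients (columns of X)."""
    T, n = X.shape
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    rss = float(np.sum((y - X @ beta) ** 2))
    return T * np.log(rss) + 2 * n, T * np.log(rss) + n * np.log(T)

# Candidate models are compared by fitting each and keeping the one
# with the lowest AIC/SBC.
y = np.array([1.0, 2.0, 2.0, 4.0, 5.0])
X = np.column_stack([np.ones(5), np.arange(5.0)])  # intercept + one regressor
print(aic_sbc(y, X))
```

Both criteria penalize extra coefficients on top of the fit term T ln(RSS); SBC's penalty grows with ln(T), so for large samples it favors smaller models than AIC does.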

13 The Akaike Information Criterion and the Schwartz Bayesian Criterion are defined as AIC = T ln(residual sum of squares) + 2n and SBC = T ln(residual sum of squares) + n ln(T), where T is the number of observations and n is the number of coefficients estimated.

4.1. Forecasting power

We finally use the results of SVM and one of the linear regression models, Mallow's Cp, to determine whether SVM outperforms the traditional method in model selection for forecasting the real-time average peak price. For both methods, we first divide the data into a training set of 180 observations and a validation set of 20 observations. For SVM, we estimate a regression of the real-time peak price (using the training set) on the explanatory variables selected by the SVM method; we then use the estimated model to produce an out-of-sample forecast of the real-time peak price for the validation set. Similarly, we estimate a regression with the explanatory variables identified by the Mallow's Cp criterion and use that model to produce an out-of-sample forecast for the 20 observations in the validation set. Table 7 compares the forecast results from the two models; the evaluation is based on the measures listed in the table. The root mean square error and the mean absolute error depend on the scale of the dependent variable; they are used as relative measures to compare forecasts of the same series across different models, with smaller errors indicating better forecasting ability. The mean absolute percentage error and the Theil inequality coefficient are scale invariant; the Theil inequality coefficient lies between 0 and 1, where zero indicates a perfect fit. The bias (variance) proportion tells us how far the mean (variance) of the forecast is from the mean (variance) of the actual series, and the covariance proportion measures the remaining unsystematic forecasting errors; the three proportions add up to 1. See the Eviews 4.0 User's Guide (2000) and Pindyck and Rubinfeld (1991) for more details. For good forecasts, the bias and variance proportions are small and the covariance proportion is high. Based on all of the above measures, SVM significantly outperforms the Cp-based model.

Table 7
Forecast comparison of SVM and Mallow's Cp based model

Measure                           SVM     Cp
Root mean square error            0.54    0.64
Mean absolute error               0.43    0.51
Mean absolute percentage error    5.83    6.87
Theil inequality coefficient      0.03    0.04
Bias proportion                   0.0007  0.44
Variance proportion               0.018   0.02
Covariance proportion             0.98    0.53

5. Summary and conclusions

This paper uses an SVM-based learning algorithm to select the important variables for explaining the real-time average peak electricity price in the Ontario market. Given that the Ontario market was opened to competition only in May 2002, the number of observations available on each variable is quite low. The kernel regressions of the explanatory variables on the real-time price show that most of the dependencies are non-linear. We therefore use a machine learning SVM algorithm that is known to work well with data sets characterized by non-linearity and a low variable-observation ratio. Under SVM, a variable's significance is decided by adding a random variable to the original data set; explanatory variables that display higher sensitivity to the response variable than the random variable are selected. The variable selection process is implemented in an iterative mode: in each iteration, all variables with sensitivity below the sensitivity of the random variable are dropped, and the remaining set of variables is used for the next iteration; the process is halted when no more variables can be dropped. Based on the SVM sensitivity analysis, we find that the IMO's predispatch average price, the actual peak import volume, the peak load of the Ontario market, and the net available supply after accounting for expected load (energy excess) are some of the most important variables in explaining the real-time price in the Ontario market. The model also found outages and the prices in the regions surrounding Ontario to be important. A comparison of the SVMSA results with the traditional methods shows that the explanatory variables selected by SVM are more in line with theory and lead to better forecasting models.
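For reference, the evaluation measures reported in Table 7 can be sketched as follows. The Theil decomposition follows Pindyck and Rubinfeld (1991); the function name is illustrative, and the sketch assumes a non-zero forecast error and non-zero actuals:

```python
import numpy as np

def forecast_measures(y, f):
    """RMSE, MAE, MAPE, Theil's U, and the bias/variance/covariance
    proportions of the mean squared forecast error (they sum to 1)."""
    y, f = np.asarray(y, float), np.asarray(f, float)
    mse = np.mean((f - y) ** 2)  # assumed non-zero
    rmse = np.sqrt(mse)
    mae = np.mean(np.abs(f - y))
    mape = 100.0 * np.mean(np.abs((f - y) / y))  # assumes y has no zeros
    theil = rmse / (np.sqrt(np.mean(f ** 2)) + np.sqrt(np.mean(y ** 2)))
    bias = (f.mean() - y.mean()) ** 2 / mse
    var = (f.std() - y.std()) ** 2 / mse
    r = np.corrcoef(f, y)[0, 1]
    cov = 2.0 * (1.0 - r) * f.std() * y.std() / mse  # remaining unsystematic part
    return rmse, mae, mape, theil, bias, var, cov
```

A pure level shift, for example, puts all of the error into the bias proportion, which is the pattern Table 7 shows for the Cp-based model's relatively large bias proportion.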

6. Future work

In future work, one could look at methods that provide alternatives to the cut-off approach for selecting the set of relevant variables. It would also be interesting to apply the SVM approach to other recently restructured markets, such as those of El Salvador, Guatemala, and Peru in Latin America, all of which are characterized by a low variable-observation ratio like the IMO. In addition, once the IMO market matures, a follow-up study could be performed using the same approach, and the differences could be interpreted in light of a learning process.


Acknowledgements The authors would like to gratefully acknowledge the comments and suggestions by an anonymous referee.

References

Allen, G., Fildes, R., 2001. Econometric forecasting. In: Armstrong, S. (Ed.), Principles of Forecasting: A Handbook for Researchers and Practitioners, p. 334.
Arciniegas Rueda, I., Marathe, A., Barrett, C.L., 2003. Assessing the efficiency of US electricity markets. Utilities Policy 11 (2), 75-86.
Arciniegas Rueda, I., Arciniegas, F., Embrechts, M., 2004. SVM sensitivity analysis: an application to currency crises aftermaths. IEEE Transactions on Systems, Man and Cybernetics, Part A: Systems and Humans 31 (3), 387-398.
Banerjee, A., Dolado, J., Galbraith, J., Hendry, D., 1997. Co-integration, Error-Correction, and the Econometric Analysis of Non-Stationary Data. Oxford, pp. 11-13.
Bowerman, B., O'Connell, R.T., 2001. Time Series Forecasting: An Applied Approach. Duxbury Press.
Chatfield, C., 1996. The Analysis of Time Series: An Introduction, fifth ed. Chapman and Hall.
Christensen, R., 1990. Log-Linear Models. Springer-Verlag, New York.
Collobert, R., Bengio, S., 2001. SVMTorch: support vector machines for large scale regression problems. Journal of Machine Learning Research 1, 143-160.
DiNardo, J., Tobias, J., 2001. Nonparametric density and regression estimation. Journal of Economic Perspectives 15 (4), 11-29.
Eviews 4.0 User's Guide. Quantitative Micro Software, 2000.
Furnival, G.M., Wilson Jr., R.W., 1974. Regressions by leaps and bounds. Technometrics 16, 499-511.
Herbrich, R., Keilbach, M., Graepel, T., Obermayer, K., 1999. Neural networks in economics: background, applications, and new developments. Advances in Computational Economics 11, 169-196.
Hoaglin, D.C., Mosteller, F., Tukey, J.W., 1983. Understanding Robust and Exploratory Data Analysis. Wiley, New York, p. 447.
Kewley, R., Embrechts, M., Breneman, C., 2000. Data strip mining for the virtual design of pharmaceuticals with neural networks. IEEE Transactions on Neural Networks 11 (3), 668-679.
Momma, M., Bennett, K., 2001. A pattern search method for model selection of support vector regression. Presented at SIAM, Arlington, VA.
Pindyck, R., Rubinfeld, D.L., 1991. Econometric Models and Econometric Forecasts, third ed. McGraw-Hill.
Stoft, S., 2002. Power System Economics. IEEE/Wiley, p. 383.
Vapnik, V., 1995. The Nature of Statistical Learning. Springer, New York.
Vapnik, V., Chervonenkis, A., 1971. On the uniform convergence of relative frequencies of events to their probabilities. Theory of Probability and its Applications 16 (2), 264-280.
Venables, W.N., Ripley, B.D., 1994. Modern Applied Statistics with S-Plus. Springer-Verlag, New York.
Wessel, M.D., Jurs, P.C., Tolan, J.W., Muskal, S.M., 1998. Prediction of human intestinal absorption of drug compounds from molecular structure. Journal of Chemical Information and Computer Sciences 38, 726-735.
