
Two-Stage Stochastic Optimization for Optimal Power Flow Under Renewable Generation Uncertainty

DZUNG PHAN and SOUMYADIP GHOSH, IBM T.J. Watson Research Center

We propose a two-stage stochastic version of the classical economic dispatch problem with alternating-current power flow constraints, a nonconvex optimization formulation that is central to power transmission and distribution over an electricity grid. Certain generation decisions made in the first stage cannot further be changed in the second stage, where the uncertainty due to various factors such as renewable generation is realized. Any supply-demand mismatch in the second stage must be alleviated using high marginal cost power sources that can be tapped in short order. We solve a Sample-Average Approximation (SAA) of this formulation by capturing the uncertainty using a finite number of scenario samples. We propose two outer approximation algorithms to solve this nonconvex program to global optimality. We use recently discovered structural properties for the classical deterministic problem to show that when these properties hold the sequence of approximate solutions obtained under both alternatives has a limit point that is a globally optimal solution to the two-stage nonconvex SAA program. We also present an alternate local optimization approach to solving the SAA problem based on the Alternating Direction Method of Multipliers (ADMM). Numerical experiments for a variety of parameter settings were carried out to demonstrate the efficiency and usability of our method over ADMM for large practical instances.

Categories and Subject Descriptors: G.1.6 [Optimization]

General Terms: Algorithms, Theory

Additional Key Words and Phrases: Stochastic optimization, optimal power flow, decomposition algorithms, sample-average approximation, global solution

ACM Reference Format: Dzung Phan and Soumyadip Ghosh. 2014. Two-stage stochastic optimization for optimal power flow under renewable generation uncertainty. ACM Trans. Model. Comput. Simul. 24, 1, Article 2 (January 2014), 22 pages. DOI: http://dx.doi.org/10.1145/2553084

1. INTRODUCTION

Energy generation and supply-demand mediation in a power grid is currently planned in two steps. The first step takes place in a day-ahead market and decides which bulk generation sources (typically thermal, nuclear, and hydro sources) are switched on and available to supply energy on the next day. This base generation capability is augmented by additional smaller-capacity “peaker” thermal generators and external sources of energy (“spot markets”) to hedge against unplanned excess demand. The second operational planning step, which is this article’s subject, is at a smaller time scale, typically 5 to 15 minutes. It decides how the active generators are dispatched, which involves deciding the power output level of the bulk generators (power is the rate at which energy is produced), and how the produced energy is routed through the grid to consumption (also called load) nodes. Transmission occurs between multiple buses (network nodes) that are interconnected via electrical transmission lines: Figure 1 provides an illustration. The power flow between each pair of nodes obeys certain nonlinear equations that arise out of Kirchhoff’s laws (see, e.g., Glavitsch and Bacher [1991] for a review). An Economic Dispatch (ED) problem is said to have been solved to determine an Optimal Power Flow (OPF) solution when the dispatch and transmission decisions minimize the total cost of generation needed to meet demand [Carpentier 1962].

Fig. 1. An IEEE standard 30-bus example.

Authors’ addresses: Dzung Phan and Soumyadip Ghosh, Department of Business Analytics and Mathematical Sciences, IBM T.J. Watson Research Center, P.O. Box 218, 1101 Kitchawan Road, Yorktown Heights, New York 10598; email: {phandu, ghoshs}@us.ibm.com.

ACM Transactions on Modeling and Computer Simulation, Vol. 24, No. 1, Article 2, Publication date: January 2014.

This article studies a stochastic version of the ED problem. Modeling uncertainty in dispatching is becoming more critical as renewable energy technologies play an increasing role in the portfolio mix of electricity generation. Uncertainty in demand has always been present, but demand at the scale of this large-scale planning problem has a large base component that can be predicted well, and the uncertainty is comparatively muted due to the large aggregation of largely independent, moderately uncertain demand elements (i.e., residential customers). System operators have thus been able to moderate demand uncertainty by reserving adequate peaker capacity. But aggregation of renewable generation capacity does not reduce the variability of generation output. In any large installation of windmills in a region, the output of the wind turbines will be correlated, depending primarily on regional weather conditions. However, forecasting near-term wind availability and velocity is an imperfect science. (The rest of the article will use wind power generation as the primary motivating example, but the methods developed here apply transparently to all sources of uncertainty, be it in power generation [i.e., supply] or in load [i.e., demand].)
But renewable sources like wind generation have negligible operational costs (in the hourly time scale) and thus should be the first supply to be dispatched. Thus, we model wind power as an intermittent source that is connected in an always-on state to the grid. Indeed, regulations in multiple U.S. states require the use of wind power if it is being generated. The intermittent nature of wind leads to mismatches in energy supply and demand that can cause local grid failures, which in turn can quickly lead to blackouts in large parts of the regional grid. Grid operation agencies require that this renewable generation uncertainty be hedged against. Various approaches have been proposed to tackle this: Dragoon and Milligan [2003] analyze the impact of forecasting uncertainty for wind power production on incremental reserve requirements (such as peakers) and imbalance costs. Xue et al. [2007] present a balancing algorithm in a distributed generation network setting that actively manages a group of small distributed generators and converts them into one large and more quickly controllable logical generation station.

Stochastic versions of the ED problem have been studied in the literature. Existing methods make a crucial simplification to the OPF problem in order to gain tractability at the expense of optimality to the original problem: they use the Direct-Current (DC) power flow approximation [Shahidehpour et al. 2002; Yong et al. 2009a], a linearized version of the AC power flow equations. Hatami et al. [2009] propose a stochastic programming framework for the DC-power ED problem to determine the optimal procurement of interruptible load (defined as responsive demand that can be changed within a very short time lag) in order to minimize the risk of a shortfall over multiple periods. The stochastic DC OPF can be solved by imposing a set of risk constraints, in the form of chance constraints [Fu and McCalley 2001] or mean-excess constraints [Ghosh et al. 2011], to balance risk of shortfalls against cost of provisioning corrective generation sources such as peakers.

In this article, we propose a two-stage stochastic formulation to address renewable uncertainty. Our main point of departure from current literature on stochastic ED problems is in our consideration of the full nonlinear AC power balance equations in each stage of this program.
Each stage models dispatching and transmission decisions that are made in subsequent time periods separated by about 5 to 15 minutes. Certain dispatching decisions cannot be changed in the subsequent period; for example, mechanical stability considerations require that large diesel/coal generators change their generation levels more gradually than the length of the period. So, these decisions are made in the first stage and remain fixed at the second stage. The second stage realizes the actual wind generation. Any resulting supply-demand mismatch must then be alleviated using additional high marginal cost power sources that can be tapped in short order. We define this second class of generators collectively as the spot market, which can include (1) thermal peakers, which are active, quick-response hydro or fossil-fuel sources; (2) external power sources, such as neighboring grids or power aggregators that are willing to supply extra power at spot market prices; and (3) sources of virtual generation, such as interruptible loads of large commercial users, retail operations, or consumer homes that can be influenced to shift or reduce their demand in response to incentives (refer to Ghosh et al. [2010] for further details). Effectively, our formulation may dispatch adequate first-stage generation capacity to hedge against the risk of a large unforeseen shortfall in total supply in both stages, and any realized shortfall may be alleviated using quickly dispatchable, albeit costly, spot market sources. This formulation must be seen as a stepping stone toward the full multistage version of the same problem that may allow the large generator’s outputs to be varied in a small interval in subsequent stages, governed by ramp rate constraints. We anticipate that our algorithms and findings will extend to the multistage case. 
Our approach to solving the stochastic formulation is via Sample-Average Approximation (SAA), where the uncertainty is captured using a finite number of scenarios; Shapiro et al. [2009] is an excellent reference. A major departure from the SAA literature is that the power flow constraints are nonlinear, and usually nonconvex. The nonlinearity arises from Kirchhoff’s laws for power flow in each stage and generally makes the two-stage program nonconvex. Previous work on linearization of the nonlinear AC constraints to the DC constraints allowed the use of the powerful decomposition techniques designed for linear stochastic programs [Yong et al. 2009a]. However, the DC formulation captures the physical power flows less realistically than its AC counterpart, and the full nonconvex AC formulation is more desirable.

The nonconvex two-stage SAA program is hard to solve using standard techniques, and the literature has little to offer as a general prescription. In Section 4, we adopt the Alternating Direction Method of Multipliers (ADMM) approach [Glowinski and Marrocco 1975; Gabay and Mercier 1976; Eckstein and Bertsekas 1992] to solve the SAA formulation. The ADMM is a general-purpose direct-search algorithm that promotes decomposed solution approaches to nonlinear programs that have a block separable structure, such as the SAA formulations that we study. The ADMM approach has been successfully employed in obtaining locally optimal solutions to the deterministic ED problem [Kim and Baldick 2000], and our ADMM procedure can be viewed as another application of this method in power systems. A key limitation of the ADMM approach, however, is that its direct-search nature only yields locally optimal solutions.

It is commonly observed that the key to solving such nonconvex problems is the ability to exploit structural properties of the specific formulation. Lavaei and Low [2012] describe a method to obtain locally optimal solutions to the standard ED problem (the single-stage deterministic ED problem) using a specific convex approximation and observe a strong structural property in their numerical experiments: the candidate locally optimal solutions to practical instances of the problem also often satisfy a zero duality gap condition, thus proving to be globally optimal. This property has since been extensively tested and found to hold in most real-world examples of transmission grids.
Phan [2012] introduces a global optimization algorithm to solve the deterministic ED problem starting with a Lagrangian dual reformulation and conducts extensive experiments to show that real-world grid test instances with several thousand buses support this zero duality gap property. We make use of this observation, namely that global optimal solutions with zero duality gap can be identified by particular approaches to solving the deterministic ED problem. This property has been proven to hold for simple networks such as trees [Gan et al. 2012], but to date (to the best of our knowledge), a general result has eluded the strenuous efforts of a number of researchers.

We provide two algorithms to obtain globally optimal solutions to the nonconvex SAA program. Our convergence analyses of the algorithms make use of two assumptions. First, we assume that the renewable generation farms have been sized and that their placement on the network has been designed adequately to admit a feasible second-stage solution (including power exchange with the spot market to satisfy demand and transmission of power through the network) for any value of first-stage variables (bulk generation power outputs). Second, we assume that each second-stage problem has a Lagrangian dual formulation with a zero duality gap; this assumption has been commonly observed to hold for practical grid instances. Specifically, our contributions are:

—We formulate a two-stage SAA program for the ED problem under generation uncertainty, modeling the complete nonconvex AC power flow dynamics (Section 2).
—Under the assumption that a zero duality gap global solution can be identified, the recourse function that represents the second-stage optimal cost in the first-stage problem is shown to be convex in the first-stage decision variables. Moreover, the second-stage optimal solutions yield subgradient information for this convex recourse function (Theorem 3.1 in Section 3).
—Two algorithms are introduced to solve the nonconvex SAA program. Each scheme generates a sequence of lower-approximation nonconvex problems for the original nonconvex problem (Section 3).



—Further, we establish that if each nonconvex approximation is solved to global optimality, then under the two assumptions, the sequence of solutions converges to a global optimal solution to the original nonconvex SAA problem (Theorem 3.2 in Section 3).
—A law-of-large-numbers type of sampling consistency result on the quality of the SAA estimate of the true stochastic formulation is established (Theorem 3.5 in Section 3.1).
—We also investigate the ADMM approach to decomposing the SAA problem. Note that in contrast to the proposed algorithms, the ADMM decomposition technique is a local-search method and so does not guarantee global optimality (Section 4).
—We performed numerical experiments for a variety of electricity grid instances and parameter settings that illustrate the efficiency and usability of the proposed methods for large practical instances. Numerical results show that the outer approximation decompositions work an order of magnitude faster than the ADMM approach. In particular, the decomposable feature facilitates a parallel implementation of these algorithms (Section 5).

Section 2 introduces the basic notation and our model formulation. Section 3.2 is a short note on the implementation of the algorithms that we describe in real-world systems.

2. MODEL DESCRIPTION

An electric grid management entity controls the dispatching of active generation units over a network of multiple buses interconnected via transmission lines. Define $\mathcal{N}$ as the set of all buses (or nodes) in the grid network, $\mathcal{G}$ as the set of buses to which conventional generators connect, $\mathcal{W}$ as the set of wind farm buses, $\mathcal{D}$ as the set of energy demand or load buses, and $\mathcal{S}$ as the set of buses with access to the spot market. Let $\mathcal{L}$ denote the set of branches (e.g., transmission lines) in the power grid. To keep the exposition readable, we will assume that the sets $\mathcal{G}$, $\mathcal{W}$, $\mathcal{D}$, and $\mathcal{S}$ are pairwise disjoint and that $\mathcal{N} = \mathcal{G} \cup \mathcal{W} \cup \mathcal{D} \cup \mathcal{S}$. Figure 1 provides an illustration of these on a 30-bus example taken from the IEEE standard test suite found at http://www.ee.washington.edu/research/pstca/.

The alternating current nature of electricity supply mandates that power analysis be conducted using complex-valued quantities. The real and imaginary parts of power are referred to as the active and reactive power, respectively. Power flow over a network of transmission lines that connect all of the buses is determined by the voltages set at each bus. Following notation standard to power flow literature, let $v$ be the vector of (complex-valued) voltages $v_i$ at all buses $i \in \mathcal{N}$. The complex-valued vector of currents $I$ injected at each system bus can be computed from the voltage $v$ as $I = Yv$, where $Y$ is the (complex-valued) bus admittance matrix [Wood and Wollenberg 1996], a physical property of the network. The vector of net complex power injections $S$ is governed by the equation

$$
S = v \circ I^{*} = v \circ (Yv)^{*}, \tag{1}
$$

where $\circ$ denotes elementwise vector multiplication and ${}^{*}$ denotes complex conjugation. Power flow equations (1) contain separate sets of equations for the active and reactive power parts, which are the real and imaginary parts, respectively, of $S$.
Depending on the polar or rectangular coordinate representations of the complex-valued quantities voltage v and admittance matrix Y, the AC power flow equations can be expressed in several equivalent forms. For example, in rectangular coordinates, each function is quadratic in the real and imaginary parts of the state variable v and is nonconvex. We note that our theoretical results presented later hold in all coordinate systems. The distinction of constraints into real and imaginary sets holds no significance to our description in this article, so we will compress our notation here for the sake of brevity.
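As a small numerical illustration of equation (1), the following sketch evaluates the complex power injections for a made-up 3-bus network and checks them against the rectangular-coordinate form, in which the active and reactive parts are quadratic in the real and imaginary parts of the voltage. The line impedances, the voltage values, and the helper name `power_injections` are assumptions chosen only for illustration; they are not taken from the article or from the IEEE test cases.

```python
import numpy as np

def power_injections(Y, v):
    """Net complex power injections S = v ∘ conj(Y v), following equation (1)."""
    return v * np.conj(Y @ v)

# Hypothetical 3-bus ring; series line admittances in per unit (assumed values).
y12 = 1.0 / (0.02 + 0.06j)
y13 = 1.0 / (0.08 + 0.24j)
y23 = 1.0 / (0.06 + 0.18j)

# Bus admittance matrix without shunt elements: a weighted graph Laplacian.
Y = np.array([
    [y12 + y13, -y12, -y13],
    [-y12, y12 + y23, -y23],
    [-y13, -y23, y13 + y23],
])

# Assumed complex bus voltages (magnitude times unit phasor).
v = np.array([1.05, 1.00 * np.exp(-0.05j), 0.98 * np.exp(-0.08j)])

S = power_injections(Y, v)

# Same quantities in rectangular coordinates: v = e + j f, Y = G + j B.
G, B = Y.real, Y.imag
e, f = v.real, v.imag
a = G @ e - B @ f          # real part of the injected current
b = G @ f + B @ e          # imaginary part of the injected current
P = e * a + f * b          # active power, quadratic in (e, f)
Q = f * a - e * b          # reactive power, quadratic in (e, f)

print(np.allclose(S.real, P), np.allclose(S.imag, Q))
```

In this toy network the active injections sum to the resistive line losses, so `S.real.sum()` is positive whenever the line resistances are positive and the bus voltages differ.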



In our formulation, the first-stage generation control decisions are the power $g_i$ extracted from conventional generators $i \in \mathcal{G}$. (This variable is assumed to contain both the real and imaginary parts of the power extracted.) Let functions $f_i(g_i)$ represent the cost of generation. Typically, $f_i$ is convex and depends only on the real part of $g_i$. The generators are limited to producing within $[\underline{g}_i, \overline{g}_i]$. Let $g$ represent the collection $\{g_i, i \in \mathcal{G}\}$ and define by $f(g)$ the total generation cost $\sum_{i \in \mathcal{G}} f_i(g_i)$. Demand at node $i \in \mathcal{D}$ is represented by $D_i$. The second-stage uncertainty realization $\xi$ observes wind farm power extraction at level $W_i^{\xi}$ at bus $i \in \mathcal{W}$. Demand $D_i$ could also be allowed to vary at each stage and even with each realization without affecting our approach or conclusions. The second-stage recourse decision variables are the extra power $s_i^{\xi}$ that can be exchanged with the spot market at bus $i \in \mathcal{S}$, allowing for the possibility that surplus or deficient powers may also be sold to or purchased from the spot market. The functions $C_i(s_i^{\xi})$ give the total cost of purchasing from or selling to the spot market under realization $\xi$. Define $s^{\xi} = \{s_i^{\xi}, i \in \mathcal{S}\}$ and let $C(s^{\xi})$ be the total spot market access cost $\sum_{i \in \mathcal{S}} C_i(s_i^{\xi})$ under realization $\xi$. Both stages make power dispatch decisions by setting the voltages $v_i^{0}$ and $v_i^{\xi}$ at each node $i \in \mathcal{N}$ for the first stage and the second-stage realizations $\xi$. Then, the following power balance equations are implied for all $\xi$:

$$
S_i(v^{\xi}) =
\begin{cases}
s_i^{\xi} & \forall\, i \in \mathcal{S} \\
g_i & \forall\, i \in \mathcal{G} \\
W_i^{\xi} & \forall\, i \in \mathcal{W} \\
-D_i & \forall\, i \in \mathcal{D}.
\end{cases}
\tag{2}
$$

Note that the power flow functions $S_i(\cdot)$ remain the same for both the first stage and each of the scenarios in the second stage. Various physical and safety considerations require that the vector of all voltage magnitudes satisfy $|v^{\xi}| \in [\underline{v}, \overline{v}]$. The total power injection $S_i$ at node $i$ can be written as a sum of terms $\ell_{ij}$ that represent the interchange of power between node $i$ and neighbor $j$. The power transmission $\ell(v^{\xi}) = \{\ell_{ij}(v^{\xi}),\ i, j \in \mathcal{N}\}$ obeys physical limits on the transmission lines of the form $\ell(v^{\xi}) \le \overline{\ell}$. The objective of the two-stage OPF problem is to minimize, over the variables $g$, $v^{\xi}$, and $s^{\xi}$, the total aggregated expected costs:

$$
\begin{aligned}
\min_{g,\, s^{\xi},\, v^{\xi}} \quad & f(g) + \mathbb{E}\big[C(s^{\xi})\big] \\
\text{s.t.} \quad & \text{power flow balance constraints (2) for the first stage and all second-stage realizations } \xi, \\
& \ell_{ij}(v^{\xi}) \le \overline{\ell}_{ij}, \quad \forall\, (i,j) \in \mathcal{L},\ \xi, \\
& \underline{g}_i \le g_i \le \overline{g}_i, \quad \forall\, i \in \mathcal{G}, \\
& \underline{v}_i \le |v_i^{\xi}| \le \overline{v}_i, \quad \forall\, i \in \mathcal{N},\ \xi.
\end{aligned}
\tag{3}
$$

In practice, other physical and operational constraints, such as phase angle constraints [Frank et al. 2012] and spinning reserve requirements [Xie and Song 2000], can also be imposed, and the solution approaches that we describe remain valid. We allow the random realization $\xi$ to have a continuous distribution. In what follows, we will study the SAA of problem (3), where a set of scenarios $n \in \{1, \dots, N\}$ of second-stage realizations is sampled and scenario $n$ has an associated probability $p_n$. We shall abuse this notation slightly by referring to the first stage as the scenario $n = 0$ (with $p_0 = 1$ where required). The SAA problem for finitely many renewable generation output scenarios $s^n$ is

$$
\begin{aligned}
\min_{g,\, s^{n},\, v^{n}} \quad & f(g) + C(s^{0}) + \sum_{n=1}^{N} p_n\, C(s^{n}) && \text{(4a)} \\
\text{s.t.} \quad & \text{power flow balance constraints (2)}, \quad \forall\, n \in \{0, \dots, N\}, && \text{(4b)} \\
& \ell_{ij}(v^{n}) \le \overline{\ell}_{ij}, \quad \forall\, (i,j) \in \mathcal{L},\ n \in \{0, \dots, N\}, && \text{(4c)} \\
& \underline{g}_i \le g_i \le \overline{g}_i, \quad \forall\, i \in \mathcal{G}, && \text{(4d)} \\
& \underline{v}_i \le |v_i^{n}| \le \overline{v}_i, \quad \forall\, i \in \mathcal{N},\ n \in \{0, \dots, N\}. && \text{(4e)}
\end{aligned}
$$
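To make the structure of the SAA objective (4a) concrete, here is a minimal sketch that evaluates a sample-average objective for a deliberately simplified single-bus toy problem. It is not the AC formulation of this article: the network constraints (4b) through (4e) are ignored, the first-stage spot exchange is fixed at zero, and the cost functions, prices, and wind distribution are all hypothetical choices made only to show how the scenario probabilities weight the second-stage costs.

```python
import numpy as np

def first_stage_cost(g, a=0.02, b=10.0):
    """Hypothetical convex quadratic generation cost f(g)."""
    return a * g**2 + b * g

def recourse_cost(g, wind, demand, buy_price=50.0, sell_price=5.0):
    """Toy second-stage cost C(s): buy any shortfall, sell any surplus (no network)."""
    s = demand - g - wind  # spot-market exchange; s < 0 means power is sold
    return np.where(s >= 0, buy_price * s, sell_price * s)

def saa_objective(g, wind_samples, demand, probs=None):
    """Sample-average objective f(g) + sum_n p_n C(s^n), with s^0 fixed at zero."""
    wind_samples = np.asarray(wind_samples, dtype=float)
    if probs is None:
        # Equal weights 1/N, as under i.i.d. sampling.
        probs = np.full(wind_samples.shape, 1.0 / wind_samples.size)
    second_stage = np.dot(probs, recourse_cost(g, wind_samples, demand))
    return first_stage_cost(g) + second_stage

# Assumed wind distribution: uniform on [0, 40], sampled with a fixed seed.
rng = np.random.default_rng(0)
wind = rng.uniform(0.0, 40.0, size=1000)

# Crude grid search over the first-stage dispatch level.
grid = np.linspace(50.0, 100.0, 51)
values = [saa_objective(g, wind, demand=100.0) for g in grid]
g_best = grid[int(np.argmin(values))]
print(g_best)
```

Because buying a shortfall is priced above selling a surplus in this toy, the sample-average objective trades off the cost of overprovisioning first-stage generation against the expected cost of spot purchases.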

The sample probabilities $p_n$, $n = 1, \dots, N$, will depend in general on the sampling scheme used; for example, under i.i.d. sampling, they are all identically $1/N$. Our formulation allows $s_i^{0} \neq 0$ in the first stage; however, we expect typical cost curves $f_i$ and $C_i$ and available generation capacity to ensure that the first stage does not exchange power with the spot market. The first stage may overprovision generator capacities $g_i$ for the base generation to handle uncertainties in the second stage, and any power excess (i.e., $s_i^{0} < 0$) may be sold to the spot market in the first stage.

3. ALGORITHMS TO SOLVE SAMPLE-AVERAGE APPROXIMATION

The nonconvex optimization (4) can be compactly cast in the classical two-stage stochastic program form:

$$
\min_{g,\, v^{0},\, s^{0}} \ \Phi(g) = f(g) + C(s^{0}) + \sum_{n=1}^{N} p_n\, \omega_n(g), \quad \text{subject to} \quad (g, v^{0}, s^{0}) \in \Omega, \tag{M}
$$

where the feasible region $\Omega$ of the first-stage decision variables is defined by the constraints (4b) through (4e) associated with the first stage ($n = 0$). The second-stage cost $\omega_n$ for all $n = 1, \dots, N$ is

$$
\begin{aligned}
\omega_n(g) = \min_{v^{n},\, s^{n}} \quad & C(s^{n}) \\
\text{s.t.} \quad & S_i(v^{n}) = g_i, \quad \forall\, i \in \mathcal{G}, \\
& R^{n}(v^{n}, s^{n}) \le 0.
\end{aligned}
\tag{$S_n$}
$$

The equalities in $(S_n)$ arise from power balance equations (2), whereas the constraints $R^{n}(\cdot)$ include all other second-stage constraints (4b) through (4e) that do not contain the variables $g$. These constraints include power balance constraints (2) from nodes in $\mathcal{N} \setminus \mathcal{G}$, and hence they are scenario dependent. Any equality constraints are represented by inequalities in opposing directions in the set $R^{n}$. We define the vector-valued function $S_{\mathcal{G}}$ as $\{S_i(\cdot), i \in \mathcal{G}\}$. In addition, note that the first set of equalities that depend on $g$ contain only linear terms in $g$.

We will present two outer approximation algorithms to solve the problem (M). Our algorithms are similar in nature to the methods that solve convex nonlinear programming problems [Geoffrion 1972; Grothey et al. 1999; Ruszczyński 2003] or mixed-integer nonlinear programs [Duran and Grossmann 1986; Fletcher and Leyffer 1994; Bonami et al. 2008] but are adapted to solve this nonconvex problem. The key idea is that under certain reasonable assumptions, the recourse function $\omega_n(g)$ is convex in $g$ even though the subproblems $(S_n)$ are nonconvex. Further, Theorem 3.1 shows that the global solution to the subproblems $(S_n)$ for any specified first-stage variable $g$ provides subgradient information for this convex function. The epigraph of $\omega_n(g)$ will be iteratively approximated by an intersection of a collection of affine inequalities formed from these subgradients. This outer approximation is in line with the generalized Benders decomposition approach of Geoffrion [1972] but differs in the method that is utilized to construct the outer linearizations of $\omega_n(g)$. Our approach uses subgradients of the function generated from the dual solution of the second-stage subproblems $(S_n)$, whereas the earlier approach merely obtains an outer cover from the optimal primal values of $(S_n)$ at iterates of $g$. We solve this sequence of approximating nonconvex problems using any suitable global optimizer to obtain a sequence of candidate solutions, which by Theorem 3.2 has a limit point that is globally optimal for the original problem (M).

Our proposed algorithms rely on the following two assumptions that have been shown to hold widely in practical power grid balance problems [Lavaei and Low 2012; Phan 2012].

ASSUMPTION 1. For every $g$ with $g_i \in [\underline{g}_i, \overline{g}_i]$ for all $i \in \mathcal{G}$, the subproblem $(S_n)$ is feasible.

Assumption 1 says that any value of the generation power output $g$ chosen in the first stage admits a feasible solution for each $(S_n)$ by buying (or selling) the deficiency (or surplus) of energy from (to) the spot market. This assumption may seem unjustified in the presence of constraints such as (4d) and (4e), but note that it is being made only in the context of the operational control of the electricity grid. The design of renewable generation farms, namely the sizing and placement of such resources on the electrical grid, and the provisioning of the spot market capacity, which typically consists of more than one source, has to explicitly ensure that this assumption holds for the operational control problem (4). This is a difficult and interesting problem on its own but is outside the scope of this article. Our methods to solve (4) use this assumption to sidestep the usual second-stage feasibility checking and feasible-set estimation part of decomposition-based algorithms.
This is crucial to our approach, since this ensures that our algorithms do not face the possibility that any subregions of the primal feasible nonconvex region could be cut off during the algorithm’s feasibility-set-estimation iterations. For cutting-plane-based approaches to solving nonlinear problems, feasibility cuts (especially linear cuts) could prune subregions of the (nonconvex) domain containing the global optimum, yielding only a locally optimal solution (see, e.g., Sahinidis and Grossmann [1991]), and extra machinery is required to prevent this eventuality.

Let us split the set of inequality constraints $\{R^{n}(\cdot) \le 0\}$ in subproblem $(S_n)$ into two sets indexed by $\mathcal{I}$ and $\mathcal{J}$: $R^{n}_{\mathcal{I}} = \{R^{n}_i(\cdot) \le 0,\ i \in \mathcal{I}\}$ and $R^{n}_{\mathcal{J}} = \{R^{n}_i(\cdot) \le 0,\ i \in \mathcal{J}\}$, where $\mathcal{J}$ can be empty. Let $\Theta^{n}$ denote the feasible set $\{(v, s) \mid R^{n}_{\mathcal{J}}(v, s) \le 0\}$. We define a Lagrangian dual function for $(S_n)$ by bringing the equalities $S_{\mathcal{G}}$ and the first set of inequalities $R^{n}_{\mathcal{I}}$ into the objective:

$$
h_n(\lambda, \gamma) = \inf_{(v^{n}, s^{n}) \in \Theta^{n}} \Big\{ C(s^{n}) + \gamma^{T} S_{\mathcal{G}}(v^{n}) + \lambda^{T} R^{n}_{\mathcal{I}}(v^{n}, s^{n}) \Big\}.
$$

The parameters $\gamma$ and $\lambda$ are Lagrangian multipliers of the functions $S_{\mathcal{G}}$ and $R^{n}_{\mathcal{I}}$, respectively. The function $h_n(\lambda, \gamma)$ uses only the $S_{\mathcal{G}}$ terms of the first set of equalities in $(S_n)$ and is independent of the first-stage variables $g$. The standard Lagrangian dual formulation associated with subproblem $(S_n)$ is

$$
\max_{\lambda \ge 0,\ \gamma} \ h_n(\lambda, \gamma) - \gamma^{T} g. \tag{5}
$$

The following assumption is essential to designing our decomposition technique to solve the nonconvex problem to optimality.



ASSUMPTION 2. For every $g$ with $g_i \in [\underline{g}_i, \overline{g}_i]$ for all $i \in \mathcal{G}$, the Lagrangian dual (5) of subproblem $(S_n)$ has a zero duality gap with the associated primal solution.

In subproblem $(S_n)$, the uncertainty has been realized, and so $(S_n)$ is a classical deterministic OPF problem. Assumption 2 may seem like a strong condition to impose; however, practical electricity grids are commonly observed to satisfy it. Lavaei and Low [2012], in a landmark paper, exploit a specific equivalent reformulation of the dual of the OPF problem to obtain an easy check for the zero duality gap condition of locally optimal solutions of this reformulation. Their approach takes the Lagrangian function with $\mathcal{J} = \emptyset$; refer to Theorem 1 in Lavaei and Low [2012]. Experiments show that this condition is satisfied by the locally optimal solutions of their reformulation of a wide range of power grid test instances, including all IEEE benchmark systems. Further, they provide algebraic and geometric justifications to argue heuristically that this is to be expected for most real-world power systems. In Phan [2012], we introduce a different Lagrangian dual problem, where we propose to retain simple bounds such as box and sphere constraints in the set of inequalities $R^{n}_{\mathcal{J}}$. In contrast to the local optimization approach of Lavaei and Low [2012], we exploit this dual problem to provide an efficient global optimization algorithm for this OPF-equivalent formulation. The identified global solutions were then numerically checked for the zero gap of Assumption 2, and results show that this was achieved for all solutions of an expanded set of test problems. Assumption 2 thus says that the subproblem $(S_n)$ is expected to inherit this property of classical OPF problems.

We now characterize the recourse functions $\omega_n(g)$ associated with second-stage scenarios $n = 1, \dots, N$.

THEOREM 3.1. If Assumption 2 holds, then the recourse cost function $\omega_n(g)$ is convex.
Furthermore, if γˆ is the Lagrangian multiplier corresponding to the equality constraints {SG (v n) = g} from the subproblem (Sn), then −γˆ is a subgradient of ωn at g. PROOF. By Assumption 2, we have ωn(g) = max λ≥0,γ

min

C(sn) + γ T SG (v n) − g + λT RnI(v n, sn) .

(v n,sn) ∈ n

In Boyd and Vandenberghe [2004; p. 67], ωn is convex if we can show that the function α(t) = ωn(g+ td) is convex with respect to t ∈ R for any g ∈ [g, g] and d ∈ R|G| . Indeed, let t1 , t2 ∈ dom(α) and β ∈ [0, 1]. We have min max C(sn) + λT RnI(v n, sn) λ≥0,γ

+ γ T SG (v n) − (g + (βt1 + (1 − β)t2 )d)

= min max β C(sn) + λT RnI(v n, sn) + γ T SG (v n) − (g + t1 d) (v n,sn) ∈ n λ≥0,γ

+ (1 − β) C(sn) + λT RnI(v n, sn) + γ T SG (v n) − (g + t2 d)

≤ β min max C(sn) + λT RnI(v n, sn) + γ T SG (v n) − (g + t1 d)

α(βt1 + (1 − β)t2 ) =

(v n,sn) ∈ n

(v n,sn) ∈ n

λ≥0,γ


2:10


+ (1 − β)

min

(v n,sn) ∈ n

max C(sn) + λT RnI(v n, sn) + γ T SG (v n) − (g + t2 d)

λ≥0,γ

≤ β α(t1 ) + (1 − β) α(t2 ), which leads to the convexity of α(t). The first step switches the min and max operators, which is possible if the zero gap condition of Assumption 2 is true. This completes the first part. For the second part, suppose that the strong duality posited by Assumption 2 holds for the primal (Sn) and its Lagrangian dual formulation (5). Further, suppose that the primal and corresponding dual solutions at master-stage variables g(1) and g(2) are (ˆv n(1) , sˆ n(1) , λˆ (1) , γˆ (1) ) and (ˆv n(2) , sˆ n(2) , λˆ (2) , γˆ (2) ), respectively. We have

$$
\begin{aligned}
\omega^n(g_{(1)}) + \hat{\gamma}_{(1)}^T g_{(1)}
&= C(\hat{s}^n_{(1)}) + \hat{\lambda}_{(1)}^T R^n_I(\hat{v}^n_{(1)}, \hat{s}^n_{(1)}) + \hat{\gamma}_{(1)}^T S_{\mathcal{G}}(\hat{v}^n_{(1)}) \\
&\le C(\hat{s}^n_{(2)}) + \hat{\lambda}_{(1)}^T R^n_I(\hat{v}^n_{(2)}, \hat{s}^n_{(2)}) + \hat{\gamma}_{(1)}^T S_{\mathcal{G}}(\hat{v}^n_{(2)}) \\
&\le C(\hat{s}^n_{(2)}) + \hat{\lambda}_{(2)}^T R^n_I(\hat{v}^n_{(2)}, \hat{s}^n_{(2)}) + \hat{\gamma}_{(1)}^T S_{\mathcal{G}}(\hat{v}^n_{(2)}) \\
&= \omega^n(g_{(2)}) + \hat{\gamma}_{(1)}^T g_{(2)}.
\end{aligned}
$$

The first equality is by the strong duality principle holding at $g_{(1)}$, the next inequality is due to the definition of the dual Lagrangian function $h^n(\lambda, \gamma)$, and the last two are by weak duality at $g_{(2)}$. This completes the proof.

Theorem 3.1 tells us that the recourse function $\omega^n$ associated with the $n$-th subproblem $(S^n)$ is convex in $g$, and thus the recourse cost functions do not introduce any additional source of nonconvexity in (M). This part of the proof is crucially aided by the fact that the equalities that depend on $g$ in the subproblem $(S^n)$ contain only linear terms in $g$. Moreover, Theorem 3.1 tells us that the dual solution of the subproblem $(S^n)$ at a specific master-variable value $g$ provides a subgradient to $\omega^n(g)$. This suggests that given a set of subgradients $\{\pi^{n,j}, j = 1, \ldots, k\}$ of $\omega^n$, a piecewise linear lower-approximation function can be constructed for $\omega^n$ that obeys a set of linear constraints of the form

$$
\omega^n(g^j) + (\pi^{n,j})^T (g - g^j) \le \eta^n, \qquad \forall j = 1, \ldots, k. \tag{6}
$$
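As a rough illustration of how cuts of form (6) build a lower approximation, the sketch below uses a hypothetical one-dimensional convex function as a stand-in for $\omega^n$ (the function `omega`, its derivative, and the trial points are assumptions made for this example, not quantities from the OPF model). The pointwise maximum of the cuts never exceeds the function and touches it at each trial point.

```python
def omega(g):
    """Hypothetical convex recourse cost (a stand-in for omega^n)."""
    return (g - 3.0) ** 2


def omega_subgradient(g):
    """Derivative of the stand-in cost; plays the role of pi^{n,j}."""
    return 2.0 * (g - 3.0)


def make_cut(g_j):
    """Return the affine function g -> omega(g_j) + pi_j * (g - g_j), as in (6)."""
    value, slope = omega(g_j), omega_subgradient(g_j)
    return lambda g: value + slope * (g - g_j)


# Cuts collected at three assumed trial points g^1, g^2, g^3.
trial_points = [0.0, 2.0, 5.0]
cuts = [make_cut(g_j) for g_j in trial_points]


def lower_model(g):
    """Smallest eta satisfying every cut at g: the pointwise maximum of the cuts."""
    return max(cut(g) for cut in cuts)


for g in (0.0, 1.0, 2.0, 3.0, 4.0, 5.0):
    print(f"g={g:.1f}  model={lower_model(g):6.2f}  omega={omega(g):6.2f}")
```

Adding a cut at a new trial point can only raise the model, which is why the approximation tightens as the iterations proceed.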

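We exploit (6) to iteratively solve a sequence of lower approximations of the problem (M):

$$
\begin{aligned}
\min\ \ & \Phi_k(g) = f(g) + C(s^0) + \sum_{n=1,\ldots,N} p^n \eta^n \\
\text{s.t.}\ \ & (g, v^0, s^0) \in \Omega \ \text{ and } \ \eta^n, g \ \text{satisfy linear constraints of form (6),}
\end{aligned}
\tag{$M_k$}
$$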

where the $k$-th iteration yields subgradients $\pi^{n,k}$ centered around $g^k$. The algorithm is as follows:

OUTER APPROXIMATION ALGORITHM 1
1. Set $\eta^{n,1} = 0$, $\forall n = 1, \ldots, N$. Solve the following to get $g^1$:
   $$\min\ f(g) + C(s^0) \quad \text{s.t.} \quad (g, v^0, s^0) \in \Omega.$$
2. For $k = 1, 2, \ldots$
   a. For each $n = 1, \ldots, N$: Solve the subproblem $(S^n)$ associated with $g^k$ to get the optimal value $\omega^n(g^k)$ and a subgradient $\pi^{n,k}$.
   b. Terminate the algorithm if $\sum_n p^n \eta^{n,k} = \sum_n p^n \omega^n(g^k)$.
   c. Solve the $k$-th lower-approximation $(M_k)$ to obtain an optimal solution $g^{k+1}$ and an augmented lower approximation $\eta^{n,k+1}$.

In this algorithm, we add $N$ new linear constraints per outer iteration. Notice that $\eta^{n,k}$ is the approximation of $\omega^n(g^k)$ given by the $k$-th piecewise-linear lower approximation from (6), and so $\sum_n p^n \eta^{n,k} \le \sum_n p^n \omega^n(g^k)$ for every $k$. Thus, the algorithm terminates when the $k$-th master problem $(M_k)$ finds an optimal $g$ where the lower approximation matches the true function $\omega^n$.

The $k$-th iteration of this algorithm solves $N$ second-stage ED problems $(S^n)$ and one master-stage approximation $(M_k)$ using global optimization techniques. For the subproblems $(S^n)$, we can use the dual formulation from Phan [2012] or Lavaei and Low [2012], both of which appear to satisfy Assumption 2 in real-world instances. Each of the lower-approximation problems $(M_k)$ is itself the classical nonconvex ED problem augmented with a set of linear inequalities defining the subgradients. Efficient global optimization algorithms, such as the ones described in Phan [2012] or Tawarmalani and Sahinidis [2005], can be used to solve the problem to optimality. We are thus able to give the following convergence result for this algorithm.

THEOREM 3.2. Suppose that the set of subgradients $\{\pi^{n,k}\}$ is uniformly bounded, that is, $\sup_{n,k} \|\pi^{n,k}\|_2 < \infty$, and Assumptions 1 and 2 hold. Then, Algorithm 1 either reaches a global optimal solution in a finite number of iterations or generates a sequence $\{g^k\}_{k=1,\ldots}$ such that

$$
\lim_{k \to \infty} \Phi(g^k) = \Phi^*,
$$

where $\Phi^*$ is the global optimal value of (M).

PROOF. First, note that for every $k$, $(g^k, (s^0)^k)$ is a feasible solution to problem (M). Set $\mu^n_j(g) = \omega^n(g^j) + (\pi^{n,j})^T (g - g^j)$ for all $j = 1, \ldots, k$, and note from Theorem 3.1 that each $\mu^n_j(g)$ is a lower bound of $\omega^n(g)$. If the algorithm terminates in a finite number of iterations, this implies that it stops at Step 2(b). We now show that $\Phi(g^k) = \Phi^*$ if Step 2(b) is satisfied by $g^k$. We have for every feasible point $(g, s^0)$ of problem (M),

$$
\begin{aligned}
f(g^k) + C((s^0)^k) + \sum_n p^n \omega^n(g^k)
&= f(g^k) + C((s^0)^k) + \sum_n p^n \eta^{n,k} \\
&\le f(g) + C(s^0) + \sum_n p^n \max_{j=1,\ldots,k-1} \mu^n_j(g) \\
&\le f(g) + C(s^0) + \sum_n p^n \omega^n(g).
\end{aligned}
$$

The first inequality is by definition of the master problem (M) and the fact that at optimality $\eta^{n,k} = \max_{j=1,\ldots,k-1} \{\mu^n_j(g^k)\}$. The last inequality is because $\mu^n_j$ are lower bounds on $\omega^n$. This implies that $g^k$ is a global optimal solution of (M).

Now suppose that the sequence of the master problem solutions $\{g^k\}_{k=1,\ldots}$ is infinite. It suffices to show that for any $\epsilon > 0$, the set $I_\epsilon = \{k : \Phi^* < \Phi(g^k) - \epsilon\}$ is finite, and thus a limit of the postulated form exists. Let $k_1, k_2 \in I_\epsilon$ and $k_2 > k_1$. At the $k_1$-th iteration, we use Theorem 3.1 to define the optimality cut

$$
\omega^n(g^{k_1}) + (\pi^{n,k_1})^T (g - g^{k_1}) \le \eta^n. \tag{7}
$$


Since $k_2 > k_1$, plugging $(g^{k_2}, \eta^{n,k_2})$ into (7) yields

$$
\begin{aligned}
& \sum_n \omega^n(g^{k_1})\, p^n + \sum_n (\pi^{n,k_1})^T (g^{k_2} - g^{k_1})\, p^n \le \sum_n \eta^{n,k_2}\, p^n \\
\Rightarrow\ \ & \Phi(g^{k_1}) + \sum_n (\pi^{n,k_1})^T (g^{k_2} - g^{k_1})\, p^n + f(g^{k_2}) \le \sum_n \eta^{n,k_2}\, p^n + f(g^{k_1}) + f(g^{k_2}) \le f(g^{k_1}) + \Phi^*,
\end{aligned}
$$

since the optimal value of the master is a lower bound of $\Phi^*$. Hence,

$$
-\Phi(g^{k_1}) - \sum_n (\pi^{n,k_1})^T (g^{k_2} - g^{k_1})\, p^n - f(g^{k_2}) + f(g^{k_1}) \ \ge\ -\Phi^* \ >\ \epsilon - \Phi(g^{k_2}),
$$

since $k_2 \in I_\epsilon$. It follows that $\sum_n \big(\omega^n(g^{k_2}) - \omega^n(g^{k_1}) - (\pi^{n,k_1})^T (g^{k_2} - g^{k_1})\big)\, p^n > \epsilon$. Note that $|\omega^n(g^{k_2}) - \omega^n(g^{k_1})| \le \|\pi^{n,k_2}\| \, \|g^{k_2} - g^{k_1}\|$. Because of the uniform boundedness condition on $\pi^{n,k}$, there exists a constant $T$ such that $\|\pi^{n,k}\| \le T$ for any $n$ and $k$. Thus, we get

$$
\|g^{k_2} - g^{k_1}\| > \frac{\epsilon}{2T \max_n p^n}
$$

for all $k_1, k_2 \in I_\epsilon$. Since $\{g^k\}_{k=1,\ldots}$ is contained in a compact set, this implies that $I_\epsilon$ is finite.

Notice that we do not need to impose any convexity assumption for function components of $f$ and $C$ to obtain the global convergence. The key ingredient that helps our algorithm work is the zero duality gap property of the nonconvex second-stage subproblem $(S^n)$. This leads to the convexity of the recourse functions $\omega^n$ in Theorem 3.1, which in turn lets the global optimal solutions to the sequence of outer approximation problems in Algorithm 1 have a limit point that is globally optimal to (M), as shown in Theorem 3.2.

Remark 3.3. The presented convergence result and proof do not change if we weaken Assumptions 1 and 2 to hold only for the sequence of points $g$ visited by the algorithm rather than the entire feasible region $\Omega$. Thus, Algorithm 1 requires that in any iteration $k$, the subproblem $(S^n)$ associated with the optimal solution $g^k$ of $(M_k)$ is feasible. Furthermore, there exists a dual problem for $(S^n)$ related to $g^k$ that has a zero duality gap. If the zero duality gap condition is not found to hold, we can only guarantee that a locally optimal solution is given by the algorithm.

We will now describe a variant of the approximation problem $(M_k)$. An alternative cut can be derived by aggregating optimality cuts (6) for all scenarios:

$$
\eta = \sum_{n=1}^{N} p^n \eta^n \ \ge\ \sum_{n=1}^{N} \omega^n(g^j)\, p^n + \sum_{n=1}^{N} (\pi^{n,j})^T (g - g^j)\, p^n, \qquad \forall j = 1, \ldots, k. \tag{8}
$$
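To make the aggregated cut (8) concrete, the following sketch runs a single-cut outer approximation loop on a toy one-dimensional problem. Everything specific here is an assumption made for illustration: the first-stage cost is linear (`GEN_COST`), the recourse is a shortfall penalty (`PRICE`), there are three equally likely demand scenarios, and the master problem is solved by enumerating breakpoints rather than by a global OPF solver. Exact rational arithmetic is used so that the equality test in the termination step is meaningful.

```python
from fractions import Fraction

PRICE = Fraction(5)      # hypothetical spot price per unit of shortfall
GEN_COST = Fraction(2)   # hypothetical first-stage cost per unit of g


def recourse(g, demand):
    """Toy second-stage cost PRICE * max(demand - g, 0) and one subgradient."""
    if g < demand:
        return PRICE * (demand - g), -PRICE
    return Fraction(0), Fraction(0)


def solve_master(cuts, lo, hi):
    """Minimise GEN_COST * g + max_j (a_j + b_j * g) over [lo, hi].

    The objective is piecewise linear and convex, so a minimiser lies at an
    interval end or where two cuts cross; all such points are enumerated.
    """
    def model(g):
        return max(a + b * g for a, b in cuts)

    candidates = [lo, hi]
    for i, (a1, b1) in enumerate(cuts):
        for a2, b2 in cuts[i + 1:]:
            if b1 != b2:
                x = (a2 - a1) / (b1 - b2)
                if lo <= x <= hi:
                    candidates.append(x)
    g_best = min(candidates, key=lambda g: GEN_COST * g + model(g))
    return g_best, model(g_best)


def outer_approximation(demands, probs, lo, hi, max_iter=50):
    g, eta = lo, Fraction(0)  # first step: GEN_COST > 0 gives g^1 = lo; eta^1 = 0
    cuts = []
    for k in range(1, max_iter + 1):
        results = [recourse(g, d) for d in demands]       # solve each scenario
        omega = sum(p * v for p, (v, _) in zip(probs, results))
        slope = sum(p * s for p, (_, s) in zip(probs, results))
        if eta == omega:                                  # model matches recourse
            return g, GEN_COST * g + omega, k
        cuts.append((omega - slope * g, slope))           # one aggregated cut, as in (8)
        g, eta = solve_master(cuts, lo, hi)               # re-solve the master
    raise RuntimeError("no convergence within max_iter")


demands = [Fraction(2), Fraction(5), Fraction(8)]
probs = [Fraction(1, 3)] * 3
g_opt, cost, iterations = outer_approximation(demands, probs, Fraction(0), Fraction(10))
print(g_opt, cost, iterations)
```

Only one cut is appended per pass regardless of the number of scenarios, so the master problem stays small; the price is that each cut carries less information than $N$ separate scenario cuts would.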

The variant $(M'_k)$ of the lower-approximation problem $(M_k)$ is obtained by replacing the variables $\{\eta^n, n = 1, \ldots, N\}$ with the single variable $\eta$ and replacing the corresponding constraints (6) by constraints (8). The problem $(M'_k)$ is also a lower approximation of the master problem (M) and is an ED problem augmented with a set of linear constraints. The following alternate algorithm uses the variable $\eta$ and the variant $(M'_k)$:

OUTER APPROXIMATION ALGORITHM 2
1. Set $\eta^1 = 0$. Solve the following to get $g^1$:
   $$\min\ f(g) + C(s^0) \quad \text{s.t.} \quad (g, v^0, s^0) \in \Omega.$$
2. For $k = 1, 2, \ldots$
   a. For $n = 1, \ldots, N$: Solve the subproblem $(S^n)$ associated with $g^k$ to get the optimal value $\omega^n(g^k)$ and a subgradient $\pi^{n,k}$.
   b. Terminate the algorithm if $\eta^k = \sum_n p^n \omega^n(g^k)$.
   c. Solve the $k$-th lower-approximation $(M'_k)$ to obtain an optimal solution $g^{k+1}$ and $\eta^{k+1}$.

Steps 2(a) and 2(c) solve the ED problems with an appropriately chosen global optimizer, just as in Algorithm 1. The linear approximation constructed from this algorithm for the second-stage recourse function in the master problem $(M'_k)$ is slightly different from the previous one. We add only one cut per major iteration, so its size is smaller. A convergence result akin to Theorem 3.2 holds for this algorithm. The proof follows with minor modifications of the arguments in the proof of Theorem 3.2 and so is omitted.

THEOREM 3.4. Suppose that $\{\pi^{n,k}\}$ is uniformly bounded, that is, $\sup_{n,k} \|\pi^{n,k}\|_2 < \infty$, and Assumptions 1 and 2 hold. Then, Algorithm 2 either reaches an optimal solution in a finite number of iterations or generates a sequence $\{g^k\}_{k=1,\ldots}$ such that

$$
\lim_{k \to \infty} \Phi(g^k) = \Phi^*,
$$

where $\Phi^*$ is the optimal value of (M).

3.1. Consistency of SAA Formulation

To recap, this section has so far described algorithms to solve the SAA (4) of the stochastic formulation (3). Along the way, we utilized Assumption 2 in Theorem 3.1 to show that the recourse function $\omega_\xi(\cdot)$ that represents the second-stage problems in the master stage is convex for any realization $\xi$ of uncertainty. Theorems 3.2 and 3.4 state that under these conditions, both algorithms that solve the SAA problem produce a limit point that is an optimal solution to the SAA program. These results also turn out to be sufficient for the SAA to produce consistent estimates of the stochastic formulation (3) as the size $N$ of the sample set grows:

THEOREM 3.5. Let $\vartheta^*$ and $\vartheta_N$ be the optimal objective values of the stochastic formulation (3) and its $N$-sample approximation (4). Similarly, let $D^*$ and $D_N$ be the sets of optimal solutions of the two problems. If (i) the feasible set of the first-stage program (M) is compact, (ii) the expected value function $E[\omega_\xi(g)]$ is finite valued and continuous for $(g, v^0, s^0) \in \Omega$, and (iii) the LLN holds pointwise, $\sum_{n=1}^{N} \omega^n(g)/N \to E[\omega_\xi(g)]$ with probability 1, then

$$
\vartheta_N \to \vartheta^* \quad \text{and} \quad d(D_N, D^*) \to 0 \quad \text{w.p. 1 as } N \to \infty,
$$

where $d(\cdot, \cdot)$ denotes the Hausdorff set deviation metric.

Theorem 3.5 states that the sets $D_N$ are nonempty. The feasible set is often compact because of the upper and lower bounds placed on each of the decision variables. The proof follows directly from Theorem 5.3 in Shapiro et al. [2009], which applies to global optimization of continuous functions of the form $E[\omega_\xi(g)]$ over compact sets.
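The kind of convergence described by Theorem 3.5 can be observed on a toy problem whose true optimum is known in closed form. The sketch below is only an illustration under assumed data: the "recourse" is $(g - \xi)^2$ with $\xi$ uniform on $[0, 1]$, so the true optimal solution is $0.5$ and the true optimal value is $1/12$; none of this comes from the OPF model itself.

```python
import random


def solve_saa(samples):
    """Minimise (1/N) * sum_n (g - xi_n)^2 over g; the optimum is the sample mean."""
    n = len(samples)
    g_n = sum(samples) / n
    theta_n = sum((g_n - xi) ** 2 for xi in samples) / n
    return g_n, theta_n


# True optimum and optimal value of min_g E[(g - xi)^2] for xi ~ U(0, 1).
TRUE_G, TRUE_THETA = 0.5, 1.0 / 12.0

rng = random.Random(0)  # fixed seed so the run is repeatable
for n in (10, 1_000, 100_000):
    g_n, theta_n = solve_saa([rng.random() for _ in range(n)])
    print(f"N={n:>6}  |g_N - g*|={abs(g_n - TRUE_G):.5f}  "
          f"|theta_N - theta*|={abs(theta_n - TRUE_THETA):.5f}")
```

Both gaps shrink as the sample size grows, which is the behavior the consistency result guarantees with probability 1.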



3.2. Implementation in Practice

Today’s real-world power system operations utilize the solution of the classical deterministic ED problem as a basic decision support tool in maintaining a continuous close balance between generation and load throughout the network. The classical ED problem, however, replaces uncertain quantities such as the load with either a forecast or estimated value. Moreover, present formulations consider only a linear DC approximation of the nonlinear AC transmission constraints. Power system engineers take cognizance of these limitations with respect to real-time operations and keep some very fast response controls in place that can dynamically adjust to the evolving state of the grid. For example, NERC requires the system to maintain a minimum operating reserve level [North American Electric Reliability Corporation (NERC) Standard 2005] especially in the presence of renewables. (Utilities sometimes anticipate that 1 megawatt (MW) backup is stored for each MW wind capacity installed on their grid [Yong et al. 2009b].) One such option is described in Bevrani and Hiyama [2011], where some low-capacity fast-responding generators are designated as Automatic Generation Control (AGC) generating units. Dynamic response to changes in the load is effected by setting the output of these AGC units to track the real-time measurements of the system frequency (an observable surrogate of voltage and hence supply-demand mismatch) at the buses. This control frequently makes small adjustments to the output of the AGC units, thus effectively controlling voltage magnitude in real time. Our formulation explicitly takes into account many of the uncertainty parameters ignored by the deterministic formulation as well as the nonlinear power flow constraints. 
The solutions identified by our algorithms, especially the first-stage variables that represent slow-to-change large generator outputs, are thus conditioned better to these constraints and can be utilized to greater effect in the presence of renewable uncertainties. In a real-world system, we propose that the operator use the optimal values for $g$ from the first stage. Additionally, our formulation for the second stage closely tracks the current practice of adjusting the voltage parameters (via variables $(s^n, v^n)$) according to the realized scenario $W^n$. Thus, the output from the scenario associated with the sampled value $W^n$ that is closest to the actual value $W^\xi$ can also be used as dynamic control guidance. Alternatively, a second-stage problem with the realized value $W^\xi$ can be quickly solved to obtain better guidance.

4. ALTERNATING DIRECTION METHOD OF MULTIPLIERS

This section investigates an alternative approach to decomposing the large-scale program defined by the SAA problem (4) into smaller subproblems. We will reformulate the original problem to apply the ADMM [Glowinski and Marrocco 1975; Gabay and Mercier 1976; Eckstein and Bertsekas 1992] to derive our algorithm. ADMM belongs to the class of first-order primal-dual algorithms that update both primal and dual variables at each iteration. The method has been successfully applied in various real-world applications, including image and signal processing [Wang et al. 2008; Afonso et al. 2010; Esser 2009], statistics and machine learning [Boyd et al. 2011], and power system analysis [Kim and Baldick 1997, 2000] to solve the classical OPF problem.

We present the general ADMM approach first: consider the general optimization problem with a block separable structure

$$
\min_{x, y}\ \{F(x) + G(y) : Ax + By = d,\ x \in X,\ y \in Y\}, \tag{9}
$$

where $X \subset \mathbb{R}^n$, $Y \subset \mathbb{R}^m$, $A \in \mathbb{R}^{p \times n}$, $B \in \mathbb{R}^{p \times m}$, and $d \in \mathbb{R}^p$. We assume that $F$ and $G$ are closed, proper, convex, and differentiable. Let us form the augmented Lagrangian

$$
L_\beta(x, y, \lambda) = F(x) + G(y) + \lambda^T (Ax + By - d) + \frac{1}{2} \beta \|Ax + By - d\|^2, \tag{10}
$$

where $\beta > 0$. The classical augmented Lagrangian multiplier method [Hestenes 1969; Powell 1969] involves a joint optimization and a multiplier update step:

$$
\begin{aligned}
(x^{k+1}, y^{k+1}) &= \operatorname{argmin}_{x \in X,\, y \in Y} L_\beta(x, y, \lambda^k) \\
\lambda^{k+1} &= \lambda^k + \beta (A x^{k+1} + B y^{k+1} - d).
\end{aligned}
$$

In many applications, the first optimization problem is difficult to solve. The ADMM [Glowinski and Marrocco 1975; Gabay and Mercier 1976] consists of the iterations

$$
\begin{aligned}
x^{k+1} &= \operatorname{argmin}_{x \in X} L_\beta(x, y^k, \lambda^k) \\
y^{k+1} &= \operatorname{argmin}_{y \in Y} L_\beta(x^{k+1}, y, \lambda^k) \\
\lambda^{k+1} &= \lambda^k + \beta (A x^{k+1} + B y^{k+1} - d).
\end{aligned}
\tag{11}
$$

In adapting the approach (11) to the classical OPF problem, Kim and Baldick [1997] split the power grid into a number of separate regions and duplicated the variables representing new power flow in the overlap between regions. This allows each region to solve its subproblem and hence solve the OPF in a distributed fashion.

We now show how we can apply the method to solve (4). Observe that the optimization problem (4) can be reformulated by introducing auxiliary variables $g^n = \{g_i^n, i \in \mathcal{G}\}$, $n = 0, 1, \ldots, N$, in a form suitable for ADMM (recall that $n = 0$ represents the first stage and assumes $p^0 = 1$):

$$
\min\ f(g^0) + \sum_{n=0}^{N} p^n C(s^n) \tag{12}
$$

subject to

$$
S_i(v^n) =
\begin{cases}
s_i^n & \forall i \in \mathcal{S} \\
g_i^n & \forall i \in \mathcal{G} \\
W_i^n & \forall i \in \mathcal{W} \\
-D_i & \forall i \in \mathcal{D}
\end{cases},
\qquad n = 0, 1, \ldots, N \tag{13}
$$

$$
\begin{aligned}
& \text{(limit on line } (i, j) \text{ as a function of } v^n\text{)} && \forall (i, j) \in \mathcal{L} \\
& \underline{g}_i \le g_i^n \le \overline{g}_i && \forall i \in \mathcal{G} \\
& \underline{v}_i \le |v_i^n| \le \overline{v}_i && \forall i \in \mathcal{N}
\end{aligned}
$$

$$
g^n = g^0, \qquad n = 1, \ldots, N. \tag{14}
$$

We bring the constraints (14) into the objective in creating the following Lagrangian function for this reformulation:

$$
L_\beta\big((g^0, v^0, s^0, \lambda^0), \ldots, (g^N, v^N, s^N, \lambda^N)\big) = f(g^0) + \sum_{n=0}^{N} \Big[ p^n C(s^n) + (\lambda^n)^T (g^0 - g^n) + \frac{\beta}{2} \|g^0 - g^n\|^2 \Big].
$$
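The consensus structure of this Lagrangian can be tried out on a toy problem. The sketch below is a minimal illustration under stated assumptions: each block has a hypothetical scalar quadratic cost $0.5\, a_n (g_n - c_n)^2$ in place of the OPF subproblems, index 0 plays the role of the first stage, and every minimization is done in closed form. The multiplier step follows the residual $g^0 - g^n$ that multiplies $\lambda^n$ in the Lagrangian, as in the generic update (11).

```python
def consensus_admm(a, c, beta=1.0, eps=1e-8, max_iter=1000):
    """ADMM for min sum_n 0.5*a[n]*(g_n - c[n])**2 subject to g_n = g_0, n >= 1.

    Index 0 is the master copy; the quadratic toy costs give closed-form updates.
    Returns the final iterates and the number of iterations used.
    """
    n_scen = len(a) - 1
    g = [0.0] * (n_scen + 1)    # g[0] master copy, g[1:] scenario copies
    lam = [0.0] * (n_scen + 1)  # lam[0] is unused
    for k in range(1, max_iter + 1):
        g_prev = g[:]
        # Master update: minimise F_0(g) + sum_n [lam_n*(g - g_n) + beta/2*(g - g_n)^2].
        g[0] = (a[0] * c[0] - sum(lam[1:]) + beta * sum(g[1:])) / (a[0] + n_scen * beta)
        for n in range(1, n_scen + 1):
            # Scenario update: minimise F_n(g) + lam_n*(g_0 - g) + beta/2*(g_0 - g)^2.
            g[n] = (a[n] * c[n] + lam[n] + beta * g[0]) / (a[n] + beta)
            # Multiplier update along the coupling residual g_0 - g_n.
            lam[n] += beta * (g[0] - g[n])
        primal = max(abs(g[n] - g[0]) for n in range(1, n_scen + 1))
        change = max(abs(g[n] - g_prev[n]) for n in range(1, n_scen + 1))
        if primal <= eps and change <= eps:
            return g, k
    return g, max_iter


# Three blocks with unit curvature and targets 1, 2, 6: the consensus optimum is 3.
g_final, n_iter = consensus_admm(a=[1.0, 1.0, 1.0], c=[1.0, 2.0, 6.0])
print([round(x, 6) for x in g_final], n_iter)
```

Each scenario update depends only on the current master copy and its own multiplier, which is what allows the scenario blocks to be processed independently of one another.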

By applying (11) at each iteration, we are able to decompose the original problem into $N + 1$ subproblems and solve each scenario independently, where each scenario in turn solves a classical OPF formulation, just as in the decomposition Algorithms 1 and 2 presented in Section 3. The resulting ADMM algorithm is as follows:

ADMM ALGORITHM
1. Initialize $g^{n,0}, \lambda^{n,0}$, $\forall n = 1, \ldots, N$.
2. For $k = 1, 2, \ldots$
   a. Set
      $$(g^{0,k}, v^{0,k}, s^{0,k}) = \operatorname{argmin} \Big\{ f(g) + C(s) + \sum_{n=1}^{N} \Big[ (\lambda^{n,k-1})^T (g - g^{n,k-1}) + \frac{\beta}{2} \|g - g^{n,k-1}\|^2 \Big] \Big\} \quad \text{s.t. } (g, v, s) \text{ obeys (13) for } n = 0.$$
   b. For $n = 1, \ldots, N$: Set
      $$(g^{n,k}, v^{n,k}, s^{n,k}) = \operatorname{argmin} \Big\{ p^n C(s) + (\lambda^{n,k-1})^T (g^{0,k} - g) + \frac{\beta}{2} \|g^{0,k} - g\|^2 \Big\} \quad \text{s.t. } (g, v, s) \text{ obeys (13) for } n.$$
   c. Set $\lambda^{n,k} = \lambda^{n,k-1} + \beta (g^{0,k} - g^{n,k})$.
   d. Terminate if $\|g^{n,k+1} - g^{0,k+1}\| \le \epsilon$ and $\|g^{n,k+1} - g^{n,k}\| \le \epsilon$, $n = 1, \ldots, N$.

The stopping criterion used in Step 2(d) is popular in ADMM implementations [Boyd et al. 2011].

5. NUMERICAL EXPERIMENTS

Table I. Test Systems Characteristics

Cases     |N|   |G|   |L|   |W|   |S|   N      Variables
CH9       9     3     9     2     2     9      276
IEEE14    14    5     20    3     2     27     1338
IEEE30    30    6     41    4     2     81     6792
NE39      39    10    46    5     2     243    26718
IEEE57    57    7     80    6     2     729    102048
IEEE118   118   54    186   7     3     2187   787304
IEEE300   300   69    411   7     3     2187   1648982

We use the following electrical power systems to test our proposed algorithms. Their characteristics are summarized in Table I.
- CH9: the 9 bus example from [Chow et al. 2003, p. 70]
- NE39: the New England system [Pai 1989]
- IEEE14, IEEE30, IEEE57, IEEE118, and IEEE300: five IEEE system examples, with more details found at http://www.ee.washington.edu/research/pstca/

The first column in Table I shows the abbreviations of the systems, whereas the second and third columns show the total number of buses and the number of generators in each system. The fourth column reports the number of branches interconnecting the buses. We artificially added different sources of renewable generation and spot markets, as presented in the fifth and sixth columns. This is in keeping with the handful of articles on renewable integration with ED [Yong et al. 2009a; Jabr and Pal 2009] and is due to the lack of detailed design data and publicly accessible network topology information for networks that incorporate renewable generation at a large scale. The numbers of scenarios and decision variables are given in the seventh and the last columns, respectively.

The algorithms were coded in Matlab, and experiments were carried out on a PC using Matlab 7.10 with an Intel Xeon X5570 2.93GHz under the Linux operating system. The Lagrangian dual algorithm in Phan [2012] was used for solving subproblems $(S^n)$. The master problems and subproblems of ADMM were solved by the global optimization algorithm in Phan [2012]. We terminated the algorithms with tolerance $\epsilon = 10^{-3}$. In particular, we used

$$
\sum_n \big( \omega^n(g^{k+1}) - \eta^{n,k+1} \big)\, p^n \le \epsilon \qquad \text{(Algorithm 1)} \tag{15}
$$

and

$$
\sum_n \omega^n(g^{k+1})\, p^n - \eta^{k+1} \le \epsilon \qquad \text{(Algorithm 2).} \tag{16}
$$

Table II. Performance of Algorithm 1 Averaged over 50 Instances (CPU Time is in Seconds)

          Negative Correlation          Zero Correlation              Positive Correlation
Cases     CPU Time         Iter.        CPU Time         Iter.        CPU Time         Iter.
CH9       0.16 ± 0.00      2.0 ± 0.0    0.16 ± 0.00      2.0 ± 0.0    0.16 ± 0.01      2.0 ± 0.0
IEEE14    1.09 ± 0.06      4.6 ± 0.3    1.09 ± 0.07      4.6 ± 0.3    1.12 ± 0.07      4.6 ± 0.3
IEEE30    3.16 ± 0.22      4.1 ± 0.2    3.08 ± 0.24      4.1 ± 0.2    2.99 ± 0.12      3.9 ± 0.1
NE39      4.31 ± 0.12      2.0 ± 0.0    6.01 ± 0.52      2.4 ± 0.1    7.10 ± 0.53      2.6 ± 0.1
IEEE57    22.65 ± 1.14     3.3 ± 0.1    22.92 ± 1.26     3.3 ± 0.1    22.89 ± 1.10     3.2 ± 0.1
IEEE118   256.65 ± 97.05   3.8 ± 0.7    225.14 ± 40.33   3.8 ± 0.4    211.56 ± 12.49   3.8 ± 0.1
IEEE300   267.23 ± 13.06   3.7 ± 0.1    264.73 ± 18.50   3.4 ± 0.1    287.17 ± 20.74   3.4 ± 0.1

In our numerical experiments, we assume that the generation cost function fi is convex quadratic, and the cost parameters are obtained from Matpower 4.1 [Zimmerman et al. 2011]. The spot market costs Ci (si ) are convex quadratic over si > 0 and linear over si < 0. The cost coefficients for the imported-energy function Ci (si ) (i.e., when si > 0) were chosen to be 1.5 to 2.0 times higher than the largest coefficients of the generation cost function fi to enforce a higher cost than bulk generation cost if the energy has to be purchased from the spot market. Meanwhile, the coefficients of the cost of exported energy function Ci (si ) (i.e., when si < 0) were taken to be 0.5 to 0.7 times the smallest linear coefficients of the fi to discourage incentivizing excess bulk generation. The wind farm outputs are assumed to obey multivariate Gaussian distributions. We set the mean power output of each wind farm to 10% of the average of the upper limits gi of the capacity of conventional generators in the local neighborhood. (This is defined as the shortest depth subnetwork connected to the wind farm node that contains at least one conventional generator.) The standard deviation for each generator is set to 15% of its mean value. Further, we study the effect of correlated wind-output on the performance of the algorithms. One can reasonably expect the effect of the correlation to change with the physical scale of the electrical network, where geographically close generation sources experience higher correlation, whereas more diverse locations tend to be more weakly correlated. 
In the absence of adequate historical data to derive correlated input models, we test the algorithms under these three settings of equal correlation between all of the wind farms: (1) the lowest negative equal-correlation configuration that is feasible for the Gaussian random vector of wind farm outputs (which depends on the dimension of the random vector and in our different test cases was within the range [−0.35, −0.2]), (2) the independent case of zero correlations, and (3) an equal positive moderate correlation of +0.5. For each of these settings, the two algorithms were run over 50 instances, where each instance samples the $3^{|\mathcal{W}|}$ scenarios independently from the multivariate Gaussian distribution with the specified correlation matrix.

Table II reports the CPU time spent and number of outer iterations in Algorithm 1 as a 95% confidence interval. Tables III and IV provide the same metrics for the second outer approximation algorithm and the ADMM described in Section 4. The tables together show that our proposed outer approximation algorithms converge quickly, requiring from two to six iterations to satisfy the error tolerance. In the smaller-sized network cases, the two algorithms use the same number of iterations and similar CPU time to solve the problem. Moreover, the large number of scenarios does not seem to affect the number of outer iterations in both algorithms, although some variance does show up. For these large cases, the running time for Algorithm 2 is observed to be progressively less than that of Algorithm 1, whereas the number of iterations is less for Algorithm 1. The outer approximation provided by the individual linear cuts from Algorithm 1 provides a tighter bound to $\omega^n(\cdot)$, and so the required iterations never exceed those of Algorithm 2. On the other hand, Algorithm 2 uses fewer linear functions to provide the outer approximation $(M'_k)$, and hence solvers take less time solving the Algorithm 2 problems. The presence of correlation does seem to have a minor effect on the runtimes of the algorithms, but no clear pattern is evident in the runtime and the magnitude of the correlation over all of the test cases. Finally, the ADMM approach requires an order of magnitude more iterations and CPU time but still scales well with the size of problems. This is in line with the fact that ADMM is a general purpose algorithm that places few restrictions on the problem structure.

Figure 2(a) takes a closer look at the effect on the runtimes of the algorithms of the size of the scenario sets generated for the second stage. The runtimes exhibit a near-linear growth with the size of the scenario set, and the second algorithm has a slightly lower runtime. In Figure 2(b), we report the effect of varying the standard deviation of the uncertainty of renewable generation on the total expected cost for the 30-bus test case. We use the average of 100 runs for each deviation. As the standard deviation increases, so does the cost, and the increase seems linear. Figure 2(c) plots the errors as defined on the left-hand side of (15) and (16) versus the number of iterations for two outer approximation algorithms to solve the 30-bus test case.

Table III. Performance of Algorithm 2 Averaged over 50 Instances (CPU Time is in Seconds)

          Negative Correlation          Zero Correlation              Positive Correlation
Cases     CPU Time         Iter.        CPU Time         Iter.        CPU Time         Iter.
CH9       0.15 ± 0.00      2.0 ± 0.0    0.15 ± 0.00      2.0 ± 0.0    0.15 ± 0.00      2.0 ± 0.0
IEEE14    1.05 ± 0.06      4.6 ± 0.3    1.06 ± 0.06      4.6 ± 0.3    1.09 ± 0.07      4.6 ± 0.3
IEEE30    3.10 ± 0.21      4.1 ± 0.2    3.04 ± 0.23      4.1 ± 0.2    2.96 ± 0.12      3.9 ± 0.1
NE39      4.29 ± 0.12      2.0 ± 0.0    5.98 ± 0.52      2.4 ± 0.1    7.02 ± 0.52      2.6 ± 0.1
IEEE57    22.20 ± 1.10     3.3 ± 0.1    22.44 ± 1.25     3.3 ± 0.1    22.50 ± 1.10     3.2 ± 0.1
IEEE118   174.07 ± 38.93   4.0 ± 0.8    192.13 ± 37.43   4.4 ± 0.7    156.84 ± 7.99    3.8 ± 0.1
IEEE300   201.46 ± 7.76    3.7 ± 0.1    205.22 ± 11.72   3.4 ± 0.1    232.69 ± 16.41   3.5 ± 0.2

Table IV. Performance of ADMM Averaged over 50 Instances (CPU Time is in Seconds)

          Negative Correlation          Zero Correlation              Positive Correlation
Cases     CPU Time         Iter.        CPU Time         Iter.        CPU Time         Iter.
CH9       2.13 ± 0.16      28.4 ± 5.2   2.14 ± 0.13      28.6 ± 5.4   2.14 ± 0.15      28.4 ± 5.4
IEEE14    8.86 ± 1.36      48.2 ± 8.7   8.73 ± 1.29      46.5 ± 8.6   8.75 ± 1.27      46.6 ± 8.7
IEEE30    29.65 ± 5.54     45.2 ± 7.5   34.57 ± 5.94     53.8 ± 7.7   38.02 ± 6.76     62.3 ± 13.2
NE39      74.01 ± 10.64    47.5 ± 4.9   80.42 ± 12.57    51.1 ± 5.3   87.05 ± 12.33    55.2 ± 5.4
IEEE57    232.40 ± 39.75   36.4 ± 12.0  240.66 ± 36.30   37.8 ± 10.3  209.67 ± 25.45   34.6 ± 9.5
IEEE118   974.61 ± 112.5   27.6 ± 5.8   991.02 ± 104.4   28.4 ± 6.2   945.73 ± 93.72   26.3 ± 6.4
IEEE300   3180.3 ± 348.2   59.3 ± 8.1   2967.8 ± 305.1   56.1 ± 7.5   3055.8 ± 312.0   58.1 ± 7.7
Fig. 2. Performance of Algorithms 1 and 2.

The serial implementations of our proposed algorithms are clearly suitable for solving large-scale power systems with large numbers of scenarios. Moreover, the running time can be reduced significantly if we exploit the decomposable characteristic of these algorithms. Each iteration of the algorithms runs the master-stage problem first, and then multiple second-stage problems are run based on the output of the master-stage problem. The latter second-stage problems have no dependency on each other and hence can be solved in parallel in order to speed up the overall runtime. Let $\text{time}_m$ ($\text{time}_s$) represent the cumulative time spent in the master problem (subproblem) by each algorithm. One can estimate the runtime $\text{time}_p$ of such a parallel implementation as

$$
\text{time}_p = \text{time}_m + \kappa\, \frac{\text{time}_s}{\min(\#\text{proc}, N)},
$$

where $\kappa$ represents the efficiency factor of the parallel implementation, $\#\text{proc}$ represents the total parallel processes that are run, and $N$ is the number of scenarios as defined earlier. The factor $\kappa$ is strictly greater than 1.0 due to overheads such as message passing. In Table V, we report the estimated running times for parallel implementations in a workstation with 32 processors. The total runtimes for the algorithms are from the Zero Correlation columns of Tables II and III. The efficiency factor $\kappa$ is assumed to be a typically observed value of 1.4. The effect on the runtime of Algorithm 1 is muted because of the large number of linear inequality constraints added to the master problem $(M_k)$ per iteration, which makes $(M_k)$ a large problem to solve serially to obtain the next first-stage solutions to evaluate in parallel. In contrast, Algorithm 2 adds one linear constraint to the master problem $(M'_k)$ per iteration and hence manages to keep its $\text{time}_m$ low, thus benefiting greatly from the parallelism. The structure of the master-stage problems of the ADMM algorithm does not change, and so ADMM too shows an improved estimated runtime.

Table V. Estimated Running Time (in seconds) of Parallel Implementations

          Algorithm 1                     Algorithm 2                   ADMM
Cases     time_m         time_p           time_m        time_p          time_m        time_p
CH9       0.04 ± 0.00    0.06 ± 0.00      0.04 ± 0.00   0.06 ± 0.00     0.51 ± 0.03   0.77 ± 0.05
IEEE14    0.09 ± 0.01    0.15 ± 0.01      0.08 ± 0.00   0.13 ± 0.01     0.87 ± 0.04   1.28 ± 0.11
IEEE30    0.11 ± 0.01    0.24 ± 0.02      0.08 ± 0.01   0.22 ± 0.02     1.26 ± 0.04   2.74 ± 0.30
NE39      0.13 ± 0.01    0.39 ± 0.03      0.08 ± 0.00   0.34 ± 0.03     1.54 ± 0.08   5.06 ± 0.64
IEEE57    0.27 ± 0.02    1.28 ± 0.08      0.09 ± 0.01   1.09 ± 0.06     1.32 ± 0.05   12.01 ± 1.66
IEEE118   26.08 ± 8.93   34.97 ± 10.32    0.25 ± 0.05   8.81 ± 1.72     1.66 ± 0.11   45.83 ± 4.76
IEEE300   31.63 ± 4.47   42.04 ± 5.08     0.39 ± 0.02   9.54 ± 0.54     7.04 ± 0.63   139.22 ± 14.2
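The parallel runtime estimate is simple enough to evaluate directly. The helper below is a small sketch of that formula; the function name, argument names, and the sample inputs are assumptions chosen for illustration and are not figures reported in the article.

```python
def parallel_runtime(time_master, time_sub, n_proc, n_scenarios, kappa=1.4):
    """Estimate time_p = time_m + kappa * time_s / min(#proc, N)."""
    if kappa <= 1.0:
        raise ValueError("kappa models parallel overhead and is expected to exceed 1")
    return time_master + kappa * time_sub / min(n_proc, n_scenarios)


# Illustrative inputs only: 0.04 s in the master problem, 0.11 s in the
# subproblems, 9 scenarios, and 32 available processes.
estimate = parallel_runtime(0.04, 0.11, n_proc=32, n_scenarios=9)
print(f"{estimate:.4f}")  # prints 0.0571
```

With fewer scenarios than processes, the divisor is the scenario count, so extra processes beyond $N$ bring no further reduction in this estimate.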



6. CONCLUSIONS

This article proposes two algorithms to solve a two-stage nonconvex stochastic formulation for the ED problem faced by power transmission authorities under renewable-generation uncertainty. The structure of the formulation hews to standard two-stage problems, in that certain generation decisions are made only in the first stage and the second stage realizes the actual renewable generation. Recourse for alleviating supply-demand mismatches in the second stage is provided by high marginal cost power sources that can be tapped in short order. We provide methods to solve an SAA of the stochastic formulation, where the uncertainty is captured by a finite number of scenarios. Our contributions to the literature lie in the novel outer approximation algorithms with which we propose to solve this nonconvex SAA problem to optimality. We show that for problem instances that satisfy certain conditions, an effective decomposition scheme can be used, just like in two-stage linear programs, to obtain a sequence of approximate solutions that has a limit point that is a globally optimal solution to the two-stage nonconvex SAA program. In particular, these methods are well suited for a parallel implementation; we provide estimates of expected speed-up under such an implementation for the numerical experiments described here. We also establish the consistency of the SAA approximations with respect to the true stochastic formulation as the size of the scenario set grows. We describe an alternative decomposition approach derived from the ADMM and compare our methods to this ADMM variant. Our experiments for a variety of parameter settings indicate that the two approximation schemes that we described are efficient and usable even in large practical instances. They perform an order of magnitude better on average than the ADMM-based decomposition.

ACKNOWLEDGMENTS

This material is based upon work supported by the U.S. Department of Energy under Award Number DEOE0000190.

Received December 2011; revised March 2013; accepted August 2013

