
Online Strategizing Distributed Renewable Energy Resource Access in Islanded Microgrids

Xi Fang, Dejun Yang, and Guoliang Xue
Arizona State University

Abstract: The smart grid, perceived as the next generation power grid, uses two-way flow of electricity and information to create a widely distributed automated energy delivery network. By grouping distributed renewable energy generations and loads, a microgrid, which is seen as one of the cornerstones of the future smart grids, can disconnect from the macrogrid and function autonomously. This intentional islanding of generations and loads has the potential to provide a higher local reliability than that provided by the power system as a whole. One of the fundamental issues for a user in an islanded microgrid is how to find the one among the distributed renewable energy resources (DRERs) in a microgrid, which can supply the power most efficiently, effectively and reliably, as its power supply source. This problem is difficult since the power pattern of renewable resources, such as wind and solar, is variable and generally speaking is not easy to accurately predict. In order to solve this problem, we first propose a distributed DRER discovery approach to discover all the available DRERs within a microgrid. Furthermore, based on the online machine learning theory, we propose two distributed algorithms according to the information the user can obtain, in order to compute a good DRER access strategy, with no assumption on what distribution the power patterns of the DRERs follow. We prove that when the time horizon is sufficiently large, on average the upper bound on the gap between the expected profit obtained at each time slot by using the global optimal strategy and that by using our algorithms is arbitrarily small.

Keywords: smart grid, microgrid, distributed renewable energy resource, online learning, resource discovery

Fang, Yang, and Xue are all affiliated with Arizona State University, Tempe, AZ. Email: {xi.fang, dejun.yang, xue}@asu.edu. This research was supported in part by NSF grant 0901451 and ARO grant W911NF-09-10467. The information reported here does not reflect the position or the policy of the federal government.

I. INTRODUCTION

Traditionally, the term grid is used for an electricity network which may support all or some of the following four operations: electricity generation, electricity transmission, electricity distribution, and electricity control. A smart grid is an upgrade of the 20th century power grids, which generally “broadcast” power from a few central generators to a large number of users. A smart grid uses two-way flow of electricity and information to create a widely distributed automated energy delivery network. By integrating distributed computing and communications to deliver real-time information, the smart grid enables the near-instantaneous balance of supply and demand at the device level [16]. One of the key features in a smart grid is distributed energy generation using renewable resources in order to improve the power quality [14]. Distributed renewable energy resources (DRERs), such as solar panels and small wind turbines, are often small-scale power generators used to provide an alternative to or an enhancement of the traditional power system. Two-way energy and information flows integrated with DRERs drive the development of a new grid paradigm, called microgrid, which is seen as one of the cornerstones of the future smart grids [4].

A microgrid is a localized grouping of electricity generations and loads, which normally operates connected to a traditional power grid (macrogrid). The single point of common coupling with the macrogrid can be disconnected. The microgrid can then function autonomously [9]. This operation results in an islanded microgrid [12], in which distributed generators continue to power the users in this microgrid without obtaining power from the electric utility in the macrogrid. Fig. 1 shows an example of a microgrid. Thus, the multiple DRERs and the ability to isolate the microgrid from a larger network during a disturbance would provide a highly reliable electric power supply. This intentional islanding of generations and loads has the potential to provide a higher local reliability than that provided by the power system as a whole [10].

Fig. 1. An example of a microgrid: the blue layer shows the information flow within this microgrid and the red layer shows the power flow.

Although the microgrid is conceptually simple, we still face many challenges when implementing a microgrid in real networks. One of the fundamental issues for a user in an islanded microgrid is how to find the one among the DRERs in a microgrid, which can supply the power most efficiently, effectively and reliably, as its power supply source. We call this issue the DRER access strategy problem. More specifically, the question for a user is first how to know all the available DRERs in an islanded microgrid, and second how to find the DRER which can deliver the highest profit. The second question is especially challenging since the power pattern of a renewable energy source is usually variable or intermittent [8]. Generally speaking, in practice the power pattern does not

follow a stationary or simple distribution, or the right statistical distribution estimate is not achievable, due to unpredictable environmental factors. This makes it very difficult for the user to estimate which DRER it should access to obtain the most profit.

Let us consider a simple motivating example, where one user takes a solar panel as its main power supply source (by buying power from another user who is equipped with this solar panel). Suppose, however, that a wind turbine can actually provide the most stable and reliable long-term power pattern. By using this wind turbine, the user would probably obtain the highest profit. The challenge is how the user can know beforehand that this wind turbine is the best power source candidate, since the power pattern of renewable resources is variable and generally not easy to predict. This is essentially a paradigmatic example of the trade-off between exploration and exploitation. On one hand, if a user exclusively selects the DRER it thinks is best as its power supply source (“exploitation”), it may fail to discover a DRER which is actually better. On the other hand, if this user spends too much time trying out all possible DRERs (“exploration”), it may fail to select the best one often enough to obtain a high profit.

For simplicity, we assume that at any time point a user can obtain power from at most one DRER and that it can have different DRERs as its power supply sources at different time points. In addition, we assume that the power can be intentionally delivered from a distributed generator to a user. In the future smart grid, this can be done by using power packet switching [17], in which the information is added to the electric power itself and power is distributed according to this information.
In this paper, we address this DRER access strategy problem, and our contribution is two-fold: 1) We propose a distributed DRER discovery approach, called power map construction procedure, to discover all the available DRERs within a microgrid. 2) Inspired by the online machine learning theory, we propose two distributed algorithms based on the information the user can obtain, in order to compute a good DRER access strategy, with no assumption on what distribution the power patterns of the DRERs follow. We prove that when the time horizon is sufficiently large, on average the upper bound on the gap between the expected profit obtained at each time slot by using the global optimal strategy and that by using our algorithms is arbitrarily small. It should be noted that some researchers have studied renewable energy generations and microgrids. Vandoorn et al. [18] presented a method for active load control in islanded microgrids. It is shown that with the combination of the active power control and the presented active load control, the renewable energy can be exploited optimally. Ochoa and Harrison [15] proposed a useful method to determine the optimal accommodation of renewable distributed generation by using multiperiod AC optimal power flow. Guan et al. [7] applied microgrid technology to solving the problem of how to save energy in buildings. Atwa et al. [1] aimed to

minimize energy loss through the optimal mix of statistically modeled renewable sources, considering a passive approach to network management. In addition, in order to optimize the renewable energy supply, the authors of [11, 13] used stochastic programming to cope with a time-varying power supply. To the best of our knowledge, this is the first paper to address the DRER access problem for islanded microgrids.

The rest of this paper is organized as follows. Section II describes our system model. In Section III, we present our power map construction procedure and distributed strategy computation algorithms. Section IV presents our numerical results. We conclude this paper in Section V.

II. SYSTEM MODEL

In this paper, we consider an islanded microgrid with K DRERs and N users. Each DRER is associated with one user, while each user could be equipped with zero or several DRERs. A user who needs power supply should try to obtain power from one of these K DRERs by buying power from the user who is equipped with this DRER. These users exchange information via an information network, such as a wireless mesh network, a powerline communication system [6], or a cellular system. In the initial phase of the island mode, each user knows neither the topology of the information network underlying this microgrid nor which user has which DRER(s). The power within this microgrid flows over a power grid. We reasonably assume that the topologies of both the information network and the power grid underlying this microgrid are connected. A user that is isolated in either topology is not considered a member of this microgrid.

Each user u may select a different DRER as its power supply source after a certain period of time, called a time slot. This DRER is called its strategy for this time slot. In time slot t, by accessing DRER k user u can obtain a utility of $U_k^{(u)}(t) \ge 0$, while paying $C_k^{(u)}(t) \ge 0$ for the cost charged by the user equipped with this DRER.
Therefore, user u can collect a profit of $P_k^{(u)}(t) = U_k^{(u)}(t) - C_k^{(u)}(t)$ in time slot t from DRER k. Considering that different users could have different utility functions and cost functions, we leave the users to define their own utility functions and cost functions. Note that we do not model the user demand pattern either. However, we do note that the profit obtained in time slot t is the result of the power pattern of the DRER, the utility function, the cost function, and the user demand pattern. Our task is to prove that our result holds generally as long as the profits are bounded, no matter what the utility and cost functions are, what the user demand pattern is, or what distribution the power supply pattern follows. Without loss of generality, we assume that $P_k^{(u)}(t)$ is bounded by $[0, P_H]$ for all $k \in [1, K]$ and for all $t = 1, 2, \cdots$.

Although in each time slot a user only chooses one DRER as its power supply source and collects some profit, at the end of a time slot it may learn the profit it would have obtained had it chosen another DRER in this time slot. We therefore consider two scenarios:

1) Each user u can know the profit $P_k^{(u)}(t)$ at the end of time slot t even if u does not choose the kth DRER as

(a) An example microgrid (b) After the first time interval (c) After the second time interval (d) Power maps

Fig. 2. An example of the power map construction procedure: Fig. 2(a) shows an example microgrid with five users and five DRERs. The blue dashed lines show the information network connecting these five users. Each red solid line shows an association relationship between a user and a DRER. Fig. 2(b) and Fig. 2(c) show the status after the first and the second time interval, respectively. The red labels represent the DRERs which are newly known in the current time interval, while the black ones represent the DRERs which are already known by this user. Note that after two time intervals, the DRERs in this microgrid have been known by all the users. Fig. 2(d) shows the power maps constructed by each user according to their requirement thresholds: $T_{MOP}$ and $T_{MSP}$.

its strategy for time slot t. We call this scenario the full information scenario. This is possible since at the end of each time slot the power pattern of each DRER in this time slot is known already. A user can estimate its profit for each DRER based on the power pattern and the cost notified by the user equipped with this DRER.

2) User u can only know the profit $P_k^{(u)}(t)$ at the end of time slot t if u chooses the kth DRER as its strategy for time slot t. The profit $P_i^{(u)}(t)$ for $i \neq k$ is “hidden”, so u will never know it. We call this scenario the partial information scenario. This is also possible. For example, the power pattern of a DRER may not be broadcast by the user equipped with this DRER due to privacy concerns.

Recall that we make no assumption on what distribution the power patterns of the DRERs follow or what the user demand pattern is, and place no requirement on the utility and cost functions (except for the bounded range). We thus summarize the value of the profit function as follows: for any $k \in [1, K]$, $\langle P_k^{(u)}(1), P_k^{(u)}(2), \ldots \rangle$ is a sequence of unknown arbitrary numbers bounded by a real interval $[0, P_H]$. In the full information scenario, the value of $P_k^{(u)}(t)$ for every $k \in [1, K]$ can be observed by user u at the end of time slot t. In the partial information scenario, the value of $P_k^{(u)}(t)$ can be observed at the end of time slot t only if u selects k as its strategy for time slot t.

Let T denote the duration (time horizon) of user u's power obtaining session. If user u follows algorithm A, which chooses a sequence of strategies $k^{(u)}(1) \in [1, K], \ldots, k^{(u)}(T) \in [1, K]$, its total profit till time slot $T > 0$ is computed as:

$$G_A^{(u)}(T) = \sum_{t=1}^{T} P_{k^{(u)}(t)}^{(u)}(t). \tag{1}$$

To quantify the performance, we study a typical measure: user u's regret, defined by

$$R_A^{(u)}(T) = G_{\max}^{(u)}(T) - G_A^{(u)}(T), \tag{2}$$

where $G_{\max}^{(u)}(T) = \max_{k \in [1,K]} \sum_{t=1}^{T} P_k^{(u)}(t)$ is u's total profit till T if it keeps choosing the global optimal strategy. Intuitively, having low regret means that the user does almost as well as the global optimal strategy would have done.
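The total profit in (1) and the regret in (2) can be computed directly once the whole profit sequence is known. The following is a minimal sketch for a single user, assuming the profits are stored as a T-by-K list of lists and DRERs are indexed from 0; the function names are illustrative and do not come from the paper.

```python
from typing import List, Sequence


def total_profit(profits: Sequence[Sequence[float]], choices: Sequence[int]) -> float:
    """Eq. (1): sum of the profits of the DRER chosen in each time slot."""
    return sum(profits[t][k] for t, k in enumerate(choices))


def best_fixed_profit(profits: Sequence[Sequence[float]]) -> float:
    """G_max: total profit of the single best DRER over the whole horizon."""
    num_drers = len(profits[0])
    return max(sum(row[k] for row in profits) for k in range(num_drers))


def regret(profits: Sequence[Sequence[float]], choices: Sequence[int]) -> float:
    """Eq. (2): gap between the best fixed DRER and the chosen sequence."""
    return best_fixed_profit(profits) - total_profit(profits, choices)


# Three time slots, two DRERs (0-indexed); the values are made up for illustration.
demo_profits: List[List[float]] = [[0.2, 0.9], [0.4, 0.8], [0.6, 0.1]]
demo_regret = regret(demo_profits, [0, 1, 0])
```

In this example the best fixed DRER is the second one, so a user that always picks it has zero regret, while the sequence `[0, 1, 0]` leaves a small positive gap.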

III. DRER ACCESS STRATEGY ANALYSIS

A. Power map construction procedure

In the initial phase of the island mode, each user must discover all the available DRERs in this microgrid, and then select a subset of these available DRERs as its candidate set of power supply sources based on the maximum output powers (MOP) and the minimum sale prices (MSP) of the DRERs. MOP and MSP are two attributes associated with each DRER. The user equipped with a DRER decides these attributes.

We present a distributed synchronous DRER discovery protocol to find all the available DRERs in this microgrid. Each user maintains a power map table, consisting of a series of power map table entries, one for each of its known DRERs. At the beginning, each user only knows the DRER(s) it is equipped with. The timeline is divided into a sequence of time intervals of a constant length. In each time interval, each user sends its power map table to all its immediate neighbors. At the end of each time interval, each user updates its power map table based on the new power map table entries received from its neighbors. Theorem 3.1 characterizes the number of time intervals required to discover all the available DRERs.

Theorem 3.1: After at most N time intervals, all the users will have discovered all the available DRERs within the microgrid. □

The proof is given in the Appendix.

Since different DRERs could provide different services, each user will only select the subset of DRERs which satisfy its requirements as the candidate set of its power supply sources. We call this candidate set the power map of this user. We consider two obvious requirements: MOP and MSP. More specifically, a DRER is included in the power map of a user if and only if the MOP of this DRER is no less than $T_{MOP}$ (an MOP threshold set by this user) and the MSP is no greater than $T_{MSP}$ (an MSP threshold set by this user).
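The discovery protocol and the threshold filter above can be simulated in a few lines. This is a sketch under stated assumptions rather than the authors' implementation: the entry layout `(drer_id, owner, mop, msp)`, the function names, and the symmetric adjacency lists are all hypothetical, and the loop stops as soon as one interval produces no update, which a simulator can observe directly.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical entry layout: (drer_id, owner, mop, msp).
Entry = Tuple[str, str, float, float]


def discover(neighbors: Dict[str, List[str]],
             owned: Dict[str, List[Entry]]) -> Tuple[Dict[str, Set[Entry]], int]:
    """Synchronous flooding of power map tables.

    Returns every user's final table and the number of time intervals
    in which at least one table changed.
    """
    tables: Dict[str, Set[Entry]] = {u: set(owned.get(u, [])) for u in neighbors}
    intervals = 0
    while True:
        # Every user sends the table it held at the start of the interval.
        sent = {u: set(table) for u, table in tables.items()}
        updated = False
        for u, nbrs in neighbors.items():
            for v in nbrs:
                new_entries = sent[v] - tables[u]
                if new_entries:
                    tables[u] |= new_entries
                    updated = True
        if not updated:
            return tables, intervals
        intervals += 1


def power_map(table: Set[Entry], t_mop: float, t_msp: float) -> Set[Entry]:
    """Keep the DRERs whose MOP is >= t_mop and whose MSP is <= t_msp."""
    return {e for e in table if e[2] >= t_mop and e[3] <= t_msp}


# Usage: three users on a line, H1 - H2 - H3, with made-up attribute values.
neighbors = {"H1": ["H2"], "H2": ["H1", "H3"], "H3": ["H2"]}
owned = {"H1": [("S1", "H1", 5.0, 0.3)], "H3": [("S3", "H3", 2.0, 0.1)]}
tables, intervals = discover(neighbors, owned)
```

On this line topology the two end users are two hops apart, so two intervals are needed before every table is complete. Calling `power_map(tables["H2"], 3.0, 0.5)` then keeps only the entry for `S1`.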
Note that other requirements, such as average output power and historical variance of output power, could also be used as filter criteria. Fig. 2 illustrates the power map construction procedure using a simple example.

Discussion: Although Theorem 3.1 indicates that this procedure terminates after at most N time intervals, in practice N might be much larger than the number of time intervals actually required (denoted by D). For example, the simulation results in Section IV show that the average number of time intervals required is no more than 9 even in a microgrid with 100 users. We thus propose the following

“guessing” scheme (an enhancement to our basic power map construction procedure) to decide when the procedure can be terminated earlier. An obvious stopping criterion is that no user updates its power map table in the current time interval, since no new power map table entries are received. We need the following simple global status check as a building block: every user sends one bit to a leader (elected using well-known leader-election algorithms) to report whether it performed an update. If the leader finds that no user performed an update, it sends back a one-bit termination signal. Although this global status check involves global information exchange, the overhead is very small. In order to reduce the number of global status checks, the “guessing” scheme works as follows: we perform a global status check after $2^0, 2^1, \ldots$ time intervals. Although D is unknown, the procedure terminates after at most $\lceil \log_2 D \rceil$ global status checks.

B. Strategy selection procedure

After constructing a power map, each user will compute a sequence of strategies to explore and exploit the DRERs in its power map, with the goal of keeping the regret as small as possible. Inspired by online machine learning algorithms [2, 3], we propose two algorithms, for the full information scenario and the partial information scenario, respectively.

1) Full information scenario: At the beginning of each time slot, each user u runs the FSSA algorithm shown in Algorithm 1 to compute the DRER it will access in the current time slot. The algorithm takes O(K) time for each time slot. FSSA is a variant of the Hedge algorithm [2] with different parameter settings. The basic idea is to choose DRER k at time slot t with probability proportional to $\exp\left(w_k(t) \ln\left(1 + \sqrt{2(\ln K)/T}\right)/P_H\right)$, where $w_k(t)$ is the total profit obtained by choosing DRER k up through time slot $t - 1$.
By using this mechanism, the DRER leading to more profit quickly gains a higher probability of being chosen.

Algorithm 1 Full information strategy selection algorithm (FSSA)
1: if t = 1 then
2:   for each $k \in [1, K]$, $w_k^{(u)}(1) = 0$
3: else
4:   for each $k \in [1, K]$, $w_k^{(u)}(t) = w_k^{(u)}(t-1) + P_k^{(u)}(t-1)$
5: choose DRER k according to the following distribution:

$$p_k^{(u)}(t) = \frac{\exp\left(w_k^{(u)}(t) \ln\left(1 + \sqrt{2(\ln K)/T}\right)/P_H\right)}{\sum_{i=1}^{K} \exp\left(w_i^{(u)}(t) \ln\left(1 + \sqrt{2(\ln K)/T}\right)/P_H\right)}$$

The following theorem characterizes the properties of this algorithm, and the proof is given in the Appendix.

Theorem 3.2: For any sequence of profits $P_k^{(u)}(t)$, for any $K > 0$, and for any $T > 0$, the following properties hold for any user u:
1) $E(R_{FSSA}^{(u)}(T)) \le P_H \sqrt{2T \ln K}$;
2) $\lim_{T \to \infty} E\left(\frac{R_{FSSA}^{(u)}(T)}{T}\right) \le 0$. □

Remark: The first property indicates that the gap between a user's expected profit obtained by choosing the global optimal strategy and that by using FSSA is upper bounded (say, sublinear in T). The second property implies that when T is sufficiently large, on average, the upper bound on the gap between a user's profit obtained at each time slot by choosing the global optimal strategy and that by using FSSA can be arbitrarily small.

2) Partial information scenario: Similar to FSSA, at the beginning of each time slot, each user u runs the PSSA algorithm shown in Algorithm 2 to compute the DRER it will use in the current time slot. The algorithm takes O(K) time for each time slot. PSSA is a variant of the Exponential-weight algorithm [3] with different parameter settings and a different computation method for estimated profits (explained below). Note that for the actually chosen DRER $k^{(u)}(t)$, PSSA sets the estimated profit $\hat{P}_i^{(u)}(t)$ to $\frac{P_i^{(u)}(t)}{p_i^{(u)}(t) P_H}$. Intuitively, dividing the actual profit by the probability that the strategy was chosen compensates the profits of strategies that are unlikely to be chosen. Note that, like FSSA, in each time slot the user chooses its strategy according to a computed probability, as shown in Line 8. However, in PSSA, this distribution is a mixture of the uniform distribution and a distribution which assigns to each strategy a probability mass exponential in the estimated cumulative profit for that strategy. Intuitively, the uniform distribution forces the algorithm to try out all the strategies (i.e., “exploration”) and helps the algorithm to get good estimates of the profits for each strategy. The other distribution, which is related to the cumulative estimated profit for that strategy, helps the DRER leading to more profit quickly gain a higher probability of being chosen (i.e., “exploitation”).

Algorithm 2 Partial information strategy selection algorithm (PSSA)
1: $\gamma = \min\left\{1, \sqrt{\frac{K \ln K}{(e-1)T}}\right\}$
2: if t = 1 then
3:   for each $i \in [1, K]$, $w_i^{(u)}(1) = 1$
4: else
5:   for each $i \in [1, K]$ do
6:     $\hat{P}_i^{(u)}(t-1) = \frac{P_i^{(u)}(t-1)}{p_i^{(u)}(t-1) P_H}$ if $i = k^{(u)}(t-1)$, and $0$ otherwise
7:     $w_i^{(u)}(t) = w_i^{(u)}(t-1) \exp\left(\gamma \frac{\hat{P}_i^{(u)}(t-1)}{K}\right)$
8: choose DRER k according to the following distribution:

$$p_k^{(u)}(t) = (1 - \gamma) \frac{w_k^{(u)}(t)}{\sum_{i=1}^{K} w_i^{(u)}(t)} + \frac{\gamma}{K}$$

The following theorem characterizes the properties of this algorithm, and the proof is given in the Appendix.

Theorem 3.3: For any sequence of profits $P_k^{(u)}(t)$, for any $K > 0$, and for any $T > 0$, the following properties hold for any user u:
1) $E(R_{PSSA}^{(u)}(T)) \le 2.62 P_H \sqrt{T K \ln K}$;
2) $\lim_{T \to \infty} E\left(\frac{R_{PSSA}^{(u)}(T)}{T}\right) \le 0$. □
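The two selection rules can be sketched compactly for a single user. The code below is an illustration under stated assumptions, not the authors' implementation: the function names, the 0-indexed DRERs, the T-by-K profit matrix, and the seeded generator are all illustrative. The max-subtraction in `fssa_probs` and the weight rescaling in `run_pssa` are numerical safeguards that are not part of Algorithms 1 and 2; neither changes the resulting distribution.

```python
import math
import random
from typing import List, Sequence


def sample(probs: Sequence[float], rng: random.Random) -> int:
    """Draw one DRER index according to the given distribution."""
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


def fssa_probs(cum_profit: Sequence[float], horizon: int, p_high: float) -> List[float]:
    """Algorithm 1, line 5: probabilities exponential in cumulative profit."""
    k = len(cum_profit)
    eta = math.log(1.0 + math.sqrt(2.0 * math.log(k) / horizon)) / p_high
    top = max(cum_profit)  # safeguard: shifting by the max leaves the ratios unchanged
    scores = [math.exp(eta * (w - top)) for w in cum_profit]
    total = sum(scores)
    return [s / total for s in scores]


def run_fssa(profits: Sequence[Sequence[float]], p_high: float, rng: random.Random) -> float:
    """Full information: every DRER's profit is revealed after each slot."""
    horizon, k = len(profits), len(profits[0])
    w = [0.0] * k
    gained = 0.0
    for t in range(horizon):
        choice = sample(fssa_probs(w, horizon, p_high), rng)
        gained += profits[t][choice]
        for i in range(k):
            w[i] += profits[t][i]
    return gained


def pssa_probs(weights: Sequence[float], gamma: float) -> List[float]:
    """Algorithm 2, line 8: exponential weights mixed with the uniform distribution."""
    k, total = len(weights), sum(weights)
    return [(1.0 - gamma) * w / total + gamma / k for w in weights]


def run_pssa(profits: Sequence[Sequence[float]], p_high: float, rng: random.Random) -> float:
    """Partial information: only the chosen DRER's profit is revealed."""
    horizon, k = len(profits), len(profits[0])
    gamma = min(1.0, math.sqrt(k * math.log(k) / ((math.e - 1.0) * horizon)))
    w = [1.0] * k
    gained = 0.0
    for t in range(horizon):
        p = pssa_probs(w, gamma)
        choice = sample(p, rng)
        gained += profits[t][choice]
        estimate = profits[t][choice] / (p[choice] * p_high)  # zero for all other DRERs
        w[choice] *= math.exp(gamma * estimate / k)
        top = max(w)  # safeguard against overflow on long horizons
        w = [x / top for x in w]
    return gained
```

As a usage example, with a toy profit matrix in which one DRER always pays the full profit and the other pays nothing, calling `run_fssa` or `run_pssa` with `random.Random(0)` should collect nearly the full profit over a horizon of a few thousand slots, with FSSA typically losing less than PSSA.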

(a) The number of time intervals, plotted against the number of users (N) and the number of DRERs (K). (b) The normalized regret (FSSA) and (c) the normalized regret (PSSA), each plotted against the time horizon (T) and the number of DRERs (K). (d) The running time (sec) of FSSA and PSSA, plotted against the number of DRERs (K).

Fig. 3. Simulation results

Remark: Note that compared with FSSA, the upper bound shown in the first property is bigger. This is expected since in the partial information scenario we only know partial information (i.e., the profit from the strategy the user chooses) after each time slot, while in the full information scenario we know the full information (i.e., the profits from all the strategies) after each time slot.

IV. NUMERICAL RESULTS

In this section, we verify the performance of our approaches using extensive simulations. We simulated different scenarios by varying the values of N, K, and T. For a tuple $\langle N, K \rangle$, we first randomly generated a connected graph with N users and 4N edges as the topology of the information network underlying the microgrid, then randomly generated K DRERs, and finally randomly associated each DRER with one user. We simulated a group of DRERs with heterogeneous power generating qualities. The profit delivered from a DRER follows a normal distribution with the mean $\mu$ randomly selected in the range $[0.1, P_H]$ and the variance $\delta$ set to $\frac{\mu}{10}$, subject to the bound constraint $[0, P_H]$, where $P_H = 1$. The results were averaged over 10 test cases. All tests were performed on a 1.8 GHz Linux PC with 2 GB of memory.

Fig. 3(a) shows the average number of time intervals required by our power map construction procedure for all the users to discover all the available DRERs within the microgrid. Note that we do not evaluate the time delay and message exchange in each time interval since only local information exchange is required, and a large body of existing work addresses local information exchange. We observe that the average number of time intervals is no more than 9 even in a microgrid with 100 users. Thus the power map construction procedure is very efficient. In Fig. 3(b) and Fig. 3(c), we use normalized regret as our metric, which is computed as $\frac{R_A(T)}{T}$.
It represents, on average, the gap between the profit obtained at each time slot by using the global optimal strategy and that by using algorithm A. We observe that the average normalized regrets of both FSSA and PSSA decrease gradually as the time horizon increases. This is expected, because FSSA and PSSA gradually learn which DRERs deliver the most profit. Moreover, we observe that the normalized regret of FSSA decreases much faster than that of PSSA as the time horizon increases. This is because in the partial information scenario we only know the profit from the strategy the user chooses after each time slot, while in the full information scenario we know the profits from all the strategies after each time slot. In brief, the more information a user can get, the smarter the strategy decision is. Fig. 3(d) compares the running time of computing a strategy for one time slot for varying values of K. In all the cases

studied, the average running times are no more than 2 ms.

V. CONCLUSION

In this paper, we have studied the DRER access strategy problem for islanded microgrids. First, we have proposed a distributed DRER discovery approach to discover all the available DRERs within a microgrid. Furthermore, based on the online machine learning theory, we have proposed two distributed algorithms according to the information the user can obtain, in order to compute a good DRER access strategy, with no assumption on what distribution the power patterns of the DRERs follow. We have proved that when the time horizon is sufficiently large, on average the upper bound on the gap between the expected profit obtained at each time slot by using the global optimal strategy and that by using our algorithms is arbitrarily small. Since online machine learning provides a useful way to analyze the system evolution, we believe that it will play an important role in the analysis of renewable resource utilization.

REFERENCES

[1] Y. M. Atwa, E. F. El-Saadany, M. M. A. Salama, and R. Seethapathy, “Optimal renewable resources mix for distribution system energy loss minimization,” IEEE Trans. Power Syst., to appear.
[2] P. Auer, N. Cesa-Bianchi, Y. Freund, and R. E. Schapire, “Gambling in a rigged casino: The adversarial multi-armed bandit problem,” FOCS 1995, pp. 322-331.
[3] P. Auer, N. Cesa-Bianchi, Y. Freund, and R. E. Schapire, “The nonstochastic multi-armed bandit problem,” SIAM Journal on Computing’03.
[4] European technology smart grid platform, “Smartgrids: Vision and strategy for European electricity networks of the future,” 2006. http://www.smartgrids.eu/documents/vision.pdf.
[5] X. Fang, S. Misra, G. Xue, and D. Yang, “Smart grid - the new and improved power grid: A survey,” IEEE Communications Surveys and Tutorials, to appear.
[6] H. Ferreira, L. Lampe, J. Newbury, and T. Swart, Eds., “Power line communications,” 1st ed. New York, NY: John Wiley and Sons.
[7] X. Guan, Z. Xu, and Q. Jia, “Energy-efficient buildings facilitated by microgrid,” IEEE Transactions on Smart Grid, vol. 1, no. 3, 2010, pp. 243-252.
[8] A. Ipakchi and F. Albuyeh, “Grid of the future,” IEEE Power and Energy Magazine, vol. 7, no. 2, 2009, pp. 52-62.
[9] S. M. Kaplan and F. Sissine, Smart grid: Modernizing electric power transmission and distribution; energy independence, storage and security; energy independence and security act and resiliency; integra (government series).
[10] R. H. Lasseter and P. Piagi, “Microgrid: A conceptual solution,” PESC’04.
[11] X. Liu, “Economic load dispatch constrained by wind power availability: A wait-and-see approach,” IEEE Transactions on Smart Grid, vol. 1, no. 3, 2010, pp. 347-355.
[12] J. Mitra, S. B. Patra, and S. J. Ranade, “Reliability stipulated microgrid architecture using particle swarm optimization,” 9th Int. Conf. Probabilistic Methods Applied to Power Syst., 2006.
[13] M. J. Neely, A. S. Tehrani, and A. G. Dimakis, “Efficient algorithms for renewable energy allocation to delay tolerant consumers,” IEEE SmartGridComm’10, pp. 549-554.
[14] National Institute of Standards and Technology, “NIST framework and roadmap for smart grid interoperability standards, release 1.0,” Jan. 2010.
[15] L. F. Ochoa and G. P. Harrison, “Minimizing energy losses: Optimal accommodation and smart operation of renewable distributed generation,” IEEE Transactions on Power Systems, vol. 26, no. 1, 2011, pp. 198-205.
[16] S. Rohjans, M. Uslar, R. Bleiker, J. González, M. Specht, T. Suding, and T. Weidelt, “Survey of smart grid standardization studies and recommendations,” IEEE SmartGridComm’10.
[17] T. Takuno, M. Koyama, and T. Hikihara, “In-home power distribution systems by circuit switching and power packet dispatching,” IEEE SmartGridComm’10.
[18] T. L. Vandoorn, B. Renders, L. Degroote, B. Meersman, and L. Vandevelde, “Active load control in islanded microgrids based on the grid voltage,” IEEE Transactions on Smart Grid, to appear.

APPENDIX A
PROOF OF THEOREM 3.1

Proof. We arbitrarily select one user u, and use D(h) to denote the set of users which are h hops away from u in the information network. Note that the hop count is computed along the shortest path from u to this user. After the first time interval, all the users in D(1) will know the DRERs associated with user u, since u sends this information to its immediate neighbors. After the second time interval, all the users in D(2) will know the DRERs associated with user u, since all the immediate neighbors of u send this information to their immediate neighbors. This procedure continues until all the users in this microgrid know this information. No more than N time intervals are required, since no user can be more than N hops away from u. Recall that u is arbitrarily selected. Thus the proof is complete.

APPENDIX B
PROOF OF THEOREM 3.2

Proof. In the following proof, we omit specifying user u in all the notations for simplicity. We use $\eta$ to denote $\frac{\ln\left(1 + \sqrt{2(\ln K)/T}\right)}{P_H}$ and define $\Phi_{P_H}(\eta) = \frac{\exp(P_H \eta) - 1 - P_H \eta}{P_H^2}$. Now we consider two cases.

The first case is that $T \le 2 \ln K$. The first property of Theorem 3.2 trivially holds, since $E(R_{FSSA}(T)) \le P_H T$ implies

$$E(R_{FSSA}(T)) \le P_H T \le P_H \sqrt{2T \ln K}. \tag{1}$$

The second case is that $T > 2 \ln K$. Note that, compared with the Hedge algorithm [2], FSSA uses a different input parameter to cope with the issue of different profit upper bounds. Our proof for this case is based on the proof of Theorem 3.1 for the Hedge algorithm in [2], which indicates that for all strategies $i = 1, \ldots, K$

$$\sum_{t=1}^{T} \sum_{k=1}^{K} p_k(t) P_k(t) \ge \sum_{t=1}^{T} P_i(t) - \frac{\ln K}{\eta} - \frac{\Phi_{P_H}(\eta)}{\eta} \sum_{t=1}^{T} \sum_{k=1}^{K} p_k(t) P_k(t)^2 \ge \sum_{t=1}^{T} P_i(t) - \frac{\ln K}{\eta} - P_H \frac{\Phi_{P_H}(\eta)}{\eta} \sum_{t=1}^{T} \sum_{k=1}^{K} p_k(t) P_k(t).$$

Transforming the inequality above, we have

$$\sum_{t=1}^{T} \sum_{k=1}^{K} p_k(t) P_k(t) \ge \frac{\max_{i \in [1,K]} \sum_{t=1}^{T} P_i(t) - \frac{\ln K}{\eta}}{1 + P_H \frac{\Phi_{P_H}(\eta)}{\eta}}.$$

Taking the expectation on both sides, we have

$$E\left(\sum_{t=1}^{T} \sum_{k=1}^{K} p_k(t) P_k(t)\right) \ge E\left(\left(\max_i \sum_{t=1}^{T} P_i(t) - \frac{\ln K}{\eta}\right) \Big/ \left(1 + P_H \frac{\Phi_{P_H}(\eta)}{\eta}\right)\right) = \frac{\ln\left(1 + \sqrt{2(\ln K)/T}\right) E\left(\max_i \sum_{t=1}^{T} P_i(t)\right)}{\sqrt{2 \ln K / T}} - \frac{P_H \ln K}{\sqrt{2 \ln K / T}}.$$

According to $\ln(1 + y) \ge y - \frac{y^2}{2}$, $\forall y \in (0, 1)$, and $T > 2 \ln K$, we have

$$E\left(\sum_{t=1}^{T} \sum_{k=1}^{K} p_k(t) P_k(t)\right) \ge \frac{\left(\sqrt{2(\ln K)/T} - (\ln K)/T\right) E\left(\max_i \sum_{t=1}^{T} P_i(t)\right) - P_H \ln K}{\sqrt{2 \ln K / T}}.$$

By the definition of $R_A(T)$, we hence have

$$E(R_{FSSA}(T)) \le \sqrt{\frac{1}{2} T \ln K}\, P_H + \frac{P_H \ln K}{\sqrt{2 \ln K / T}} = \sqrt{2T \ln K}\, P_H. \tag{2}$$

Combining (1) and (2), we hence prove Theorem 3.2.

APPENDIX C
PROOF OF THEOREM 3.3

Proof. In the following proof, we omit specifying user u in all the notations for simplicity. Now we consider two cases. The first case is that $\gamma = 1 \le \sqrt{\frac{K \ln K}{(e-1)T}}$. The first property in Theorem 3.3 holds trivially, since $E(R_{PSSA}(T)) \le P_H T$ implies

$$E(R_{PSSA}(T)) \le P_H T \le P_H \sqrt{\frac{T K \ln K}{e - 1}} \le 2.62 P_H \sqrt{T K \ln K}. \tag{1}$$

The second case is that $\gamma = \sqrt{\frac{K \ln K}{(e-1)T}} \le 1$. Note that, compared with the Exponential-weight algorithm [3], PSSA uses a different input parameter and a different method for computing estimated profits to cope with the issue of different profit upper bounds. Our proof for this case is based on the proof of Theorem 3.1 for the Exponential-weight algorithm in [3], which indicates that

$$E(R_{PSSA}(T)) \le (e - 1) \gamma G_{\max}(T) + \frac{P_H K \ln K}{\gamma}.$$

Note that we scale the second term on the right-hand side of that theorem by $P_H$, since the upper bound of profits in this paper is $P_H$ instead of the value 1 used in [3]. We thus have

$$E(R_{PSSA}(T)) \le (e - 1) \gamma P_H T + \frac{P_H K \ln K}{\gamma} = 2 P_H \sqrt{(e - 1) T K \ln K}. \tag{2}$$

Combining (1) and (2), we hence prove Theorem 3.3.