Addressing the stochastic nature of energy management in smart homes

Chanaka Keerthisinghe, Student Member, IEEE, Gregor Verbič, Senior Member, IEEE, and Archie C. Chapman, Member, IEEE
School of Electrical and Information Engineering, The University of Sydney, Sydney, New South Wales, Australia
Email: {chanaka.keerthisinghe, gregor.verbic, archie.chapman}@sydney.edu.au

Abstract—In the future, automated smart home energy management systems (SHEMSs) will assist residential energy users to schedule and coordinate their energy use. In order to undertake efficient and robust scheduling of distributed energy resources, a SHEMS needs to consider the stochastic nature of the household's energy use and the intermittent nature of its distributed generation. To date, stochastic mixed-integer linear programming (MILP), particle swarm optimization and dynamic programming approaches have been proposed for incorporating these stochastic variables. However, these approaches result in a SHEMS with either very costly computational requirements or lower-quality solutions. Given this context, this paper discusses the drawbacks of these existing methods by comparing a SHEMS using stochastic MILP with heuristic scenario reduction techniques to one using a dynamic programming approach. Then, drawing on the analysis of the two methods, the paper discusses ways of reducing the computational burden of the stochastic optimization framework by using approximate dynamic programming to implement a SHEMS.

Keywords—future grid, demand response, smart home, stochastic mixed-integer linear programming, dynamic programming, scenario reduction techniques, approximate dynamic programming.
I. INTRODUCTION

FUTURE electrical power grids will be based on digitalized information and communication technologies that enable bi-directional flows of information and electrical power. Also referred to as the "smart grid", this future grid aims to provide a more efficient, reliable and green electrical power system. Its development is expected to enable utilities and customers to seize the full potential of demand side management (DSM) and demand response (DR) programs [1]. On the one hand, utilities are looking to DSM and DR programs to better manage their electricity networks; on the other hand, DR programs encourage customers to reduce their loads during periods of critical network congestion or high energy prices, in return reducing their electricity costs. Given this context, residential and commercial buildings will play a major role in any DR program, and this paper focuses on residential buildings.
In particular, a smart home is envisioned as an automated residential building that uses distributed energy resources (DERs) to manage energy consumption while providing suitable levels of comfort to the end-user. As depicted in Fig. 1, DERs in a smart home can consist of: distributed generation (DG) units, such as rooftop photovoltaic (PV) systems, fuel cells and micro-turbines; energy storage units, such as battery storage, electric vehicle (EV) battery storage and thermal energy storage (TES) units (e.g. a hot water tank); and end-user loads, such as cyclical loads (e.g. a heating, ventilation and air conditioning (HVAC) system, refrigerator and hot water heater) and shiftable loads (e.g. a washing machine, dryer and dishwasher). A smart home energy management system (SHEMS) can help a customer seize the full financial benefits of the available DERs. However, in order to undertake efficient and robust scheduling of DERs, a SHEMS needs to consider the stochastic nature of the household's energy use and the intermittent nature of its DG; the stochastic variables include electrical power generation from DG units, electrical power demand, thermal energy demand, availability of the EV as storage, and the required end-of-day battery state of charge (SOC). Incorporating these stochastic variables using appropriate probability distributions is an important aspect of an effective SHEMS.

Currently, proposed methods for implementing a SHEMS include: mixed-integer linear programming (MILP) [2]–[4], particle swarm optimization (PSO) [5]–[7] and dynamic programming [8]. As presented in [2]–[4], MILP is widely used in SHEMSs mainly because of the simplicity of off-the-shelf MILP solvers, such as MOSEK and CPLEX. In [5]–[7], PSO is used for robust scheduling of the DERs in a smart home. However, these MILP and PSO approaches do not consider the uncertainties in a household's energy use or the intermittent nature of its DG. A comparison of stochastic and deterministic optimization using MILP is presented in [9]. That paper concluded that stochastic optimization yields better quality solutions because it considers the uncertainties in the household; however, the energy management problem then becomes computationally challenging. Robust scheduling of DERs with stochastic programming using PSO is presented in [7], which is only preliminary work, and the accuracy of its results depends on the method used to reduce the number of scenarios. A dynamic programming approach for a SHEMS presented in [8] considers the stochastic variables of the household; however, its computational complexity increases exponentially with the dimensionality of the state, outcome and action spaces, making it impractical for large-scale or long-horizon problems.

Within this context, this paper discusses the drawbacks associated with these existing methods, namely the computational burden and the quality of the solutions. To do this, a SHEMS that uses stochastic MILP with a heuristic scenario reduction technique is compared to one that uses a dynamic programming approach.
Fig. 1. Illustration of electrical and thermal energy flows in a smart home. The depicted components are the electrical grid, PV system, battery storage, fuel cell, electric vehicle, thermal energy storage, controller, and the electrical and thermal demand in the smart home.
The paper is structured as follows: Section II states the general optimization problem and describes the stochastic variables of the household. This is followed by a description of the stochastic MILP and dynamic programming formulations in Sections III and IV, respectively. Section V describes the two implemented SHEMSs. Then, in Section VI, simulation results are used to illustrate the drawbacks associated with the existing methods. Drawing on the analysis of the simulation results, Section VII presents ways of reducing the computational burden of the stochastic optimization framework by using approximate dynamic programming methods to implement a SHEMS.

II. GENERAL FORMULATION OF THE ENERGY MANAGEMENT PROBLEM
The smart home considered in this paper is depicted in Fig. 1, and consists of a PV system, a micro-combined heat and power (MCHP) fuel cell system, battery storage, an EV and a TES unit. The decision horizon is a 24-hour period starting at 6:30 am, divided into 15-minute time-steps, giving a total of N = 96 time-steps, where k denotes a particular time-step.

A. Optimization problem

Depending on the customer's choice, the objective function of the optimization problem can consider different criteria, such as energy costs, user comfort preferences, energy consumption, CO2 emissions costs and peak load. The general formulation of the objective function used in this paper is expressed using control/decision and state variables. Let:
$$x^i = \left[ x_1^i \dots x_N^i \right], \quad \text{and} \quad s^i = \left[ s_1^i \dots s_N^i \right],$$

where $x^i$ and $s^i$ are the decision and state variables, respectively, of device $i \in I$ over the decision horizon, for all $k \in \{1 \dots N\}$. Furthermore, let $X$ and $S$ be matrices of decision and state variables, respectively, for all devices; that is:

$$X = \begin{bmatrix} x_1^1 & \cdots & x_1^{|I|} \\ \vdots & \ddots & \vdots \\ x_N^1 & \cdots & x_N^{|I|} \end{bmatrix}, \quad \text{and} \quad S = \begin{bmatrix} s_1^1 & \cdots & s_1^{|I|} \\ \vdots & \ddots & \vdots \\ s_N^1 & \cdots & s_N^{|I|} \end{bmatrix}.$$

Formally, the problem is given by:

$$\begin{aligned} \text{minimize} \quad & f_o(X, S), \\ \text{s.t.} \quad & x_k^i \text{ and } s_k^i \text{ satisfy device and user constraints,} \\ & \forall k \in \{1 \dots N\}, \; \forall i \in I. \end{aligned} \tag{1}$$

The state variable, $s_k^i \in S$, contains the minimum necessary and sufficient information used to compute the decision/control variable, $x_k^i \in X$, which is a control action for each device, $i \in I$, over the decision horizon. Device $i$'s constraints consist of power limits and mathematical models describing its operational characteristics. The user comfort criteria consider variables such as the internal temperature of the household, and whether the shiftable loads are completed on time.
B. Stochastic variables

Currently proposed SHEMSs require accurate weather forecasts and exact knowledge of the behavioral patterns of the inhabitants in order to generate efficient schedules for DERs. However, making such forecasts and predictions is difficult in the real world. Moreover, as explained in [7], inaccurate predictions may result in additional costs and may affect the comfort of the inhabitants. Therefore, incorporating these uncertainties using appropriate probabilistic models is an important aspect of an effective SHEMS.

1) PV output: PV output depends on weather patterns, and a forecast is assumed to be available before the horizon starts. Even though the forecast may be fairly accurate, uncertainty in weather transitions (e.g. turning cloudy or sunny relative to the given prediction) needs to be considered.

2) Electrical demand of the household: Electrical demand depends on the number of occupants and their behavioral patterns. However, modeling these behavioral patterns is difficult [10]. Therefore, it is often assumed that the electrical demand is directly related to the number of occupants in the household at a given time. As such, the probabilities for occupancy transitions (i.e. all home, some home and all away) should also be considered; a minimal sketch of such an occupancy chain is given after this list.

3) Thermal demand of the household: Thermal demand is typically considered to be primarily related to the external temperature. Therefore, transition probabilities for the ambient temperature should be considered.

4) Availability of the EV: It is difficult to mathematically model the tendency of the inhabitants to use their EVs. Typically, it is assumed that the occupants have fair knowledge of the transition probabilities associated with the availability of the EV as storage.

5) The required end-of-day battery SOC: As the PV output is one of the main sources of energy in a smart home, the required battery SOC at the end of the day should depend on the PV output and the electrical demand on future days. As such, controlling the battery SOC is a special focus of Section VII-B of this paper.
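To make the occupancy-transition idea concrete, the following is a minimal sketch (not from the paper) of a discrete-time Markov chain over the three occupancy states mentioned above; the transition matrix values are illustrative assumptions, and a real SHEMS would estimate them from household measurements.

```python
import numpy as np

# Occupancy states used in the text: all home, some home, all away.
STATES = ["all_home", "some_home", "all_away"]

# Hypothetical 15-minute transition probabilities (each row sums to 1).
P = np.array([
    [0.90, 0.08, 0.02],   # from all_home
    [0.15, 0.75, 0.10],   # from some_home
    [0.05, 0.15, 0.80],   # from all_away
])

def sample_occupancy(start_state: int, n_steps: int,
                     rng: np.random.Generator) -> list[int]:
    """Sample one occupancy scenario of n_steps transitions."""
    path = [start_state]
    for _ in range(n_steps):
        path.append(int(rng.choice(len(STATES), p=P[path[-1]])))
    return path

rng = np.random.default_rng(seed=0)
scenario = sample_occupancy(start_state=0, n_steps=96, rng=rng)  # one day
```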
III. STOCHASTIC MILP APPROACH

Deterministic versions of the energy management problem are often solved using a MILP approach, that is, the optimization of a linear objective function subject to linear constraints over continuous and integer variables [9]. However, this method can also be applied to the stochastic optimization problem formulated in Section II by using a scenario-based approach, referred to here as stochastic MILP. In order to use a stochastic MILP approach, each device $i$'s operational constraints are linearized and a large number of scenarios is generated by sampling from possible combinations of the stochastic variables mentioned in Section II-B. A larger number of scenarios should improve the solutions by better incorporating the stochastic variables, but it imposes a greater computational burden. Therefore, heuristic scenario reduction techniques are employed to obtain a scenario set of size $J$, for which the problem can be solved within a given time with reasonable accuracy. A scenario-based stochastic MILP formulation of the problem is described by:

$$\text{minimize} \sum_{j=1}^{J} P_j(s^i) f_o(X, S), \tag{2}$$

where $P_j(s^i)$ is the probability of a particular scenario $j$ corresponding to realizations of the stochastic variables $s^i$. The probabilities of all the scenarios in $J$ sum to 1:

$$\sum_{j=1}^{J} P_j(s^i) = 1. \tag{3}$$

IV. DYNAMIC PROGRAMMING

The stochastic energy management problem formulated in Section II can be solved using dynamic programming, since the problem is a discrete-time dynamic system (i.e. it can be modeled as a Markov decision process) with a cost function that is additive over time. In this section, Markov decision processes are described, value functions and optimality conditions are explained, and a method to obtain the optimal policy is presented.

A. Markov decision processes (MDPs)

An MDP is a discrete-time state-transition system described by: a state space, $s \in S$; a decision space, $x \in X$; transition functions, $P(s'|s_k, x_k)$, describing the probability of landing in state $s'$, given action $x_k$ from state $s_k$ after a realization of the random variables, $\omega_k$; and a contribution (i.e. reward or cost) function, $C_k(s_k, x_k, s')$, which is an instantaneous utility function describing how good or bad it is to take an action in a particular state. It is often better to collect rewards sooner rather than later, so in some problems a discount factor, $\gamma \in (0, 1]$, is used to account for this.

B. Value functions and optimality conditions

Dynamic programming is used to solve the optimization problem of the form:

$$\min_{\pi} \mathbb{E} \left[ \sum_{k=0}^{N-1} \gamma^k C_k^{\pi}(s_k, \pi(s_k), \omega_k) + \gamma^N C_N^{\pi}(s_N) \right], \tag{4}$$

where: $\pi$ is a policy, a choice of action for each state, $\pi : S \to X$; $C_N(s_N)$ is a terminal cost incurred at the end of the horizon; and $C_k(s_k, x_k, \omega_k)$ is the cost incurred at a given time-step $k$, which accumulates over time. Therefore, the cost function is additive, and the problem is formulated as an optimization of the expected cost because the cost is generally a random variable due to the effect of $\omega_k$ [11].

The expected future discounted cost of following a policy, $\pi$, starting in state $s_k$, is $V^{\pi}(s_k)$, which is given by the value function:

$$V^{\pi}(s_k) = \sum_{s' \in S} P(s_k, \pi(s_k), s') \left[ C(s_k, \pi(s_k), s') + \gamma^k V^{\pi}(s') \right]. \tag{5}$$

C. Optimal policy

An optimal policy, $\pi^*$, is one that minimizes (5). It can be found recursively by computing the optimal value function, $V_k^{\pi^*}(s_k)$, using Bellman's optimality condition:

$$V_k^{\pi^*}(s_k) = \min_{x_k \in X_k} \left( C_k(s_k, x_k(s_k)) + \gamma^k \mathbb{E} \left[ V_{k+1}^{\pi^*}(s') | s_k \right] \right). \tag{6}$$

The expression in (6) is easily computed using backward induction, a procedure called value iteration, after which the optimal policy can be extracted from the value function by selecting the minimum-value action in each state.
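As an illustration of backward induction, the following is a minimal finite-horizon value iteration sketch (not the authors' implementation); the transition probabilities and costs are small hypothetical placeholders standing in for the device models of Section V.

```python
import numpy as np

N = 96            # number of time-steps in the horizon
S, A = 11, 5      # sizes of a discretized state and action space
gamma = 1.0       # no discounting within the daily horizon

# Hypothetical inputs: P[k][s, a, s2] = transition probability,
# C[k][s, a] = expected stage cost, C_N[s] = terminal cost.
rng = np.random.default_rng(0)
P = rng.dirichlet(np.ones(S), size=(N, S, A))   # rows over s2 sum to 1
C = rng.uniform(0.0, 1.0, size=(N, S, A))
C_N = rng.uniform(0.0, 1.0, size=S)

V = np.zeros((N + 1, S))
policy = np.zeros((N, S), dtype=int)            # extracted optimal policy
V[N] = C_N                                      # terminal condition, cf. eq. (4)

for k in range(N - 1, -1, -1):                  # backward induction
    # Q[s, a] = C_k(s, a) + gamma * E[V_{k+1}(s') | s, a], cf. eq. (6)
    Q = C[k] + gamma * P[k] @ V[k + 1]
    policy[k] = np.argmin(Q, axis=1)            # minimum-cost action per state
    V[k] = np.min(Q, axis=1)
```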
V. IMPLEMENTED SMART HOME ENERGY MANAGEMENT SYSTEMS
In this section, the stochastic energy management problem described in Section II is solved using the two solution methods, stochastic MILP and dynamic programming, explained in Sections III and IV, respectively.

A. Formulation of the problem

The smart home consists of a PV system, a fuel cell MCHP system, battery storage, an EV and a TES unit. Specifically:

1) State variables represent the battery SOC, $s_k^B$, the TES unit state, $s_k^T$, the fuel cell output, $s_k^{FC}$, the availability of the EV, $s_k^{EV,a}$ (a binary variable), and the EV battery SOC, $s_k^{EV}$, for each time-step, $k$, in the decision horizon.
2) Control variables consist of the charge rate of the battery, $x_k^{B^+}$, the discharge rate of the battery, $x_k^{B^-}$, the electrical energy generated by the fuel cell, $x_k^{FC,e}$, the thermal energy generated by the fuel cell, $x_k^{FC,t}$, the charge rate of the EV battery, $x_k^{EV^+}$, the discharge rate of the EV battery, $x_k^{EV^-}$, the thermal energy generated by the supplementary water heater, $x_k^{WH}$, the distribution of DC energy between the battery and the inverter, $x_k^{d,DC}$, and the electrical grid power, $x_k^g$, at a given time-step $k$.

3) Transition functions govern how the state variables evolve over time. For example, the SOC of the battery, described by the state variable $s_k^B \in \left[ s^{B,\min}, s^{B,\max} \right]$, will progress by (a minimal sketch of this update is given after this list):

$$s_{k+1}^B = \left(1 - l^B(s_k^B)\right)\left(s_k^B - x_k^{B^-} + \mu_k^B(x_k^{B^+})\right), \tag{7}$$

where $\mu_k^B(x_k^{B^+})$ accounts for the efficiency of the charging process and $l^B(s_k^B)$ models the self-discharging process of the battery.
4) Random variables are the PV output, $\omega_k^{PV}$, the electrical demand, $\omega_k^{D,e}$, and the thermal demand, $\omega_k^{D,t}$.

5) All parameters used are given in Table I, and the transition probabilities, forecasts and characteristics of the devices are given in Fig. 2. The input characteristics of the devices indicate that dynamic programming can properly incorporate non-linear characteristics, while linear approximations have to be made for stochastic MILP.
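As a concrete illustration of transition function (7), here is a minimal sketch with assumed functional forms (not the paper's device models); the charging term $\mu_k^B(x_k^{B^+})$ is interpreted here as an efficiency multiplying the charge rate.

```python
S_B_MIN, S_B_MAX = 2.0, 10.0   # battery SOC limits from Table I (kWh)

def charge_efficiency(x_charge: float) -> float:
    """Hypothetical efficiency that degrades at high charge rates."""
    return 0.95 if x_charge <= 0.4 else 0.90

def self_discharge(soc: float) -> float:
    """Hypothetical per-step self-discharge fraction, growing with SOC."""
    return 0.001 * soc / S_B_MAX

def battery_transition(soc: float, x_charge: float, x_discharge: float) -> float:
    """One application of eq. (7), clipped to the SOC limits.
    mu_B(x) is taken as efficiency(x) * x (effective energy stored)."""
    soc_next = (1.0 - self_discharge(soc)) * (
        soc - x_discharge + charge_efficiency(x_charge) * x_charge
    )
    return min(max(soc_next, S_B_MIN), S_B_MAX)

print(battery_transition(soc=5.0, x_charge=0.3, x_discharge=0.0))
```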
TABLE I. SIMULATION PARAMETERS

Minimum & maximum battery SOC: 2 kWh, 10 kWh
Minimum & maximum fuel cell output (electrical): 0.2 kWh, 0.5 kWh
Minimum & maximum fuel cell output (thermal): 0.2 kWh, 0.4 kWh
Minimum & maximum TES unit state: 0 kWh, 12 kWh
Minimum & maximum EV battery SOC: 0 kWh, 16 kWh
Efficiency of the battery charging: 100%
Efficiency of the EV battery charging: 92%
Efficiency of the electric water heater: 0.8
c_a, c_b, c_p: $5, $1, 0.3
k̃, N: 94, 96
p^gas, p^B_N, p^EV_N, p^T_N: $0.064, $0.21, $0.16, $0.07
B. Objective function

The main objective of the implemented SHEMS is to minimize energy costs over the decision horizon, $\forall k \in \{1 \dots N\}$, which consist of the cost of electricity consumption, $p_k^{g,b}$, the revenue from selling electrical power back to the grid, $p_k^{g,s}$, and the cost of gas consumption, $p_k^{gas}$. The SHEMS is also capable of transferring excess power back to the electrical grid during peak periods, which can generate an income. This is given the least weight in the objective function because the price of electricity sold to the grid is much lower than the cost of electricity bought. In this paper, it is assumed that the electricity price, $p_k^g \in \{p_k^{g,b}, p_k^{g,s}\}$, and $p_k^{gas}$ are available before the start of the decision horizon.

1) Stochastic MILP: The formulation of the objective function mentioned in Sections II and III is given by:

$$\text{minimize} \sum_{k=1}^{N} \left( p_k^g x_k^g \frac{t_h}{N} + p_k^{gas} x_k^{FC,e} \frac{t_h}{N} \right), \tag{8}$$

where $t_h$ is the total number of hours in the decision horizon.

2) Dynamic programming: In order to make a fair comparison with the stochastic MILP, the cost function used in dynamic programming also consists of energy costs. The electrical energy stored in the batteries and the thermal energy stored in the TES unit at time-step $N$ result in the terminal cost:

$$C_N(s_N) = -p_N^B (s_N^B - s^{B,\min}) - p_N^T s_N^T - p_N^{EV} s_N^{EV}, \tag{9}$$

where $p_N^B$, $p_N^T$ and $p_N^{EV}$ are the financial costs of the battery storage, the TES unit and the EV at time-step $N$, respectively. The costs for all the other time-steps are given by:

$$C_k(s_k, x_k, \omega_k) = p_k^g x_k^g + p^{FC}(x_k^{FC,e}) + p^{WH}(x_k^{WH}, p_k^g) + c_a (s^{EV,\max} - s_k^{EV}) \delta(k, \tilde{k}) + c_b s_k^{EV,a} H(c_p s^{EV,\max} - s_k^{EV}), \tag{10}$$

where $p^{WH}$ and $p^{FC}$ are the financial costs caused by the supplementary water heater and the fuel cell, respectively, and $c_a$ and $c_b$ are positive constants used to penalize undesired states of the EV battery: the minimum required charge level of the EV battery, set by $c_p$, needs to be reached as quickly as possible after the EV is parked at time-step $\tilde{k}$. $H(n)$ is the unit step function and $\delta(m, n)$ is the Kronecker delta function [8].
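To make the dynamic programming cost structure concrete, here is a minimal sketch of the stage cost (10) and the terminal cost (9); the fuel cell and water heater cost terms are passed in as precomputed values, and the function names and signatures are assumptions for illustration, not the paper's code.

```python
C_A, C_B, C_P = 5.0, 1.0, 0.3   # penalty constants from Table I
S_EV_MAX = 16.0                  # maximum EV battery SOC (kWh, Table I)
K_TILDE = 94                     # time-step at which the EV is parked (Table I)

def stage_cost(p_g, x_g, fc_cost, wh_cost, s_ev, ev_available, k):
    """Eq. (10): grid cost + fuel cell and water heater costs + EV penalties."""
    cost = p_g * x_g + fc_cost + wh_cost
    if k == K_TILDE:                                  # Kronecker delta term
        cost += C_A * (S_EV_MAX - s_ev)
    if ev_available and (C_P * S_EV_MAX - s_ev) >= 0:  # unit step term
        cost += C_B
    return cost

def terminal_cost(p_b, p_t, p_ev, s_b, s_b_min, s_t, s_ev):
    """Eq. (9): credit for energy left in the storage devices at step N."""
    return -p_b * (s_b - s_b_min) - p_t * s_t - p_ev * s_ev
```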
In both of the implemented SHEMSs, the values of the control variables, $x_k = [x_k^{FC,e}, x_k^{EV^-}, x_k^{d,DC}, x_k^{B^+}]$, are to be determined depending on the state variables, $s_k = [s_k^{FC}, s_k^{EV,a}, s_k^T, s_k^B, s_k^{EV}]$, and the realizations of the random variables at each time-step.

As mentioned in Section III, in stochastic MILP, heuristic scenario reduction techniques are employed to obtain the desired scenario set, $J$. Specifically, the techniques used are backward and forward scenario reduction, which are already used to obtain optimal scenario trees in power systems; a detailed explanation of these can be found in [7].

VI. SIMULATION RESULTS
This section presents an analysis of the simulation results obtained for the two implemented SHEMSs. The SHEMSs implemented using the stochastic MILP and dynamic programming approaches operate according to the following constraints (a minimal rule-based dispatch sketch is given after this list):

• The fuel cell is only turned on at its highest possible electrical power output.

• The battery is charged at the highest possible charge rate if the combined power generated by the fuel cell and the PV system exceeds the household demand.

• Power is only transferred back to the electrical grid if the combined power generated by the household exceeds the household demand plus the highest possible charge rate of the batteries. This is because feed-in tariffs (FiTs) are usually lower than the retail tariffs paid by households; for example, in Australia these are ≈ 8 c/kWh versus ≈ 30 c/kWh [12].
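The following sketch (with hypothetical rate limits, not the authors' implementation) shows how these three operating rules can be encoded as a simple priority-based dispatch at one time-step.

```python
FC_MAX = 0.5        # highest fuel cell electrical output per step (kWh, Table I)
CHARGE_MAX = 0.4    # assumed highest combined battery charge rate per step (kWh)

def dispatch(demand: float, pv: float) -> dict:
    """Apply the three operating rules for a single time-step."""
    fc = FC_MAX                                  # rule 1: fuel cell at max output
    surplus = fc + pv - demand
    charge = min(max(surplus, 0.0), CHARGE_MAX)  # rule 2: charge the battery first
    export = max(surplus - CHARGE_MAX, 0.0)      # rule 3: export beyond max charge
    grid_import = max(-surplus, 0.0)             # buy any remaining shortfall
    return {"fuel_cell": fc, "charge": charge, "export": export,
            "grid_import": grid_import}

print(dispatch(demand=0.3, pv=0.6))  # surplus case: charges, then exports
```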
The total daily financial cost incurred by the SHEMS using dynamic programming is $2.48, compared with $2.72 for the SHEMS using stochastic MILP. In Fig. 3, a comparison is made of the key state variables. These indicate that the dynamic programming approach results in a better quality schedule (i.e. better resolution). This is because the nonlinear characteristics of the devices are incorporated in dynamic programming, while linear constraints have to be used with stochastic MILP. Moreover, the stochastic nature of the input variables is modeled using appropriate probability distributions. However, as mentioned before, the computational burden increases exponentially with the dimensionality of the state, outcome and action spaces.

C. Computational aspects
The stochastic MILP approach has better computational performance, as it uses a scenario reduction technique to reduce the size of the scenario set so that the problem can be solved within a desired time limit with reasonable accuracy.
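As an illustration of the backward scenario reduction heuristic referred to above, here is a minimal, simplified sketch (illustrative data, not the paper's implementation): it repeatedly deletes the scenario whose probability-weighted distance to its nearest neighbour is smallest, reassigning its probability to that neighbour.

```python
import numpy as np

def backward_reduction(scenarios: np.ndarray, probs: np.ndarray, keep: int):
    """Reduce a scenario set to `keep` scenarios (simplified backward reduction)."""
    idx = list(range(len(scenarios)))
    probs = probs.astype(float).copy()
    while len(idx) > keep:
        # Pairwise distances between the remaining scenarios.
        pts = scenarios[idx]
        d = np.linalg.norm(pts[:, None, :] - pts[None, :, :], axis=2)
        np.fill_diagonal(d, np.inf)
        nearest = d.argmin(axis=1)
        # Delete the scenario with the smallest probability-weighted distance.
        cost = probs[idx] * d[np.arange(len(idx)), nearest]
        j = int(cost.argmin())
        probs[idx[nearest[j]]] += probs[idx[j]]   # reassign its probability
        del idx[j]
    return scenarios[idx], probs[idx] / probs[idx].sum()

rng = np.random.default_rng(1)
scen = rng.normal(size=(200, 96))                 # 200 daily profiles, 96 steps
p = np.full(200, 1 / 200)
scen_red, p_red = backward_reduction(scen, p, keep=20)
```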
Fig. 2. Probabilities, characteristics of the devices, and random variables: efficiency vs. the input power of the inverter; electrical and thermal efficiency of the fuel cell; discharge efficiency of the battery; maximum charge rate vs. the state of charge of the batteries; transition probabilities for the availability of the electric vehicle; and the mean and standard deviation of the electrical power demand, thermal energy demand and PV output over time.
The financial costs obtained with the full and halved scenario sets are $2.72 and $2.91, respectively, validating the value of the scenario reduction technique. Furthermore, a comparison of the solutions for the full and halved scenario sets is presented in Fig. 3.
VII. APPROXIMATE DYNAMIC PROGRAMMING AND FUTURE RESEARCH DIRECTIONS

This section draws on the analysis presented above to identify directions for future work. This will involve using approximate dynamic programming (ADP) methods to solve the optimization problem formulated in Sections II and IV, with the aim of improving the computational performance over that of dynamic programming, but with a similar solution quality. The conjecture is that approximate dynamic programming can:

• scale to handle fine-grained temporal and spatial modeling, which avoids looping over all possible states, actions and outcomes;

• model the shiftable/controllable user loads in a smart home with less computational effort;

• use a two-stage optimization, as mentioned in Section VII-B, but with a higher resolution (i.e. 15-minute time-steps) that is computationally feasible; and

• easily incorporate the on-line learning methods mentioned in Section VII-C, which will enable seamless integration of DERs in a plug-and-play manner, increasing ancillary services to the electrical grid.
In the proposed approximate dynamic programming approach, the optimization problem in (4) is solved by computing the optimal value function (6) while stepping forward in time. Since the values of the future states are not yet computed, policies (i.e. sets of actions) are computed using value function approximations and/or policy function approximations, as explained in Sections VII-A and VII-C, respectively.
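To illustrate the post-decision state idea in this setting, the following is a minimal sketch (hypothetical state layout and absorption rule, not the authors' code) separating the deterministic effect of a decision from the later random realization, for the battery SOC component.

```python
def post_decision_soc(soc: float, x_charge: float, x_discharge: float,
                      efficiency: float = 0.95) -> float:
    """Post-decision state s_k^x: battery SOC after applying the decision,
    but before the random PV/demand realization is observed."""
    return soc - x_discharge + efficiency * x_charge

def next_soc(soc_post: float, omega_pv: float, omega_demand: float) -> float:
    """Pre-decision state s_{k+1}: here the random surplus (or shortfall)
    is simply absorbed by the battery (an illustrative rule)."""
    return soc_post + (omega_pv - omega_demand)
```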
Fig. 3. Simulation results for both SHEMSs over a 24-hour horizon starting at 6:30 am: electricity tariff, fuel cell output, EV battery SOC, TES unit state and battery SOC over time, for dynamic programming, stochastic MILP with the full scenario set, and stochastic MILP with the halved scenario set.

A. Policy based value function approximation
One approach in ADP is to approximate the value function and extract a policy from this approximation. Such an approximation can be computed iteratively, as mentioned in [13]. However, approximating the expectation within the max or min operator is difficult in practical applications. Here, the focus is on approximating the value function around a post-decision state vector, $s_k^x$: the state of the system at discrete time $k$ soon after making the decisions but before the realization of the random variables. Hence, the new form of the value function is written as:

$$V_k^{\pi}(s_k) = \min_{x_k} \left( C_k(s_k, \pi(s_k)) + \gamma^k V_k^{\pi,x}(s_k^x) \right), \tag{11}$$
where $V_k^{\pi,x}(s_k^x)$ is the value function approximation around the post-decision state $s_k^x$, given by:

$$V_k^{\pi,x}(s_k^x) = \mathbb{E} \left[ V_{k+1}^{\pi}(s_{k+1}) | s_k^x \right]. \tag{12}$$

This method is computationally feasible because $\mathbb{E}[V_{k+1}^{\pi}(s_{k+1})|s_k^x]$ is a function of the post-decision state $s_k^x$ (i.e. a deterministic function of $x_k$) rather than of $s_k$, as in the original value function in (6). In order to approximate the value function around a post-decision state vector: (i) start with initial value function approximations and step forward in time, solving the problem at each time-step using the value function approximation of the future state; and (ii) repeat this procedure over several iterations, updating the value function approximation after each iteration. Each iteration depends on a sample realization of the random information, a process referred to as Monte Carlo simulation. This iterative updating yields an approximation of the value function around the post-decision state [14]. The initial value function approximations for the post-decision states can be found by solving a simplified optimization problem using off-the-shelf solvers, as explained in Section VII-B, and/or by policy function approximations, as explained in Section VII-C.
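The iterative procedure (i)–(ii) can be sketched as follows: a minimal lookup-table version with hypothetical stage costs and transitions (a real implementation would use the device models and costs from Section V). Each forward pass samples the random information, steps through the horizon greedily against the current value function approximation, and smooths the observed costs into the approximation at the previous post-decision state.

```python
import numpy as np

N, S, A = 96, 21, 5          # horizon, discretized post-decision states, actions
ALPHA, GAMMA = 0.1, 1.0      # smoothing step-size and discount factor
rng = np.random.default_rng(2)

V = np.zeros((N, S))         # value function approximation around s_k^x

def stage_cost(s: int, a: int) -> float:
    """Hypothetical stage cost C_k(s_k, x_k)."""
    return 0.01 * a + 0.001 * abs(s - S // 2)

def post_state(s: int, a: int) -> int:
    """Deterministic effect of the decision (the post-decision state s_k^x)."""
    return min(max(s + a - A // 2, 0), S - 1)

def next_state(s_x: int, omega: int) -> int:
    """Random transition from the post-decision state."""
    return min(max(s_x + omega, 0), S - 1)

for iteration in range(500):          # (ii) repeat over Monte Carlo passes
    s, s_x_prev = S // 2, None
    for k in range(N):                # (i) step forward in time
        # Greedy decision against the current approximation, cf. eq. (11);
        # V[N-1] stays zero here, i.e. a zero terminal approximation.
        costs = [stage_cost(s, a) + GAMMA * V[k][post_state(s, a)]
                 for a in range(A)]
        a = int(np.argmin(costs))
        v_hat = costs[a]
        if s_x_prev is not None:      # smooth v_hat into the approximation
            V[k - 1][s_x_prev] = ((1 - ALPHA) * V[k - 1][s_x_prev]
                                  + ALPHA * v_hat)   # cf. eq. (12)
        s_x_prev = post_state(s, a)
        s = next_state(s_x_prev, int(rng.integers(-1, 2)))  # sample random info
```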
Fig. 4. First three days of the weekly optimization with 1-hour time-steps: (a) battery SOC and electricity cost; and (b) PV output and electrical demand.
B. Approximating the values of state variables

In finite-horizon problems, accurately estimating the value of the state variables in the final time-step can improve the solution quality. In this section, this concept is illustrated with a two-stage look-ahead optimization of the energy management problem. As explained in Section II, the required battery SOC at the end of the horizon, $s_N^B$, should depend on the PV output and the predicted demand on future days. Given this context, a simplified optimization over a longer horizon (i.e. weekly) is carried out, considering the weekly PV and demand forecasts, to approximate the value of $s_N^B$.

As shown in Fig. 4, the battery SOC is maximal when the demand is high, and the end-of-day battery SOC depends on the PV output and the electrical demand on future days (i.e. the end-of-day-one battery SOC is minimized, since day two and day three are sunny with low demand). The approximated value of the end-of-day battery SOC, $s_N^B$, can then be used in a more detailed stochastic daily optimization, which improves the quality of the solutions. For example, if the required end-of-day battery SOC is 5 kWh instead of 9.5 kWh, as depicted in Fig. 3, the financial costs are reduced by $0.28; this is possible if the following days are sunny with low demand. Finally, the operating schedules (i.e. sets of decisions) from a deterministic daily optimization can be used to approximate the initial value functions at the post-decision states (mentioned in Section VII-A).

C. Policy function approximations

For the sake of completeness, policy function approximations are also described; these are functions that return an action for a given state. However, designing specific functions that map a state to an action is a challenge, which can be overcome by three strategies: (i) a rule-based lookup table with an action for each state or a value for each state; (ii)
parametric models used to obtain the value of being in each state depending on different conditions, for example, charging the battery if the electricity price is below a certain value and discharging it if the price is above a certain value; and (iii) non-parametric statistical representations, such as kernel regression, support vector regression and neural networks (i.e. learning methods). The interested reader is referred to [14] for more detail. A minimal sketch of strategy (ii) is given below.
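Here is a minimal sketch of such a price-threshold policy; the thresholds and rates are illustrative assumptions, not values from the paper.

```python
BUY_BELOW, SELL_ABOVE = 0.10, 0.30   # hypothetical price thresholds ($/kWh)
RATE = 0.4                            # assumed charge/discharge rate per step (kWh)

def battery_policy(price: float, soc: float, soc_min: float = 2.0,
                   soc_max: float = 10.0) -> float:
    """Return the battery action (>0: charge, <0: discharge, 0: idle)."""
    if price < BUY_BELOW and soc < soc_max:
        return min(RATE, soc_max - soc)      # cheap electricity: charge
    if price > SELL_ABOVE and soc > soc_min:
        return -min(RATE, soc - soc_min)     # expensive electricity: discharge
    return 0.0

print(battery_policy(price=0.05, soc=5.0))   # charges at 0.4 kWh
```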
VIII. CONCLUSION

This paper presented a comparison of two existing solution methods, stochastic MILP and dynamic programming, used to schedule DERs in a smart home. The drawbacks associated with these existing SHEMSs were identified in order to propose future research using approximate dynamic programming methods, which are expected to improve on the computational performance of dynamic programming while retaining a similar solution quality. In particular, a two-stage look-ahead optimization of a smart home energy management problem was used as an example of how approximate dynamic programming methods can be applied, which will be considered in future research.

REFERENCES

[1] A. Ipakchi and F. Albuyeh, "Grid of the future," IEEE Power and Energy Magazine, vol. 7, no. 2, pp. 52-62, 2009.
[2] M. Bozchalui, S. Hashmi, H. Hassen, C. Canizares, and K. Bhattacharya, "Optimal operation of residential energy hubs in smart grids," IEEE Transactions on Smart Grid, vol. 3, no. 4, pp. 1755-1766, 2012.
[3] T. Hubert and S. Grijalva, "Realizing smart grid benefits requires energy optimization algorithms at residential level," in Innovative Smart Grid Technologies (ISGT), 2011 IEEE PES, 2011, pp. 1-8.
[4] K. C. Sou, J. Weimer, H. Sandberg, and K. Johansson, "Scheduling smart home appliances using mixed integer linear programming," in Decision and Control and European Control Conference (CDC-ECC), 2011 50th IEEE Conference on, 2011, pp. 5144-5149.
[5] M. Pedrasa, E. Spooner, and I. MacGill, "Improved energy services provision through the intelligent control of distributed energy resources," in PowerTech, 2009 IEEE Bucharest, 2009, pp. 1-8.
[6] M. Pedrasa, T. Spooner, and I. MacGill, "Coordinated scheduling of residential distributed energy resources to optimize smart home energy services," IEEE Transactions on Smart Grid, vol. 1, no. 2, pp. 134-143, 2010.
[7] M. Pedrasa, E. Spooner, and I. MacGill, "Robust scheduling of residential distributed energy resources using a novel energy service decision-support tool," in Innovative Smart Grid Technologies (ISGT), 2011 IEEE PES, 2011, pp. 1-8.
[8] H. Tischer and G. Verbic, "Towards a smart home energy management system - a dynamic programming approach," in Innovative Smart Grid Technologies Asia (ISGT), 2011 IEEE PES, 2011, pp. 1-7.
[9] Z. Chen, L. Wu, and Y. Fu, "Real-time price-based demand response management for residential appliances via stochastic optimization and robust optimization," IEEE Transactions on Smart Grid, vol. 3, no. 4, pp. 1822-1831, 2012.
[10] B. Li, S. Gangadhar, S. Cheng, and P. Verma, "Predicting user comfort level using machine learning for smart grid environments," in Innovative Smart Grid Technologies (ISGT), 2011 IEEE PES, 2011, pp. 1-6.
[11] D. P. Bertsekas, Dynamic Programming and Optimal Control, 3rd ed., vol. 1. Belmont, Massachusetts: Athena Scientific, 2005.
[12] J. C. P. J. Boxall and S. Draper, "Solar feed-in tariffs," IPART, Tech. Rep., July 2013.
[13] W. B. Powell, Approximate Dynamic Programming: Solving the Curses of Dimensionality. John Wiley & Sons, Inc., 2007.
[14] R. Anderson, A. Boulanger, W. Powell, and W. Scott, "Adaptive stochastic control for the smart grid," Proceedings of the IEEE, vol. 99, no. 6, pp. 1098-1115, 2011.