
2012 IEEE 26th International Parallel and Distributed Processing Symposium Workshops & PhD Forum

Distributing Power Grid State Estimation on HPC Clusters: A System Architecture Prototype

Yan Liu
Data Intensive Computing Group
Pacific Northwest National Laboratory
Richland WA, 99354
Email: [email protected]

Wei Jiang
Department of Computer Science and Engineering
The Ohio State University
Columbus OH, 43210
Email: [email protected]

Shuangshuang Jin, Mark Rice, Yousu Chen
Electrical Power Systems Energy
Pacific Northwest National Laboratory
Richland WA, 99354
Email: [email protected]

Abstract—The future power grid is expected to further expand with highly distributed energy sources and smart loads. The increased size and complexity lead to an increased burden on the existing computational resources in energy control centers. Thus the need to perform real-time assessment of such systems calls for efficient means to distribute centralized functions such as state estimation in the power system. In this paper, we present our experience of prototyping a system architecture that connects distributed state estimators, each running parallel programs to solve the non-linear estimation procedure. Through this experience, we highlight the need to integrate the distributed state estimation algorithm with efficient partitioning and data communication tools so that distributed state estimation has low overhead compared to the centralized solution. We build a test case based on the IEEE 118 bus system and partition the state estimation of the whole system model onto available HPC clusters. Measurements from the testbed demonstrate the low overhead of our solution.

I. INTRODUCTION

The ability to determine the state of the power system in real time is key to the controls and operations in the power grid. The state of the system is the voltage magnitude and angle of every bus in the power grid. State estimation is a central component in the power system. It collects system field measurements from the Supervisory Control And Data Acquisition (SCADA) system, typically every four seconds. State estimation code solves the non-linear estimation procedure based on the weighted least squares method. The results are estimated states such as voltage magnitude, power injections, and power flows. These are critical inputs for other power system operational tools, such as contingency analysis, optimal power flow, economic dispatch, and automatic generation control. Conventionally the state estimation is performed by the state estimator in the control centers of balancing authorities

and reliability coordinators, which have access to measurements and operations of power grid systems. A large power system consists of many balancing authorities and a few reliability coordinators, each having their own control center running local state estimation. In today's power systems, the data exchange structure at control centers follows hierarchical state estimation. In hierarchical state estimation, the entire regional power system is decomposed into non-overlapping subsystems at the balancing authority level. Each subsystem sends estimated states to a centralized coordinator, the state estimator at the reliability coordinator level that coordinates several balancing authorities [1]. After processing the detailed knowledge available at the balancing authority level, appropriate control decisions are made and the related information is sent to the corresponding subsystems. Currently, hierarchical state estimation is adopted in industry due to the simplicity of implementation.

However, there is growing evidence that power systems will further expand in size and in complexity. One aspect is the increasing deployment of phasor measurement units (PMUs) that produce real-time synchrophasor data capturing the dynamic characteristics of the power system, and hence facilitating time-critical control. PMUs typically generate measurements at 30 samples per second with precise time synchronization. As a result, the time to solution of state estimation needs to be radically reduced to the 10 milliseconds to 1 second range, compared to the current delay of 2-4 minutes to obtain results. Consequently, High Performance Computing (HPC) software has been developed in our laboratory to solve the state estimation equations with a significant improvement in computation time [2]. In addition to the need to process the measurement data in real time, the synchrophasor data impose constraints on the state estimation problem conducted on the entire

system. In the Western Interconnect, 137 PMUs are currently installed, and the number is planned to increase to over 300 by 2012 [3]. Given these projections, the amount of data storage would eventually approach approximately 1.12 TB for a 30-day period [4]. A single centralized coordinator can no longer collect all the data available from the corresponding balancing areas.

One approach to alleviating the burden on computational resources is to distribute the computation across the system rather than to centralize it at one control center. Distributed State Estimation [5] provides an opportunity to scale as the power systems evolve in size and in complexity. In distributed state estimation, each subsystem can run its local state estimation algorithm and also exchange data with neighboring subsystems to re-evaluate its local state estimation solution. Thus peer-to-peer communication between subsystems is allowed without the need for a coordinator.

Distributed state estimation algorithms have been actively investigated by power engineering researchers to improve the accuracy and robustness of the state estimation [5], [6], [7], [8], [9]. However, the algorithms are mostly exercised on local desktop computers to validate the solution. There is a lack of a system architecture on which a distributed state estimation algorithm can be deployed and evaluated so that the characteristics of its runtime behavior can be understood. For example, the dynamics of the power systems may result in a varying number of data exchange sessions between state estimators; understanding the runtime communication delays can help design the distributed state estimation algorithm to scale with the size of the system and the quantity of data. The data exchange structure imposes requirements on the system architecture design as follows:

• Accommodate state estimation models. The state estimation can follow either hierarchical state estimation or distributed state estimation, which lead to different structures for data exchange;
• Connect HPC-based state estimation code to the data communication infrastructure;
• Optimize the allocation of HPC resources to compute the state estimation procedure.

Under the umbrella of PNNL's Future Power Grid Initiative (http://gridoptics.pnnl.gov/index.html), our research focuses on the design of a distributed systems architecture that supports distributed power applications running on HPC clusters. State estimation is a key application for this architecture to support, on which a diversity of applications rely for obtaining accurate system snapshots. In this paper, we present a prototype of the system architecture that combines distributed state estimation and HPC. In this architecture, the entire power system model is partitioned into subsystems based on the sensitivity analysis of bus lines. The computation of the subsystems is further

partitioned to available HPC clusters in a way that balances the computation and communication costs. Hence, individual state estimators run parallel programs to solve the non-linear estimation procedure and communicate the intermediate results to their peers. The data exchange is built on top of MeDICi, a service oriented middleware developed by PNNL. Given this middleware solution, each individual state estimator is encapsulated by interfaces for data communication and uniquely identified by endpoints. This leads to an extensible design, since variations of the state estimation algorithm that result in different data exchange structures (hierarchical state estimation versus distributed state estimation) are both supported. To evaluate our solution, we build a test case based on the IEEE 118 bus system and partition the state estimation of the whole system model onto available HPC clusters. Measurements from the testbed demonstrate the low overhead of our solution.

II. INTRODUCTION OF DISTRIBUTED STATE ESTIMATION

State estimation has been extensively used to obtain the best estimate of the current state of a power system based on redundant noisy measurements. The traditional power system state estimation is based on the model:

z = h(x) + e

where z is the measurement vector, x is the state vector, h(.) is a nonlinear states-to-measurements function, and e is the measurement error vector with zero mean. Approximately, the state estimation problem can be solved by obtaining the solution to the following equation:

z = Hx + e

where H is the states-to-measurements matrix for the entire power system. The data resources include power flows, power injections, and voltage magnitudes. In cases in which state estimators utilize Phasor Measurement Units (PMUs), phasor data are also collected.

In the distributed version, the entire power system is decomposed into m non-overlapping subsystems, and each subsystem can run its local state estimation algorithm and also exchange data with neighboring subsystems to re-evaluate its local state estimation solution. Subsystems are connected via tie lines. Correspondingly, the matrix H and the vector z are also partitioned into m parts, each part being responsible for one subsystem of the entire power system:

H = [H1; H2; ...; Hm],  z = [z1; z2; ...; zm],  e = [e1; e2; ...; em]

where the semicolon denotes vertical (row-wise) stacking. A typical DSE algorithm could work as follows [5].

1) Preliminary Step: Given a power system decomposition, boundary buses and tie lines have been identified



for each subsystem. In this step, sensitivity analysis is usually performed to determine the sensitive internal buses. Note that this step is carried out off-line, once for a given graph topology.

2) Step 1: After the preliminary step, each subsystem runs its own state estimation with data collected from its own buses and PMUs. The state estimation in each subsystem in this step can be formulated as follows:

zi = hi(xi) + ei

where hi(xi) is the nonlinear states-to-measurements function for subsystem i, zi is the measurement vector for subsystem i, and ei is the error vector for subsystem i.

3) Step 2: In this step, the solutions of the boundary buses and sensitive internal buses from neighboring subsystems are considered as pseudo measurements. In each subsystem, together with the local measurements related to the boundary buses and sensitive internal buses, the pseudo measurements from neighboring subsystems are used to re-evaluate the state estimation solution of the boundary buses and sensitive internal buses from the previous step. Note that both Step 1 and Step 2 only need a finite number of iterations before converging, and the number of iterations can be upper-bounded by the diameter of the power system decomposition [10].

4) Final Step: Finally, the aggregated system-wide state estimation solution can be computed by combining the solutions of each subsystem in Step 1 and Step 2.

In the rest of this paper, DSE Step 1 and DSE Step 2 will be used to refer to the above two steps in the architecture prototype, respectively.

III. RELATED WORK

Research on decentralized power grid applications has been exploring solutions to simplify the state estimation process. For distributed state estimation, two different communication strategies can be applied in DSE algorithms, i.e., centralized communication and decentralized communication. Shahraeini et al. [11] compare the two communication infrastructures and conclude that, compared to the centralized strategy, decentralization improves the latency and reliability of data communication between state estimators. Abur et al. present a centralized DSE algorithm for mega grids in which coordination between state estimators is via a central coordinator [12]. Our work instead is based on a decentralized DSE algorithm proposed in [5], in which state estimators coordinate with each other without a centralized coordinator. We focus on deploying the decentralized DSE algorithm to HPC clusters and connecting them by a data communication middleware. We are not aiming to propose a new DSE algorithm, but targeting at partitioning power system decompositions

onto HPC clusters in the DSE algorithm. In this way, the DSE computations can be well distributed onto HPC clusters to achieve the best overall runtime efficiency.

SuperCalibrator implements DSE using substation data obtained from any devices connected to its neighborhood [13]. A desktop application has been developed based on the SuperCalibrator concept to observe status updates using relay data from devices. SuperCalibrator conducts three-phase state estimation with measurement data from distributed devices. The DSE we address in this paper is at the control centers above the substation level. Bose et al. evaluate the effects of the network topology on the accuracy of hierarchical state estimation [6]. A set of experiments is devised that covers different scenarios of the type of data communicated, failures of the network connection, and partitions of the network topology [6]. The hierarchical state estimation model is presented with sampling data preloaded to the model, which simplifies the issues arising from the distributed architecture. Our work further explores distributed architecture designs accommodating these testing scenarios. Much of the other literature focuses on the algorithms of state estimation rather than their implementation and deployment in a distributed architecture.

In the distributed computing area, GridStat is a middleware framework managing the quality of service of data dissemination in the power grid [14]. GridStat uses a publish/subscribe notification framework to collect and match QoS requirements such as latency and rate. For parallel power models to leverage GridStat, the interface discussed in section IV provides a solution to wrap the GridStat APIs. A detailed survey of the QoS requirements for data communication is presented in a GridStat technical report [14]. From our study of the literature, existing related work focuses on developing network infrastructure to enable power grid status dissemination through hierarchical management [15], [16]. A conceptual-level architecture provides a general guideline for deploying power grid entities [15].

IV. THE TECHNICAL APPROACH

Our technical approach is comprised of a distributed system architecture that guides the connection between distributed state estimators through MeDICi (http://medici.pnl.gov/), a middleware developed at PNNL for data intensive computing [17]. We develop a method that allows the state estimation decomposition to be partitioned onto available HPC clusters. The architecture is equipped with techniques to implement DSE that can make use of HPC clusters.

A. The System Architecture Overview

The goal of the system architecture is to connect distributed power applications, each running on HPC platforms, to efficiently exchange necessary raw data and intermediate


processed results. A reference architecture is depicted in Figure 1, which shows the elements integral to the architecture. In this architecture, it is assumed that each site corresponds to a balancing authority level control center that accommodates an HPC platform to process the state estimation. The parallel program of the state estimation follows the solution in [2], as introduced in the next section. However, the existing implementation focuses on running local state estimation only. For distributed state estimation, communication across the local MPI boundary is necessary. Therefore an interface layer is deployed on the master node of each HPC cluster to exchange data between estimators.

The communication between state estimators is bidirectional according to the DSE algorithm in section II. A state estimator sends to and receives from its neighboring estimators pseudo measurements on boundary buses and internal sensitive buses to re-evaluate its local state estimation. Thus, the interface layer is deployed on the master node of each site. It includes a middleware client that wraps the communication code for disseminating and retrieving data. To resolve the source and the destination, each state estimator or data source is uniquely identified by a URL. A middleware client sends the request for data to the destination URL. The middleware resolves the location by the URL, routes the requests, and fetches remote measurement data into a local data buffer, including bus voltage, phase angle, power flow, or power injections. A data processor acquires the data from the local data buffer, extracts the required fields (such as bus voltage and phase angle) and assembles them as inputs to the parallel power models. The data processor then dispatches the inputs to multiple worker processors on each site. The data processor and the middleware client form an interface layer between the parallel state estimators and the data communication infrastructure.

The same structure extends to hierarchical state estimation, as shown on the top layer of Figure 1 - the centralized coordinator can be assigned a unique URL to identify itself, and its master node can deploy the interface layer to communicate with its subsystems' state estimators.
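To make the role of the interface layer concrete, the sketch below shows, in Python, how a data processor on a site's master node might take records fetched by the middleware client from a local buffer, extract the fields the estimator needs, and dispatch them to worker processes. All names here (LOCAL_BUFFER, WORKER_QUEUES, the record fields) are illustrative assumptions rather than the actual prototype code, which wraps a MeDICi client and MPI-based workers.

    # Illustrative sketch of the interface layer on a site's master node.
    # Buffer, worker queues, and record fields are assumptions.
    from queue import Queue

    LOCAL_BUFFER = Queue()                        # filled by the middleware client
    WORKER_QUEUES = [Queue() for _ in range(4)]   # one queue per worker processor

    def middleware_client_fetch(record):
        """Stand-in for the MeDICi client: deposit a fetched record locally."""
        LOCAL_BUFFER.put(record)

    def data_processor_dispatch():
        """Extract the required fields and dispatch inputs to the workers."""
        i = 0
        while not LOCAL_BUFFER.empty():
            record = LOCAL_BUFFER.get()
            se_input = (record["bus_id"], record["voltage"], record["angle"])
            WORKER_QUEUES[i % len(WORKER_QUEUES)].put(se_input)  # round-robin
            i += 1

    # Example: one remote measurement arriving and being dispatched.
    middleware_client_fetch({"bus_id": 17, "voltage": 1.02, "angle": -0.31})
    data_processor_dispatch()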

B. Partitioning a Power System Decomposition

Given a large-scale power system decomposition that could contain hundreds of subsystems and a limited number of HPC clusters, we need to consider an appropriate partitioning of the power system decomposition so that a group of subsystems is assigned to a specific HPC cluster as a partition of the decomposition graph. In our work, a mapping method is developed to balance the computation and communication costs amongst the state estimators at the subsystem level.

1) Formulating a Power System Decomposition as a Graph: We formulate the power system decomposition into an undirected graph where each vertex corresponds to a subsystem and each edge corresponds to the tie lines between subsystems. In this way, the computations in each subsystem represent the vertex weights and the communications over the tie lines denote the edge weights. Formally, let G = (V, E) represent the decomposition graph that consists of a set of subsystems V and a set of tie lines E. The problem then reduces to a static graph partitioning problem in which the |V| subsystems in the power system decomposition graph are partitioned into p disjoint partitions, where p is the number of HPC clusters. Subsystems in the same partition are mapped onto the same HPC cluster to run DSE Step 1 (please refer to section II for details) - each subsystem runs its own state estimation with data collected from its own buses and PMUs. In DSE Step 2, each subsystem re-evaluates the state estimation solution of the boundary buses and sensitive internal buses from the previous step, using its local measurements and the pseudo measurements from neighboring subsystems. Consequently, neighboring state estimators need to exchange the pseudo measurements, which incurs communication overhead in addition to the computational costs of re-evaluation.

2) Estimating Graph Weights: Generally, vertex weights refer to the computational costs while edge weights are communication costs. To associate vertex weights with a subsystem, we observe that the computational costs in DSE Step 1 and 2 depend on the number of iterations to run the state estimation, and the time for each iteration is decided by the size of measurements collected for the subsystem. To associate edge weights with a subsystem, we observe that no communication is needed in DSE Step 1 and the communication costs in DSE Step 2 depend on the size of measurements exchanged between two neighboring subsystems and the network bandwidth. Therefore, we formulate our observations to estimate the graph weights using empirical results. Formally, let Ni represent the number of iterations and x represent the noise level, which normally follows a Gaussian distribution. Suppose x can be derived from a time frame δt using the following function:

x = f(δt)    (1)

We assume the following expression for the relationship between Ni and x:

Ni = g1 × x + g2    (2)

where g1 and g2 represent properties of a subsystem. For example, for a 14-bus subsystem, empirical studies show that g1 = 3.7579 and g2 = 5.2464. So, given a time frame δt, we can first estimate the noise level of the measurements collected during that period of time based on a Gaussian distribution and then use Expression (2) to estimate the number of iterations.
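For illustration, the following short sketch applies Expression (2) with the empirical 14-bus constants above. The noise-level function f in Expression (1) is not specified further here, so the sample-standard-deviation placeholder below is purely our assumption.

    # Estimate the number of iterations Ni for a subsystem from the noise
    # level x of a time frame, using Expression (2). G1 and G2 are the
    # empirical values for a 14-bus subsystem; f_noise is only a placeholder
    # for the unspecified function x = f(delta_t).
    G1, G2 = 3.7579, 5.2464

    def f_noise(frame_measurements):
        """Placeholder for x = f(delta_t): sample standard deviation of the
        measurements collected in the time frame (an assumption)."""
        n = len(frame_measurements)
        mean = sum(frame_measurements) / n
        var = sum((m - mean) ** 2 for m in frame_measurements) / n
        return var ** 0.5

    def estimate_iterations(frame_measurements):
        x = f_noise(frame_measurements)
        return G1 * x + G2            # Expression (2): Ni = g1*x + g2

    print(estimate_iterations([0.01, -0.02, 0.015, -0.005]))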


FIGURE 1: The Architecture Overview of the Prototype System

Then we formulate the computation time for each iteration. Observe that the computation time per iteration depends on the size of the measurements used as input for state estimation. In DSE Step 1, this corresponds to the size of the raw measurements collected from all the buses in a subsystem; in DSE Step 2, it includes the raw measurements of the boundary and sensitive internal buses of a subsystem and also the pseudo measurements of the boundary and sensitive internal buses from its neighboring subsystems. Assuming that each bus produces a similar size of measurements and that the size of measurements in DSE Step 2 is similar to that in DSE Step 1, the size of measurements in a subsystem for both steps can be represented by the number of buses in the subsystem. Therefore, the vertex weight Wv for a specific subsystem can be roughly represented as:

Wv = Nb × Ni    (3)

where Nb represents the number of buses. Given Expressions (1), (2), and (3), we can estimate the vertex weight for a subsystem given the set of measurements collected in a time frame δt as follows:

Wv = Nb × (g1 × f(δt) + g2)    (4)

Since DSE Step 1 does not need any communication between subsystems, we only need to estimate the edge weights for DSE Step 2. Assuming the network bandwidth is the same everywhere, we can estimate the edge weights based on the total size of measurements exchanged between two neighboring subsystems. In DSE Step 2, this refers to the size of measurements collected from the boundary buses and internal sensitive buses of the two neighboring subsystems. Sensitivity analysis should be performed to identify the boundary buses and internal sensitive buses. Assuming that each bus has a similar size of measurements, the edge weight We between two neighboring subsystems s1 and s2 can be estimated as follows:

We = gs(s1) + gs(s2)    (5)

where gs is the number of boundary buses and internal sensitive buses of a given subsystem. It is easy to see that the upper bound of We can be represented simply by the sum of the number of buses in the two neighboring subsystems, based on the assumption that each bus produces a similar size of measurements.

3) Partitioning the Graph: The goal of partitioning the subsystems is to load balance the overall computations


FIGURE 3: Decomposition of the IEEE 118 Bus System

FIGURE 2: The IEEE 118 Bus System

assigned to HPC clusters in DSE Step 1 and DSE Step 2, and also to minimize the overall communications between neighboring state estimators in DSE Step 2. We build our partition solution using METIS [18] to achieve the above objectives. METIS is a set of serial programs for partitioning graphs. The inputs to METIS are the power system decomposition graph and the values assigned to the vertex and edge weights, which reflect computational and communication costs respectively. We use the IEEE 118 Bus Test Case (as shown in Figure 2) to illustrate how DSE and the partitioning are combined. The IEEE 118 Bus Test Case represents a portion of the American Electric Power System in the Midwestern US (http://www.ee.washington.edu/research/pstca/pf118/pg_tca118bus.htm). Assume the IEEE 118 bus system is decomposed into 9 subsystems (numbered from 1 to 9) following the preliminary step introduced in section II. The sensitivity analysis of this step identifies the boundary buses and internal sensitive buses for each subsystem. Hence, each subsystem has around 14 buses. To formulate the bus system decomposition into a graph, we treat each subsystem as a vertex, and an edge exists between two subsystems if there are tie lines between the buses of the two subsystems.
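To make the METIS input concrete, the sketch below writes the 9-subsystem decomposition with the initial weights of Table I in the METIS graph-file format (see the METIS manual [18]) and invokes the stand-alone gpmetis program for three parts. The file name and the use of the command-line partitioner rather than the library API are our assumptions; the prototype's actual integration may differ.

    # Write the decomposition graph with the initial weights of Table I in
    # METIS graph-file format and partition it into 3 parts (one per HPC
    # cluster). Header "9 12 011": 9 vertices, 12 edges, vertex and edge
    # weights present. File name and use of gpmetis are assumptions.
    import subprocess

    vertex_weights = [14, 13, 13, 13, 13, 12, 14, 13, 13]   # buses per subsystem
    edges = {(1, 2): 27, (1, 4): 27, (1, 5): 27, (2, 3): 26, (2, 6): 25,
             (3, 6): 25, (4, 5): 27, (4, 7): 27, (5, 6): 25, (5, 7): 27,
             (5, 8): 26, (7, 9): 27}

    with open("decomposition.graph", "w") as f:
        f.write(f"{len(vertex_weights)} {len(edges)} 011\n")
        for v in range(1, len(vertex_weights) + 1):
            # each vertex line: vertex weight, then (neighbor, edge weight) pairs
            adj = [(b if a == v else a, w) for (a, b), w in edges.items() if v in (a, b)]
            fields = [str(vertex_weights[v - 1])]
            for nbr, w in sorted(adj):
                fields += [str(nbr), str(w)]
            f.write(" ".join(fields) + "\n")

    # gpmetis writes the resulting partition vector to decomposition.graph.part.3
    subprocess.run(["gpmetis", "decomposition.graph", "3"], check=True)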

FIGURE 4: Partitioning the Graph onto 3 HPC Clusters Before Step 1

Initially, we associate the power system decomposition graph with the initial vertex and edge weights shown in Figure 3. The vertex weights are initialized to the number of buses in each subsystem, and the edge weights are represented by the sum of the number of buses in two neighboring subsystems, based on Expression (5). Then, METIS reads the input graph and computes an initial partitioning scheme based on the initial weights. This initial partitioning scheme is not used in our mapping method, but is required by METIS. After that, the system works as follows. Before DSE Step 1, given each new time frame, the mapping method estimates the noise level, updates the graph weights, and invokes the repartitioning routine provided by METIS to obtain the best-possible partitioning scheme. Now the vertex weights are updated based on the estimated noise level as in Expression (4). In this repartitioning, our focus is to load-balance the computational costs of each HPC cluster. Since no communication is needed in DSE Step 1, the method assigns the same value to all edge weights. Suppose we have three HPC clusters named Nwiceb, Catamount, and Chinook in our laboratory network. The METIS repartitioning routine maps subsystems 1, 4, and 8 onto Chinook, subsystems 2, 3, and 6 onto Nwiceb, and subsystems 5, 7, and 9 onto

TABLE I: The Initial Vertex and Edge Weights for the IEEE 118 Bus System Decomposition

Vertex Number   Vertex Weights        Edge Pair   Edge Weights
1               14                    (1, 2)      27
2               13                    (1, 4)      27
3               13                    (1, 5)      27
4               13                    (2, 3)      26
5               13                    (2, 6)      25
6               12                    (3, 6)      25
7               14                    (4, 5)      27
8               13                    (4, 7)      27
9               13                    (5, 6)      25
                                      (5, 7)      27
                                      (5, 8)      26
                                      (7, 9)      27


STATE-ESTIMATION(SubSystem, MeDICi)
  comment: SEs have been mapped onto HPCs for DSE Step 1
  input ← SubSystem.measurements
  step1_solution ← SE-RUN(input)
  comment: SEs have been re-mapped onto HPCs for DSE Step 2
  local ← SubSystem.measurements(boundary & sensitive)
  for each neighbor ∈ PowerGraph.neighbors do
    MW-CLIENT-SEND(MeDICi, neighbor, step1_solution)
    pseudo[neighbor] ← MW-CLIENT-RECV(MeDICi, neighbor)
  step2_solution ← SE-RERUN(local, pseudo)
  comment: Solutions from all SEs are combined in the final step

FIGURE 5: Partitioning the Graph onto 3 HPC Clusters Before Step 2

FIGURE 6: Overview of Running Each State Estimation in DSE

Catamount, as in Figure 4. This partitioning scheme yields a load-imbalancing ratio of 1.035, which indicates that the state estimation is almost equally distributed across the 3 HPC clusters. After DSE Step 1 is done, the pseudo measurements from the boundary buses and sensitive internal buses are produced by each state estimation. These data will be exchanged between the neighboring state estimators in DSE Step 2, which incurs the communication costs. Before DSE Step 2, the vertex and edge weights are updated accordingly to reflect the change of the computational and communication costs. Note that in this case study, we use the upper bound of the size of the pseudo measurements exchanged between two neighboring subsystems to represent the edge weights, i.e., the sum of the number of buses of the two neighboring subsystems. Based on the updated graph weights, METIS recommends a slightly different partitioning scheme, as in Figure 5, to minimize the communication costs while keeping the computational costs well balanced. In this example, the load-imbalancing ratio of the new partitioning scheme is 1.079, around the suggested threshold of 1.05 defined by METIS. Given this new partitioning scheme, subsystem 4 is re-mapped onto Catamount while subsystem 5 is re-mapped onto Chinook. Nwiceb still hosts the same set of subsystems. The raw measurements related to the boundary and sensitive internal buses of subsystem 4 and subsystem 5 need to be exchanged between Chinook and Catamount. After that, all the state estimators finish DSE Step 2, where pseudo measurements of neighboring subsystems are exchanged and the solutions from DSE Step 1 are re-evaluated. In practice, a real power grid system may exhibit more constraints than just the computational and communication costs, due to measurement data availability and privacy issues. For now, we assume the mapping method produces the best-case partitioning of the power system decomposition

such that each HPC cluster has almost the same aggregate computational load over the subsystems assigned to it and the communications between HPC clusters are minimized as well.

C. The HPC Implementation of DSE

The weighted least squares algorithm is the most widely used state estimation method; its details can be found in [19]. The main step in solving the weighted least squares state estimation is to solve a large and sparse system of linear equations in each cycle:

Ax = b

where the matrix A is the symmetric positive-definite (SPD) gain matrix. The HPC implementation of the state estimation procedure follows the solution by Chen et al. [2], which uses a parallel preconditioned conjugate gradient (PCG) solver for this linear system. Pre-multiplying both sides of the above equation by the inverse of a preconditioner matrix P,

P^-1 A x = P^-1 b

yields a new linear system of equations

Â x = b̂

with Â = P^-1 A and b̂ = P^-1 b. The condition number of matrix Â is significantly lower than that of matrix A, so the iteration converges faster.

Figure 6 shows the pseudo code for running each state estimation on HPC clusters. The state estimation is first mapped onto HPC clusters to run DSE Step 1. Based on the mapping, each HPC cluster needs to obtain the raw measurement data of the subsystems it hosts from the data sources. After DSE Step 1 is finished, each subsystem produces the solution that contains the pseudo measurements, i.e., the solutions for its boundary buses and sensitive internal buses.
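For illustration, a minimal serial sketch of one weighted-least-squares (Gauss-Newton) iteration solved with a Jacobi-preconditioned conjugate gradient is given below. The actual implementation in [2] is a parallel MPI-based PCG solver; the function names and the choice of a diagonal (Jacobi) preconditioner here are our own assumptions.

    # Minimal serial sketch of one WLS step solved with a Jacobi-preconditioned
    # conjugate gradient; the parallel MPI decomposition of [2] is omitted.
    import numpy as np

    def pcg(A, b, tol=1e-8, max_iter=1000):
        """Solve A x = b for SPD A, preconditioned by P = diag(A)."""
        x = np.zeros_like(b)
        r = b - A @ x
        p_inv = 1.0 / np.diag(A)         # inverse of the diagonal preconditioner
        z = p_inv * r
        p = z.copy()
        rz = r @ z
        for _ in range(max_iter):
            Ap = A @ p
            alpha = rz / (p @ Ap)
            x = x + alpha * p
            r = r - alpha * Ap
            if np.linalg.norm(r) <= tol * np.linalg.norm(b):
                break
            z = p_inv * r
            rz_new = r @ z
            p = z + (rz_new / rz) * p
            rz = rz_new
        return x

    def wls_step(H, W, z, h_of_x, x):
        """One WLS iteration: gain matrix A = H^T W H, rhs b = H^T W (z - h(x))."""
        A = H.T @ W @ H
        b = H.T @ W @ (z - h_of_x)
        return x + pcg(A, b)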


As shown in Figure 6, to communicate with each other, each source state estimator only needs to specify the destination state estimator and then send data to the MeDICi pipelines. The MeDICi pipelines forward the data to the corresponding destination state estimator. Overall, MeDICi acts as a router to exchange data between the neighboring state estimators in DSE Step 2. Using MeDICi as the communication infrastructure simplifies the implementation and the design of the data communication between state estimators.

To enable the communication between state estimators via MeDICi, the MeDICi pipelines first need to be created as the communication channels. Each MeDICi pipeline is responsible for a one-way communication between two state estimators. As an example, Figure 7 shows the sample code for creating a MeDICi pipeline that sends data from a state estimator on the Nwiceb HPC cluster to another state estimator on the Chinook HPC cluster. After all the MeDICi pipelines and the proper inbound/outbound endpoints between neighboring state estimators are established, the neighboring state estimators can communicate with each other through MeDICi. As in Figure 6, when a state estimator needs to communicate with a neighbor, it invokes the MW_Client_Send function, which in turn invokes a C socket program that connects to the appropriate MeDICi inbound endpoint and sends data to it. Inside MeDICi, the data at the inbound endpoint is then forwarded to the appropriate MeDICi outbound endpoint based on the destination neighboring state estimator specified in the MW_Client_Send call. Finally, the MeDICi outbound endpoint connects to the neighboring state estimator and sends data to it. Thus, a one-way communication is completed through the MeDICi pipelines. Note that the state estimation code only needs to specify the destination neighboring state estimator and the data to send out; the low-level communication details are taken care of by the MeDICi middleware.

MeDICi(StateEstimation)
  comment: Create a MeDICi Pipeline
  MifPipeline pipeline = new MifPipeline();
  comment: Use the TCP Connector
  MifConnector conn = pipeline.addMifConnector(EndpointProtocol.TCP);
  conn.setProperty("tcpProtocol", new EOFProtocol());
  comment: Add the SE Component
  SESocket SE = new SESocket();
  pipeline.addMifComponent(SE);
  comment: Set inbound/outbound endpoints for the SE Component
  SE.setInNameEndp("tcp://nwiceb.pnl.gov:6789");
  SE.setOutHalEndp("tcp://chinook.emsl.pnl.gov:7890");
  comment: Start this MeDICi Pipeline
  pipeline.start();

FIGURE 7: A MeDICi Pipeline between Two State Estimators
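Conceptually, MW_Client_Send only has to open a connection to the MeDICi inbound endpoint registered for the destination estimator and stream the serialized pseudo measurements. The Python sketch below illustrates this idea only; the prototype itself uses a C socket program, and the endpoint table and serialization shown here are assumptions.

    # Conceptual sketch of MW_Client_Send: connect to the MeDICi inbound
    # endpoint for the destination estimator and stream the serialized data.
    # Endpoint table and pickle serialization are illustrative assumptions.
    import pickle
    import socket

    MEDICI_INBOUND = {                        # destination estimator -> inbound endpoint
        "chinook_se": ("nwiceb.pnl.gov", 6789),
    }

    def mw_client_send(destination, pseudo_measurements):
        host, port = MEDICI_INBOUND[destination]
        payload = pickle.dumps(pseudo_measurements)
        with socket.create_connection((host, port)) as sock:
            sock.sendall(payload)             # MeDICi forwards to the outbound endpoint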

Since DSE Step 1 and DSE Step 2 have different computational and communication costs, it is necessary to be able to repartition the decomposition graph to reflect the dynamic change of the computational and communication costs associated with the subsystems and tie lines at runtime. The subsystems then need to be re-mapped onto HPC clusters to run DSE Step 2. For DSE Step 2, each subsystem needs the raw measurement data related to its boundary and sensitive internal buses and also the pseudo measurements from its neighboring subsystems. Due to the re-mapping, some of the raw measurement data for a subsystem may need to be redistributed to another HPC cluster if the subsystem was residing on a different HPC cluster in DSE Step 1. After the data redistribution is done, DSE Step 2 runs based on the re-mapping: the pseudo measurements are exchanged between neighboring state estimators via MeDICi pipelines, and the state estimation solution is re-evaluated to generate the final result based on the pseudo measurements from neighboring subsystems and the raw measurements from its boundary and sensitive internal buses.
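The transition between the two steps can be summarized by the driver sketch below; update_weights, repartition, and transfer are placeholder callables we introduce for illustration, standing in for the weight-estimation rules of section IV-B, the METIS repartitioning routine, and a MeDICi data transfer, respectively.

    # Illustrative driver for the transition from DSE Step 1 to DSE Step 2.
    # update_weights, repartition, and transfer are injected placeholders.
    def remap_for_step2(graph, step1_mapping, nparts, update_weights, repartition, transfer):
        update_weights(graph, step="DSE2")           # reflect Step 2 costs
        step2_mapping = repartition(graph, nparts)   # e.g. METIS repartitioning
        for subsystem, new_cluster in step2_mapping.items():
            old_cluster = step1_mapping[subsystem]
            if new_cluster != old_cluster:
                # move the subsystem's raw boundary/sensitive-bus measurements
                transfer(subsystem, old_cluster, new_cluster)
        return step2_mapping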

D. Data Communication between State Estimators

The middleware provides a high-level interface for the state estimators to communicate with each other. This interface encapsulates the details of network connection and data transmission, which gives the state estimation a unified interface for data communication within the network infrastructure of the power system.

V. EXPERIMENTAL RESULTS

A. Effectiveness of the Mapping

Mapping the power system decomposition to HPC clusters is an add-on feature of the architecture. It helps to balance the computational and communication costs on a limited number of HPC clusters. We show in Table II two scenarios of the IEEE 118 bus system partition, with and without the mapping method. The best-possible partitioning scheme of mapping subsystems onto HPC clusters is produced by the mapping method. From Figure 4 and Figure 5, we can see that the graph weights are updated for DSE Step 1 and DSE Step 2 respectively to reflect the changes as the DSE proceeds. As in Figure 4, for DSE Step 1, since no communication is needed, the objective is to load balance the computational loads of the HPC clusters. As in Figure 5, for DSE Step 2, the vertex


TABLE II: Decomposition Comparison between W/O Mapping and W/ Mapping

Areas    w/o mapping (# of buses)   w/ mapping (# of buses)
Area 1   35                         40
Area 2   46                         40
Area 3   37                         38

TABLE III: Performance Comparison between W/O MeDICi and W/ MeDICi for Data Communication Within a Linux Workstation

Data Size   TCP Socket Connection (secs) T1   w/ MeDICi (secs) T2   Abs. Overhead 1 = T2 - T1
100MB       0.052123                          0.380771              0.328648
200MB       0.106736                          0.643337              0.536601
500MB       0.261842                          1.620076              1.358234
1GB         0.523994                          3.124528              2.600534
2GB         1.097956                          6.015401              4.917445

weights and edge weights are both updated, and the objective is to minimize the communication costs while keeping each HPC cluster load-balanced. Due to the efficiency of running METIS in parallel, partitioning is typically much faster than running the state estimation computations. Overall, we can see that the mapping method provides the option to adaptively choose the best-possible partitioning scheme based on the graph weights associated with the power grid measurements. The architecture can also be used to distribute state estimation by heuristic rules, such as designating several subsystems together to fulfill some business policy. The architecture allows both the hierarchical structure and the peer-to-peer structure of DSE, independent of the mapping scheme.

TABLE IV: Performance Comparison between W/O MeDICi and W/ MeDICi for Data Communication Between a Linux Workstation and a HPC Cluster

Data Size   TCP Socket Connection (secs) T3   w/ MeDICi (secs) T4   Abs. Overhead 2 = T4 - T3
100MB       0.872868                          1.255889              0.383021
200MB       1.743650                          2.430136              0.686486
500MB       4.399657                          6.133293              1.733636
1GB         8.825293                          11.816114             2.990821
2GB         17.754515                         24.058421             6.303906

B. Performance Overheads of MeDICi

We conduct a set of experiments to examine the overhead of the architecture, in particular of using the MeDICi middleware to communicate data between neighboring subsystems. We run the experiments in two modes: with MeDICi and without MeDICi. In the former mode, data sent from the source subsystem goes through MeDICi before being transferred to the destination subsystem. In the latter mode, the source subsystem sends data directly to the destination subsystem using TCP sockets. The time difference between these two scenarios is the absolute overhead incurred by MeDICi. Table III and Table IV show the communication times within a Linux workstation and across the network, respectively.

First, within a Linux workstation, the communication times using TCP sockets without MeDICi mainly consist of the start-up costs of the TCP sockets, since data does not go through the network. We also find that the data relaying rate through the middleware is around 0.4 GB/s (for example, the 2 GB case adds about 4.92 s of absolute overhead, i.e., 2 GB / 4.92 s ≈ 0.4 GB/s). Second, between a Linux workstation and an HPC cluster, the communication times increase since data needs to go through the network. The relative overhead of using MeDICi is comparable to the first scenario within a workstation; similarly, we can calculate that the data relaying rate through the middleware is also around 0.4 GB/s. As we vary the communication data size from 100 MB to 2 GB, the overhead grows linearly with the data size, as shown in Figure 8. Overall, we observe that the absolute overhead of using MeDICi depends on the data relaying rate through the middleware and on the data size. Since the distribution of state estimation helps to reduce the size of the data exchanged (only the pseudo measurements), we conclude that the simplified system implementation using MeDICi improves programming productivity at the cost of a modest middleware overhead.

FIGURE 8: Overheads of Data Communication through MeDICi

VI. CONCLUSION

Our research in this paper develops a distributed system architecture that connects future decentralized electric power applications to determine the state of the power systems in real time. Being able to integrate high performance computing platforms with decentralized power applications will significantly improve the computing capacity and efficiency of the control systems used in the electric power system. The early prototype of this architecture encompasses the high performance computing techniques. The solution runs individual state estimators in parallel and uses middleware to communicate the intermediate data in real time between distributed state estimators. Our ongoing work is developing a DSE test case on the WECC (Western Electricity Coordinating Council) power system. This system has 37


balancing authorities. State estimation needs to be run on each of these distributed sites in real time to support timely data updates for the reliability coordinators monitoring the interconnected power system. Our distributed system architecture is expected to support DSE in real time at the balancing authority level and to facilitate the data exchange between the hierarchical levels of balancing authorities and reliability coordinators.

Building the architecture itself leads to a testbed for evaluating distributed solutions for deploying power applications. In this paper, the architecture is demonstrated to decentralize the function of state estimation. Under PNNL's Future Power Grid Initiative, the development of new models is in progress to deal with distributed energy resources and smart loads. The simulation to aggregate distributed energy resources is computationally intensive and thus requires HPC support. The modeling results that predict the generated energy can be dispatched to various planning applications and wide area control applications. The architecture can be applied to integrate disparate power applications running remotely. The middleware and interface layer in the architecture wrap applications in any language to exchange data in a hierarchical or peer-to-peer structure. The architecture can invoke applications over any protocol that the middleware supports, to transfer data from devices (including phasor measurement units) at substations to the Energy Management System at control centers.

REFERENCES

[1] A. Gómez-Expósito, A. de la Villa Jaén, C. Gómez-Quiles, P. Rousseaux, and T. Van Cutsem, "A taxonomy of multiarea state estimation methods," Electric Power Systems Research, vol. 81, no. 4, pp. 1060-1069, 2011. [Online]. Available: http://www.sciencedirect.com/science/article/pii/S0378779610002841
[2] Y. Chen, Z. Huang, and D. Chavarria-Miranda, "Performance evaluation of counter-based dynamic load balancing schemes for massive contingency analysis with different computing environments," in Power and Energy Society General Meeting, 2010 IEEE, July 2010, pp. 1-6.
[3] F. B. Beidou, W. G. Morsi, C. P. Diduch, and L. Chang, "Smart grid: Challenges, research directions and possible solutions," in Power Electronics for Distributed Generation Systems (PEDG), 2010 2nd IEEE International Symposium on, June 2010, pp. 670-673.
[4] T. Gibson, A. Kulkarni, K. Kleese-van Dam, and T. Critchlow, "The feasibility of moving PMU data in the future power grid," in CIGRE Canada Conference on Power Systems: Promoting Better Interconnected Power Systems, Halifax, NS, Canada, September 6-8, 2011.
[5] W. Jiang, V. Vittal, and G. Heydt, "A distributed state estimator utilizing synchronized phasor measurements," Power Systems, IEEE Transactions on, vol. 22, no. 2, pp. 563-571, May 2007.

[6] A. Bose, K. Poon, and R. Emami, "Implementation Issues for Hierarchical State Estimators." [Online]. Available: http://www.pserc.wisc.edu/documents/publications/reports/2010 reports/Bose State Estimation S-33 Final Report 8-2010 ExecSum.pdf
[7] D. Bakken, A. Bose, C. Hauser, D. Whitehead, and G. Zweigle, "Smart generation and transmission with coherent, real-time data," Proceedings of the IEEE, vol. 99, no. 6, pp. 928-951, June 2011.
[8] G. Korres, "A distributed multiarea state estimation," Power Systems, IEEE Transactions on, vol. 26, no. 1, pp. 73-84, Feb. 2011.
[9] A. Gomez-Exposito, A. Abur, A. de la Villa Jaen, and C. Gomez-Quiles, "A multilevel state estimation paradigm for smart grids," Proceedings of the IEEE, vol. 99, no. 6, pp. 952-976, June 2011.
[10] F. Pasqualetti, R. Carli, and F. Bullo, "A distributed method for state estimation and false data detection in power networks," in Smart Grid Comm., Brussels, Belgium, Oct. 2011, to appear.
[11] M. Shahraeini, M. Javidi, and M. Ghazizadeh, "Comparison between communication infrastructures of centralized and decentralized wide area measurement systems," Smart Grid, IEEE Transactions on, vol. 2, no. 1, pp. 206-211, March 2011.
[12] A. Abur, "Distributed state estimation for mega grids," in Proc. 15th Power Systems Computation Conf., Aug. 2005, pp. 22-26.
[13] A. Meliopoulos, G. Cokkinides, F. Galvan, B. Fardanesh, and P. Myrda, "Advances in the SuperCalibrator concept - practical implementations," in System Sciences, 2007. HICSS 2007. 40th Annual Hawaii International Conference on, Jan. 2007, p. 118.
[14] D. Bakken, A. Bose, C. Hauser, D. Whitehead, and G. Zweigle, "Smart generation and transmission with coherent, real-time data," Proceedings of the IEEE, vol. 99, no. 6, pp. 928-951, June 2011.
[15] R. Bobba, E. Heine, H. Khurana, and T. Yardley, "Exploring a tiered architecture for NASPInet," in Innovative Smart Grid Technologies (ISGT), 2010, Jan. 2010, pp. 1-8.
[16] K. Tomsovic, D. Bakken, V. Venkatasubramanian, and A. Bose, "Designing the next generation of real-time control, communication, and computations for large power systems," Proceedings of the IEEE, vol. 93, no. 5, pp. 965-979, May 2005.
[17] I. Gorton, A. Wynne, Y. Liu, and J. Yin, "Components in the pipeline," Software, IEEE, vol. 28, no. 3, pp. 34-40, May-June 2011.
[18] "METIS Manual." [Online]. Available: http://glaros.dtc.umn.edu/gkhome/fetch/sw/metis/manual.pdf
[19] A. Abur and A. G. Expósito, Power System State Estimation: Theory and Implementation. CRC Press, 2004.

