This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication. Citation information: DOI 10.1109/ACCESS.2017.2728681, IEEE Access

Achieving Efficient Detection against False Data Injection Attacks in Smart Grid


Ruzhi Xu, Rui Wang, Zhitao Guan, Longfei Wu, Jun Wu, Xiaojiang Du

Abstract—Internet of Things (IoT) technologies have been broadly applied in smart grid for monitoring physical or environmental conditions. In particular, state estimation is an important IoT-based application in smart grid. It is used in system monitoring to obtain the best estimate of the power grid state through an analysis of the meter measurements and power system topologies. However, the false data injection attack (FDIA) is a severe threat to state estimation and is known to be difficult to detect. In this paper, we propose an efficient detection scheme against FDIA. Firstly, two parameters that reflect the physical properties of smart grid are investigated. One parameter is the Control Signal from the controller to the Static Var Compensator (CSSVC). A large CSSVC indicates intense voltage fluctuation. The other parameter is the quantitative node voltage stability index (NVSI). A larger NVSI indicates a higher vulnerability level. Secondly, according to the values of the CSSVC and NVSI, an optimized clustering algorithm is proposed to distribute the potentially vulnerable nodes into several classes. Finally, based on these classes, a detection method is proposed for the real-time detection of FDIA. The simulation results show that the proposed scheme can detect FDIA effectively.

Index Terms—Smart grid, state estimation, false data injection attack, control signal, node voltage stability index.

I. INTRODUCTION

As the next generation power supply system, the smart grid can achieve reliable and effective transmission of electricity from power generators to factories or household electric appliances with the support of recent communication and information technologies [1-3]. Taking advantage of Internet of Things (IoT) technology, different components in smart grid can collaborate and exchange information, so the control center can monitor the physical conditions of the power grid and make proper decisions. As shown in Fig. 1, the monitoring data in smart grid is collected by field devices such as sensors, meters and remote terminal units (RTUs). The electrical power grid is integrated with communication and information networks; besides power flow, information flow is also transmitted on most of the network links. However, applying information technology to smart grid has both pros and cons: cyber security threats also extend from information systems to smart grid [4-7]. For instance, the false data injection attack (FDIA) is a dangerous security threat in wireless sensor networks. In FDIA, the attackers inject malicious packets into the targeted network to disrupt network services, by either compromising the sensor nodes or hijacking the communication channel. Liu et al. first demonstrated that FDIA can also become a serious threat in smart grid; they presented two FDIA scenarios in which malicious data can be successfully injected into the state estimation in smart grid [8,9].

*Zhitao Guan is the corresponding author (e-mail: [email protected]). Ruzhi Xu, Rui Wang, and Zhitao Guan are with School of Control and Computer Engineering, North China Electric Power University, Beijing, 102206, China ([email protected], [email protected], [email protected]). Longfei Wu and Xiaojiang Du are with the Department of Computer and Information Sciences, Temple University, Philadelphia, USA (e-mail: [email protected], [email protected]). Jun Wu is with the College of Information Security Engineering, Shanghai Jiao Tong University, Shanghai, 200240, China ([email protected]).

Power system monitoring is crucial to guarantee the secure and steady operation of the power grid. As shown in Fig. 1, the monitoring data is first collected and transferred to the supervisory control and data acquisition (SCADA) system. It usually includes the active and reactive power flows of branches, the active and reactive power injections of buses, and the bus voltages in every subsystem of the smart grid. Second, the optimal power grid state estimation is performed by analyzing the monitoring data and the power system topologies. The outputs of state estimation are state variables that cannot be measured directly, such as the voltage amplitudes and voltage phase angles, which can be used as the input of many EMS (Energy Management System) applications, e.g., power dispatch, accident analysis, and power flow analysis. Therefore, state estimation is an important application that helps the smart grid operate safely and reliably.

Fig. 1 also illustrates FDIA in smart grid. The attackers may inject falsified monitoring data by three means: compromising the smart meters, sensors or RTUs; hijacking the communication between the sensor networks and the SCADA system; or intruding into the SCADA system. As a result, an incorrect estimate of the smart grid state will be derived from the false “measurements”.
This will further mislead the control center into making wrong decisions and operations, such as bad real-time electricity pricing and even large-area power failures [9,10]. To tackle such threats, great attention has been paid to the identification, filtering, and detection of FDIA [8-11]. However, existing FDIA detection methods seldom take the correlation between FDIA and the physical parameters of the power system into consideration. In fact, the values of physical parameters are closely related to the status of the power system. For instance, under a given initial operating condition, voltage stability reflects the ability of the power system to return to a stable state after suffering a physical disturbance [12]. In this paper, based on an analysis of the impact of FDIA on voltage stability, we present an efficient detection method against FDIA. The main contributions of our work can be summarized as follows:
• We describe the FDIA problem with a unified logic for

2169-3536 (c) 2017 IEEE. Translations and content mining are permitted for academic research only. Personal use is also permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.


state estimation, and give the formal model of FDIA, which can deal with the complex and highly dynamic environment of smart grid.
• We investigate the relationship between two physical parameters and FDIA. The first parameter is the Control Signal from the Controller to the Static Var Compensator (CSSVC), which can be used to analyze the node voltage fluctuation and thus the impact of FDIA on voltage deviation. The second parameter is the Node Voltage Stability Index (NVSI); we study the relationship between FDIA and the node voltage stability, which can be used to quantitatively depict the impact of FDIA on the system measurements.
• An optimized clustering algorithm is proposed, which takes the topology of the power system and the values of CSSVC and NVSI as the input, and distributes the potentially vulnerable nodes into different classes.
• Based on the above, we propose a state forecasting method to predict the state and detect FDIA. We perform simulations on the IEEE 39-bus and IEEE 118-bus systems to verify the efficiency and performance of the proposed scheme.

The rest of this paper is organized as follows. Section II introduces the related work. Section III gives the preliminaries. Section IV presents the node vulnerability level identification method. Section V presents the FDIA detection method, which builds on the results of Section IV. Section VI evaluates the performance of the proposed scheme. Section VII concludes the paper.

Fig. 1. False Data Injection Attack (FDIA) in smart grid. [Diagram labels: Power Grid Field Devices (Power Generation, Power Transmission, Power Distribution, Power Consumption); Measure Data Acquisition; SCADA; Control Center/EMS (State Estimation, Power Flow Analysis, Power Dispatch, Accident Analysis); Injection Attacks; False commands/operations]

II. RELATED WORK

FDIA, as a typical data integrity attack, is one of the most threatening cyber-attacks in smart grid. It was first presented in [8]. To mitigate this serious vulnerability, many algorithms have been proposed to detect FDIA [13], such as the geometrically designed residual filter and the generalized likelihood ratio test [5]. The cumulative sum (CUSUM) test-based detection mechanism introduced in [14,15] is also designed to counter FDIA. Other works [16,17] use machine learning methods to deal with stealthy false data. In [18], the relationship between the physical characteristics of the power system and FDIA is analyzed so that the vulnerable nodes can be identified. Moreover, how to deploy Phasor Measurement Units (PMUs) economically and effectively to assist the state estimator and improve FDIA detection has become a valuable question [19,20]. In [21], a PMU-based detection method is proposed; the authors assume that the measurements in part of the system are guaranteed to be reliable because the hardware is physically protected, so attackers cannot tamper with the protected meters; any such attempt would be detected as an attack, which restrains the attackers' behavior.

With the increasingly interconnected power systems in smart grid, distributed state estimation (DSE) has become an important alternative to the centralized and hierarchical solutions [22,23]. In [24], two new DSE methods are proposed: one uses the incremental mode of cooperation, and the other is based on a diffusive interaction pattern. The work in [25] applies DSE to fully distributed power systems for the detection of FDIA. In [26], Cramer et al. propose a bad data detection method based on an extended distributed state estimation (EDSE). In EDSE, each power system is divided into several subsystems by graph partition algorithms, and the buses in each subsystem are divided into three categories: internal buses, boundary buses and adjacent buses. Simulation results indicate that the EDSE-based method has higher overall detection accuracy than the traditional method. Moreover, the EDSE-based method has lower computational complexity.

Many feasible FDIA detection methods have been proposed in the existing literature. In this paper, we analyze and compare the advantages and disadvantages of these methods. The results are summarized in TABLE I.

TABLE I THE COMPARED RESULT OF EACH DETECTION METHOD
Detection method | Advantage | Disadvantage | Reference
The ∞-norm detection | Detects the false data under the non-ideal condition effectively | The threshold settings influence the detection accuracy | [27]
Error detection and residual detection | | Unable to detect the false data under the ideal condition |
Adaptive partition detection | Higher detection accuracy | Unable to detect the false data under the ideal condition |
Generalized Likelihood Ratio Test | Detects the false data under the ideal and non-ideal condition | High computational complexity; slow detection speed; does not apply to large systems | [28,25], [29]
Adaptive CUSUM detection | Detects the false data under the ideal and non-ideal condition with higher detection accuracy | Consumes too much time for large systems | [18]
Detector implementing the Euclidean distance metric | Detects the false data under the ideal and non-ideal condition | Abnormal fluctuations caused by normal data significantly affect the test results | [30]
Principal component analysis test | Higher detection accuracy | Communication data loss leads to errors in the detection results | [16]

Most of the existing detection methods have their respective advantages and disadvantages, and each is generally suited to one specific scenario. Besides, various methods have been proposed to address false data injection attacks in smart grid. Using the physical parameters of the power system to detect FDIA is one novel approach [18]. In this paper, inspired by [18], we also study the relationship between the physical parameters of the power system and FDIA. Only one physical parameter is taken into consideration in [18], while multiple physical parameters are analyzed together in our work. In [18], the vulnerability level is identified for all nodes, which incurs a high computational cost. In our work, only some key nodes, namely the reactive compensation leading nodes, are considered for vulnerability level identification, which reduces the computational cost significantly and is more suitable for large-scale power systems. In addition, we propose an efficient FDIA detection method based on the results of vulnerable node identification.

III. PRELIMINARIES

In this section, the important notations are listed and state estimation in power systems is introduced. Next, the control of voltage stability and the node voltage stability index are briefly discussed.

A. Notations

The important notations used in this paper are listed in TABLE II.

TABLE II DESCRIPTION FOR NOTATIONS
Notation | Description
$z$ | The $m \times 1$ vector of measurements
$v$ | The $n \times 1$ vector of state variables
$H$ | The $m \times n$ Jacobian matrix denoting the power system topology
$e$ | Random errors of measurements
$m$ | The number of measurements
$n$ | The number of state variables
$v'$ | The estimated value of state variables
$z_f$ | The $m \times 1$ measurements vector with false data
$b$ | The $m \times 1$ attack vector
$c$ | The $n \times 1$ vector of estimated errors
$v'_F$ | The estimated value of state variables with false data
$\tau$ | The threshold
$r$ | The measurement residuals
$CSSVC(N_i)$ | The control signal at node $i$
$NVSI(N_i)$ | The voltage stability index at node $i$
$U_j$ | The voltage magnitude of node $j$
$R$ | The resistance of branch
$X$ | The reactance of branch
$P_i$ | The real power of node $i$
$Q_i$ | The reactive power of node $i$
$K$ | The number of the centroids
$G_{t-1}$ | The state transition matrix at time sample $t-1$
$Q_{t-1}$ | The nonzero diagonal matrix at time sample $t-1$
$z_t$ | The forecasting measurements at sample $t$


B. Problem Formulation

1) State estimation

State estimation plays a vital role in connecting the measurements collected via the communication network with the control of physical operations in a smart grid. By exploiting the redundancy of the real-time measurement system to improve data accuracy, state estimation automatically eliminates the erroneous information caused by random interference and estimates or predicts the operating state of the system. The state estimator runs in the control center and has three main functions:
(1) Improving the accuracy of data measurements. Noise and disturbance in data acquisition and transmission may cause data errors. State estimation yields the best estimate of the system status.
(2) Detecting and identifying bad data, then deleting or correcting it to improve the reliability of the data measurements.
(3) Forecasting the future trend and possible states of the system. The forecast data enriches the contents of the database and provides the necessary conditions for safety analysis and operation planning.

The procedure of state estimation is shown in Fig. 2.

Fig. 2. The flowchart of state estimation. [Flowchart labels: Topology Information; IoT Front-ends Measurements; Estimation: compute state variables and residue errors; Traditional Detection: if there is topology error or bad data (Y/N); Identification: locate the bad data or error; End]

The states in smart grid usually include the bus voltage magnitudes and phase angles. In this paper, the state vector is denoted by $v$. If a system contains $n$ buses, the state vector is written as:
$$v = [v_1, v_2, \ldots, v_n]^T \quad (v_i \in \mathbb{R}) \tag{1}$$
where $v_i$ is the state variable at the $i$-th bus, usually the voltage angle or the voltage amplitude. For a system that contains $n$ buses, the measurement vector is denoted as $z \in \mathbb{R}^{m \times 1}$, with $m > 2n - 1$:
$$z = [z_1, z_2, \ldots, z_m]^T \quad (z_i \in \mathbb{R}) \tag{2}$$
For non-ideal sensors, there exist errors between the measurement function values and the actual measurement values. Taking measurement errors into account, state estimation in an actual electric power system can be described as:
$$z = \begin{bmatrix} z_1 \\ z_2 \\ \vdots \\ z_m \end{bmatrix} = \begin{bmatrix} h_1(v_1, v_2, \ldots, v_n) \\ h_2(v_1, v_2, \ldots, v_n) \\ \vdots \\ h_m(v_1, v_2, \ldots, v_n) \end{bmatrix} + \begin{bmatrix} e_1 \\ e_2 \\ \vdots \\ e_m \end{bmatrix} = h(v) + e \tag{3}$$
where
$$h(v) = \begin{bmatrix} h_1(v_1, v_2, \ldots, v_n) \\ h_2(v_1, v_2, \ldots, v_n) \\ \vdots \\ h_m(v_1, v_2, \ldots, v_n) \end{bmatrix}, \quad e = \begin{bmatrix} e_1 \\ e_2 \\ \vdots \\ e_m \end{bmatrix}, \quad v = \begin{bmatrix} v_1 \\ v_2 \\ \vdots \\ v_n \end{bmatrix}$$

Here, $v$ is the state vector, $e$ is the Gaussian measurement noise with zero mean and covariance matrix $\Sigma_e$, and $h(v)$ is the theoretical value of the measurement $z$. For an Alternating Current (AC) power system, $h(v)$ is a set of nonlinear functions of the state variables. However, if the load flow equations are subjected to the Direct Current (DC) approximation, $h(v)$ becomes a set of linear functions. Hence, the DC power model for the real power measurements can be written in compact vector-matrix form as:
$$z = Hv + e \tag{4}$$
where $H = \left. \frac{\partial h(v)}{\partial v} \right|_{v = v_0}$ is a constant Jacobian matrix.

State estimation aims to find an estimate of $v$, denoted by $v'$, that best fits the model $z = Hv + e$. The Weighted Least Squares (WLS) algorithm is often used to solve such problems. State estimation is then transformed into a quadratic optimization problem:
$$J(v) = (z - Hv)^T \Sigma_e^{-1} (z - Hv) \tag{5}$$
and the estimated linearized state vector $v'$ is given by
$$v' = (H^T \Sigma_e^{-1} H)^{-1} H^T \Sigma_e^{-1} z \tag{6}$$
Let $Q = H (H^T \Sigma_e^{-1} H)^{-1} H^T \Sigma_e^{-1}$, so that $Hv' = Qz$ and $QH = H$. The measurement residuals $r$ can then be expressed as:
$$r = z - Hv' = (I - Q)(Hv + e) = (I - Q)e \tag{7}$$

The WLS estimation method takes the minimum squared residual as the objective function $f$. Most of the existing detection methods are based on the Chi-square test or the residual test. Taking the Chi-square test as an example, its identification criterion is shown in Table III, where $v'$ is the vector of estimated state variables and $\tau$ is a threshold [8]. The choice of the constant $\tau$ is a key issue. Assuming that all the variables are mutually independent and that the measurement errors follow the normal distribution, $\lVert z - Hv' \rVert^2$ follows a Chi-squared distribution with $m - n$ degrees of freedom.

TABLE III CRITERION OF BAD DATA DETECTION

Detection | State
$f < \tau$ | Normal State
$f > \tau$ | Wrong State
$f \approx \tau$ | Marginal State
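As a minimal sketch of Eqs. (5)-(7) and of the Chi-square criterion, the following pure-Python example estimates the state of a toy DC model and flags a gross measurement error. The 4×2 matrix `H`, the unit variances, and the helper names (`solve`, `wls_estimate`, `weighted_residual`) are illustrative assumptions and are not taken from the paper; the threshold 5.991 is the 95% quantile of the Chi-squared distribution with m − n = 2 degrees of freedom.

```python
def solve(a, b):
    """Solve a x = b for a small square system by Gauss-Jordan elimination."""
    n = len(a)
    aug = [list(a[i]) + [b[i]] for i in range(n)]
    for col in range(n):
        # Partial pivoting for numerical stability.
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]


def wls_estimate(H, z, var):
    """Eq. (6): v' = (H^T S^-1 H)^-1 H^T S^-1 z, with S = diag(var)."""
    m, n = len(H), len(H[0])
    # Normal-equation matrix H^T S^-1 H and right-hand side H^T S^-1 z.
    A = [[sum(H[k][i] * H[k][j] / var[k] for k in range(m)) for j in range(n)]
         for i in range(n)]
    rhs = [sum(H[k][i] * z[k] / var[k] for k in range(m)) for i in range(n)]
    return solve(A, rhs)


def weighted_residual(H, z, v_est, var):
    """Eq. (5) evaluated at the estimate: sum_k (z_k - (H v')_k)^2 / var_k."""
    total = 0.0
    for k, row in enumerate(H):
        predicted = sum(h * v for h, v in zip(row, v_est))
        total += (z[k] - predicted) ** 2 / var[k]
    return total


# Hypothetical toy system: m = 4 measurements, n = 2 states.
H = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [1.0, -1.0]]
var = [1.0, 1.0, 1.0, 1.0]
TAU = 5.991  # 95% quantile of chi-square with m - n = 2 degrees of freedom

z_clean = [1.0, 2.0, 3.0, -1.0]  # exactly H applied to v = [1, 2], no noise
z_bad = [1.0, 2.0, 8.0, -1.0]    # gross error of +5 on the third measurement

v_clean = wls_estimate(H, z_clean, var)
v_bad = wls_estimate(H, z_bad, var)
j_clean = weighted_residual(H, z_clean, v_clean, var)
j_bad = weighted_residual(H, z_bad, v_bad, var)

print(v_clean, j_clean > TAU)  # residual below the threshold: normal state
print(v_bad, j_bad > TAU)      # residual above the threshold: bad data flagged
```

The sketch solves the normal equations directly, which is adequate for a toy system; a production estimator would more likely use a numerically robust factorization.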

2) FDIA

Bad data is generated by random measurement errors, while false data is deliberately constructed by malicious attackers. Traditional bad data detection in state estimation works efficiently for detecting bad data, but it is ineffective for detecting FDIA. Under FDIA, the falsified data $b$ is maliciously injected into the power flow measurement vector, so that the observed measurement vector becomes:
$$z_f = Hv + b + e \tag{8}$$
The observed measurement vector $z_f$ is represented as
$$z_f = [z_{f1}, z_{f2}, \ldots, z_{fm}]^T \tag{9}$$
and the injected false data vector is
$$b = [b_1, b_2, \ldots, b_m]^T \tag{10}$$
Then $z_f$ is equal to
$$z_f = z + b \tag{11}$$
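To illustrate why a structured injection is hard to detect, the following sketch uses a hypothetical single-state DC model (n = 1), for which the least-squares estimate has a closed form. It compares an injection of the form b = Hc, where c is the bias the attacker wants to add to the state, with an arbitrary injection on one meter. The vector `h`, the noise values, and the function names are illustrative assumptions, not data from the paper.

```python
import math


def estimate(h, z):
    """Least-squares estimate of a single state: v' = (h^T z) / (h^T h)."""
    return sum(hk * zk for hk, zk in zip(h, z)) / sum(hk * hk for hk in h)


def residual_sq(h, z):
    """Squared residual norm ||z - h v'||^2 used by the bad data detector."""
    v_est = estimate(h, z)
    return sum((zk - hk * v_est) ** 2 for hk, zk in zip(h, z))


# Hypothetical measurement model z = h * v + e with one state and four meters.
h = [1.0, 2.0, -1.0, 0.5]
v_true = 3.0
e = [0.01, -0.02, 0.015, 0.0]
z = [hk * v_true + ek for hk, ek in zip(h, e)]

c = 0.8                              # bias the attacker wants to add to the state
b_stealthy = [hk * c for hk in h]    # b = Hc lies in the column space of H
b_naive = [1.0, 0.0, 0.0, 0.0]       # tampering with a single meter

z_stealthy = [zk + bk for zk, bk in zip(z, b_stealthy)]
z_naive = [zk + bk for zk, bk in zip(z, b_naive)]

# The structured attack shifts the estimate by c ...
shift = estimate(h, z_stealthy) - estimate(h, z)
# ... while leaving the residual, and hence the detector input, unchanged.
same_residual = math.isclose(
    residual_sq(h, z), residual_sq(h, z_stealthy), abs_tol=1e-9
)
# The naive attack does change the residual.
naive_increase = residual_sq(h, z_naive) - residual_sq(h, z)

print(shift, same_residual, naive_increase > 0)
```

The single-state model is only a special case chosen to keep the algebra visible; the same cancellation holds for any H because Hc is removed by the residual projection.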

When false data is injected by attackers, $b$ is a nonzero vector. The estimated state vector $v'$ is changed into $v'_F$ by the injected false data, with $v'_F = v' + c$, where $c$ is an $n$-dimensional nonzero vector. If the injected data vector $b$ equals $Hc$, it will be ignored by the traditional detection method mentioned above, because
$$z_f - Hv'_F = z + b - H(v' + c) = z - Hv' \tag{12}$$
Thus FDIA is able to escape traditional detection.

C. CSSVC

CSSVC is the control signal from the controller to the Static Var Compensator (SVC), which supports reactive power compensation and voltage control at the corresponding nodes. The SVC has a complicated higher-order nonlinear model and is usually controlled by a Proportional-Integral-Derivative (PID) controller. CSSVC can therefore be calculated by the PID algorithm as follows:
$$CSSVC(N_i) = K_p \Delta U_i + K_i \int \Delta U_i \, dt + K_d \frac{d \Delta U_i}{dt} \tag{13}$$

When the power system works in normal status, the voltage of each node floats only within a small range (±5%) around the rated value. In (13), $\Delta U_i$ is the deviation between the voltage magnitude of node $i$ and its rated value. The proportional term contains the current status information of the voltage control process, the integral term contains the past status information, and the derivative term contains the future status information. In sum, CSSVC reflects the fluctuation of voltage in the power system. When the power grid is under FDIA, the voltage of the relevant nodes fluctuates for a period of time, so there is a deviation between the voltage magnitude of these nodes and their rated value. The voltage deviation triggers the controller to send an abnormally high CSSVC to the control center, and the system administrator can analyze the abnormal data to locate the fault point.

With the increasing complexity of higher voltage levels and network structures, voltage stability control is becoming more difficult. In this work, we adopt the method for partitioning the power system at the voltage stability critical point [31], which has proved effective for the voltage control problem of complex power grids. In each partition, the appropriate reactive compensation node is selected according to sensitivity analysis. SVCs are installed at these reactive power compensation leading nodes; more details are given in Section IV.C.

D. NVSI

Voltage stability is an important concern for power systems. NVSI is one of the key factors regarding voltage stability; it reveals the real-time status of voltage stability and helps to prevent voltage collapse accidents [32]. The main advantages of NVSI are its accuracy in modeling and calculation and its ease of measurement in real-time or on-line applications. NVSI values alert us to the existence of vulnerable nodes, which may cause system instability. When the system load parameter is close to the collapse critical value, NVSI values also take effect. Administrators can use NVSI values on-line to monitor the system stability and even make predictions. Based on NVSI values, administrators can take proper action to prevent voltage collapse. So, in this paper, we first identify the vulnerable nodes by means of CSSVC and NVSI, and then perform FDIA detection at each node according to its vulnerability level. In our scheme, we use the method presented in [32] to calculate the NVSI of each reactive power compensation leading node in the power system, as follows:
$$NVSI(N_i) = \frac{4}{U_j^4}(RQ_i - XP_i)^2 + \frac{4}{U_j^2}(XQ_i + RP_i) \tag{14}$$

The NVSI only needs measurement information about nodes, which can easily be obtained from synchronized PMUs or from the existing state estimator of the EMS at control centers. It can be calculated in a very short time, so it can be applied in real-time or on-line environments. In Eq. (14), $NVSI(N_i)$ is the voltage stability index at node $i$, $U_j$ is the voltage magnitude of node $j$, $P_i$ and $Q_i$ are the total real power and reactive power, $R$ is the resistance of the branch, and $X$ is the reactance of the branch. $R$ and $X$ can be obtained from the electric topology database of the power network. After a successful power flow solution, all parameters of Eq. (14) are available, and the NVSI of each reactive power compensation leading node can be calculated in short order.

IV. NODE VULNERABILITY LEVEL IDENTIFICATION

In this section, we use CSSVC and NVSI to analyze the impact of FDIA. We use an optimization algorithm to determine which nodes tend to be attacked, which helps the administrator to locate the injection points. The details are as follows.

A. The influence of FDIA on CSSVC

In a real-time control system, CSSVC can be transmitted to the control center through the communication network. However, if an attacker has launched FDIA, state variables such as the voltage of reactive compensation nodes may be changed. We analyze the voltage control system as a closed-loop system and take the tampered state variable as the feedback, as shown in Fig. 3.


Fig. 3. Structural block diagram of PID voltage regulator. [Block labels: U_rated, ΔU_i, PID, CSSVC, Clustering Algorithm, SVC, U_i, U_i', FDIA]

After an attack, the true value of the node voltage $U_i$ measured by the system is replaced by the false data $U_i'$, which leads to a deviation $\Delta U_i$ between the voltage magnitude of node $i$ and its rated value $U_{rated}$. The deviation $\Delta U_i$ triggers the PID controller, which produces the CSSVC. The SVC thus accepts a false command and changes the true node voltage.

We take the IEEE 39-bus system as an example for simulation. Due to space limitations, we display the voltage fluctuation of only one reactive power compensation node under FDIA. Fig. 4 shows that after FDIA, the true value of the node voltage varies with the false value. We compare the values of CSSVC before and after the attack in TABLE IV. In general, the more intense the voltage fluctuation, the larger the CSSVC; the weaker the voltage fluctuation, the smaller the CSSVC.

TABLE IV THE VALUES OF CSSVC BEFORE/AFTER ATTACK
SVC node | Before FDIA | After FDIA
12 | 0.1874 | 0.9756
20 | 0.0167 | 0.5539
23 | 0.0108 | 0.0176
25 | 0.2142 | 0.8867
29 | 0.0836 | 0.4769
39 | 0.1215 | 0.6535

Fig.4 Voltage fluctuation after attack
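The relation between voltage deviation and CSSVC described above can be sketched with a discrete-time approximation of Eq. (13). The gains, the sampling interval, and the deviation traces below are hypothetical values chosen for illustration; they are not the settings behind TABLE IV or Fig. 4, and the function name `cssvc_trace` is an assumption of this sketch.

```python
def cssvc_trace(deviations, kp, ki, kd, dt):
    """Discrete PID form of Eq. (13) applied to a sequence of voltage deviations."""
    integral = 0.0
    previous = deviations[0]  # assume no change before the first sample
    outputs = []
    for dev in deviations:
        integral += dev * dt                # rectangle-rule integral term
        derivative = (dev - previous) / dt  # backward-difference derivative term
        outputs.append(kp * dev + ki * integral + kd * derivative)
        previous = dev
    return outputs


# Hypothetical per-unit deviations between measured and rated voltage.
normal = [0.01, 0.01, 0.01]
attacked = [0.01, 0.08, 0.08]  # falsified measurement appears at the second sample

gains = dict(kp=1.0, ki=0.5, kd=0.1, dt=0.1)
cssvc_normal = cssvc_trace(normal, **gains)
cssvc_attacked = cssvc_trace(attacked, **gains)

print(max(cssvc_normal), max(cssvc_attacked))
```

In this toy trace the jump in the fed-back deviation raises all three PID terms at once, which is the qualitative effect the text attributes to a falsified voltage measurement.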

B. The influence of FDIA on NVSI

The system operation data can easily be obtained in real-time or extended real-time operation, for instance from synchronized PMUs or from the existing state estimator of the EMS at the control center. However, if some or all of the measurements are accessible to an attacker, they can be modified by injecting false data. Once the measurements are tampered with, the real and reactive power measurements $P_i$, $Q_i$ and the voltage $U_j$ change, and the value of NVSI changes correspondingly. In general, the smaller the NVSI, the better the node voltage stability; the greater the NVSI, the worse the node voltage stability. In the worst case, the node voltage collapses when it approaches the system voltage collapse critical point. So the system administrator should be vigilant about the vulnerable nodes and keep the system away from FDIA and from the verge of instability.

Fig.5 The value of NVSI under different attack intensity of IEEE 14-bus system


Fig.6 The value of NVSI under different attack intensity of IEEE 30-bus system

We imitate attackers who inject false data and tamper with the measurements. In the simulations, we assume that the state variables $P_i$, $Q_i$ and $U_j$ are gradually and proportionally increased by a constant factor § to stress the system; § can be seen as the attack intensity. For each state variable in the power system, we use § times its real estimate as the injected error. § is set to 1, 1.5, 2, 2.5, and 3 respectively, and the results for the IEEE 14-bus and IEEE 30-bus systems are shown in Fig. 5 and Fig. 6. Here, § = 1 indicates that there is no attack. From the two figures, we can see that the value of NVSI increases as the attack intensity § increases, which reflects that the node becomes unstable.
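A rough numerical illustration of this trend is given below. It evaluates Eq. (14), read as NVSI = 4(RQ_i − XP_i)²/U_j⁴ + 4(XQ_i + RP_i)/U_j², for a single hypothetical branch while scaling the power measurements by the attack intensity. The branch parameters and load values are made up for illustration. Unlike the simulation described above, only P_i and Q_i are scaled while U_j is held fixed; this is a simplifying assumption of the sketch.

```python
def nvsi(u_j, p_i, q_i, r, x):
    """Node voltage stability index of Eq. (14) for one branch (per-unit values)."""
    return (4.0 * (r * q_i - x * p_i) ** 2 / u_j ** 4
            + 4.0 * (x * q_i + r * p_i) / u_j ** 2)


# Hypothetical branch and load data in per unit.
U_J, P_I, Q_I, R, X = 1.0, 0.5, 0.2, 0.01, 0.1

intensities = [1.0, 1.5, 2.0, 2.5, 3.0]  # 1.0 means no attack
values = [nvsi(U_J, s * P_I, s * Q_I, R, X) for s in intensities]

for s, value in zip(intensities, values):
    print(f"intensity {s}: NVSI = {value:.4f}")
```

With the voltage held fixed, both terms of the index grow with the scaled power values, so the computed NVSI rises monotonically with the intensity.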

C. The methods of identifying the node vulnerability level

According to the CSSVC and NVSI values of all reactive power compensation nodes, we can preliminarily identify the vulnerable nodes in the power system. The vulnerable nodes can then trigger an emergency remedial action scheme that reminds the administrator to detect FDIA and take appropriate measures to protect the system. The question is how to identify these vulnerable nodes in the power system; we describe our method next.

Clustering algorithms classify data sources into different clusters according to their similarity. If the vulnerable nodes with larger NVSI and higher CSSVC are grouped into the same cluster, it is not hard to distinguish them, which helps the administrator to roughly locate the injected data. K-means clustering is popular in the data mining field. It is a vector quantization method that originated in signal processing. However, its defect is that it randomly selects K points as the initial cluster centroids, so it is easily trapped in a local optimum. To get better clustering results, we adopt an improved k-means clustering based on particle swarm optimization (PSO), which is efficient at obtaining quality cluster centroids initially. The steps of the method are as follows:

Step 1). Algorithm initialization. K nodes are selected randomly as the initial clustering centers. The encoding of particle $i$ is expressed as follows:
$$par(i) = \{loc[\,], vel[\,], fit\} \tag{15}$$
The position encoding of particle $i$ is expressed as follows:
$$par(i).loc[\,] = [X_{\omega_1}, X_{\omega_2}, \ldots, X_{\omega_K}] \tag{16}$$
where $X_{\omega_j}$ represents the centroid of the $j$-th cluster $\omega_j$ calculated by particle $par(i)$. The velocity encoding of the particle is expressed as follows:
$$par(i).vel[\,] = [V_1, V_2, \ldots, V_K] \tag{17}$$
The location of a particle corresponds to the set of K cluster centers.

Step 2). Fitness calculation. According to the clustering criterion, the fitness of $par(i)$ in the swarm is defined by the equation below:

par (i ). fit 

  ( X j 1 X i 

i

X

( j )

j

Mp

2

) ........................... (18)

Where M p is the input size of clustering algorithm. Step 3). If the iteration number reaches the maximum limitation, then turn to Step 8, if not, turn to Step 4. Step 4). Store the best centroids

2169-3536 (c) 2017 IEEE. Translations and content mining are permitted for academic research only. Personal use is also permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.

This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication. Citation information: DOI 10.1109/ACCESS.2017.2728681, IEEE Access

8 If par (i) gets a smaller fitness, store the particle as the individual optimal solution pid .The position of particle with the minimum fitness among all the particles is stored as pgd .

Where iter is the present iteration number, and itermax is the maximum limitation of iteration number,  max =1 ,.  min  0 Step 7). If the global best of the swarm, pgd , keep stable for a

Step 5). Swarm regeneration

number of iterations, turn to Step 8; if not, turn to Step 3.

The position updating algorithm is as follows: par (i).vel[]   * par (i)vel[]  c1 * r1( pid (i).loc[]  par (i).loc) c2 * r2 ( pgd .loc[]  par (i).loc[]) .................... (19)

Step 8). Use the traditional k-means algorithm to complete the clustering program. G

G

par (i).loc[]  par (i).loc[]  par (i).vel[] ....................... (20)

Where  is the inertia weight of velocity updating. The initial value of  is 1. If par (i) falls beyond the range of possible solutions [ X 0 , X 0 ] ,then the location of par (i) is set min

0

0

min

max

to X or X

37

30

26

25

2

1

1

18

3

G

28

29

27

N

Level 1

N

Level 2

N

Level 3

38

6

17

39

G

16

21

max

2

; similarly if a new generated velocity of 0

0

par (i) is beyond the velocity range [Vmin ,Vmax ] , the new

5

15

5

G

36

24

14 9

generated velocity will be set to V 0 or V 0 . min

3

4

13

6

max

23

12

Step 6). Inertia weight regeneration

19

7 11

According to the following equation to renew the inertia weight  :      max  iter max min ....................................... (21)

31

8

4

10

22

20

33

34

35

32 G

G

G

G

G

Fig.7 The node vulnerability level of IEEE 39-bus system

itermax




Fig.8 The node vulnerability level of IEEE 118-bus system
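The PSO-based initialization followed by standard k-means can be sketched as below. This is only a minimal illustration in plain Python, not the authors' implementation: the function names, the swarm size, $c_1 = c_2 = 2$, the velocity bound, the `patience` used for the "stable global best" check, and the reading of the fitness as the mean squared distance of each point to its nearest centroid are all assumptions made for the example.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def fitness(centroids, data):
    """Eq. (18) as read here: mean squared distance to the nearest centroid."""
    return sum(min(sq_dist(x, c) for c in centroids) for x in data) / len(data)


def clamp(value, low, high):
    """Pull a value back into [low, high]."""
    return max(low, min(high, value))


def pso_centroids(data, k=3, particles=10, iter_max=50, c1=2.0, c2=2.0,
                  w_max=1.0, w_min=0.0, patience=10, seed=0):
    """Steps 1-7: search for K good initial centroids with PSO."""
    rng = random.Random(seed)
    dim = len(data[0])
    lo = [min(x[d] for x in data) for d in range(dim)]
    hi = [max(x[d] for x in data) for d in range(dim)]
    vmax = [hi[d] - lo[d] for d in range(dim)]  # assumed velocity bound
    # Step 1: every particle encodes K centroids drawn at random from the data.
    loc = [[list(x) for x in rng.sample(data, k)] for _ in range(particles)]
    vel = [[[0.0] * dim for _ in range(k)] for _ in range(particles)]
    # Steps 2 and 4: initial fitness, individual bests and global best.
    pbest = [[c[:] for c in p] for p in loc]
    pbest_fit = [fitness(p, data) for p in loc]
    g = min(range(particles), key=pbest_fit.__getitem__)
    gbest, gbest_fit = [c[:] for c in pbest[g]], pbest_fit[g]
    w, stable = w_max, 0
    for it in range(1, iter_max + 1):  # Step 3: stop at the iteration limit
        improved = False
        for i in range(particles):
            for j in range(k):
                for d in range(dim):
                    # Step 5: Eq. (19) velocity and Eq. (20) position updates.
                    r1, r2 = rng.random(), rng.random()
                    v = (w * vel[i][j][d]
                         + c1 * r1 * (pbest[i][j][d] - loc[i][j][d])
                         + c2 * r2 * (gbest[j][d] - loc[i][j][d]))
                    vel[i][j][d] = clamp(v, -vmax[d], vmax[d])
                    loc[i][j][d] = clamp(loc[i][j][d] + vel[i][j][d], lo[d], hi[d])
            # Step 4: keep the individual and global best solutions.
            f = fitness(loc[i], data)
            if f < pbest_fit[i]:
                pbest[i], pbest_fit[i] = [c[:] for c in loc[i]], f
            if f < gbest_fit:
                gbest, gbest_fit, improved = [c[:] for c in loc[i]], f, True
        w = w_max - it * (w_max - w_min) / iter_max  # Step 6: Eq. (21)
        stable = 0 if improved else stable + 1
        if stable >= patience:  # Step 7: global best has stopped changing
            break
    return gbest


def kmeans(data, centroids, max_iter=100):
    """Step 8: standard k-means (Lloyd iterations) from the given centroids."""
    centroids = [c[:] for c in centroids]
    labels = []
    for _ in range(max_iter):
        labels = [min(range(len(centroids)), key=lambda j: sq_dist(x, centroids[j]))
                  for x in data]
        new = []
        for j, c in enumerate(centroids):
            members = [x for x, l in zip(data, labels) if l == j]
            new.append([sum(col) / len(members) for col in zip(*members)]
                       if members else c)
        if new == centroids:
            break
        centroids = new
    return centroids, labels
```

In this sketch each data point would be a (CSSVC, NVSI) pair for one reactive power compensation node, so a call such as `kmeans(data, pso_centroids(data, k=3))` returns three centroids and one cluster label per node. Clamping positions to the bounding box of the data is one simple way to realize the "range of possible solutions" in Step 5.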


We summarize the steps of the vulnerable node identification procedure as follows:

1) Read the system data and calculate the NVSI of each reactive power compensation leading node
Usually, the system administrator can easily obtain the power data from synchronized PMUs or the existing state estimator of the EMS at the control center, so the NVSI can be calculated quickly. The CSSVC is transmitted to the control center in real time. In our work, we use test data obtained from the power simulation package MATPOWER [33], and we calculate the NVSI for two IEEE standard test systems, IEEE 39-bus and IEEE 118-bus.

2) Obtain the high-quality clustering centroids from the improved algorithm, and cluster the system nodes into different classes
In this procedure, we set K=3, which means that all nodes are clustered into three classes. To get better clustering results, we first need three good centroids, each obtained from the PSO algorithm. These high-quality centroids are then used as the initial cluster centroids of the standard k-means algorithm, which clusters the reactive compensation nodes into three classes according to their CSSVC and NVSI. Each class represents one vulnerability level.

3) Identify the node vulnerability level of each class
To identify the vulnerable nodes, we use each class to represent one vulnerability level. Level.1 indicates the most vulnerable level, with higher values of CSSVC and NVSI; Level.2 indicates the vulnerable level; and Level.3 indicates the stable level. We can therefore easily identify the vulnerable nodes according to their vulnerability levels.

We conduct experiments using the measurements at moment t in the IEEE 39-bus and IEEE 118-bus systems.
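As an illustration of step 3), the cluster labels can be turned into vulnerability levels by ranking the cluster centroids. The sketch below is hypothetical: the function name is invented for the example, and ranking clusters by the sum of their centroid's (CSSVC, NVSI) values is an assumption, since the text only states that Level.1 has the higher CSSVC and NVSI.

```python
def vulnerability_levels(centroids, labels):
    """Map cluster labels to levels 1..K, where Level 1 is the most vulnerable."""
    # Assumption: a cluster is "more vulnerable" when the sum of its centroid's
    # (CSSVC, NVSI) coordinates is larger; the inputs are taken to be on
    # comparable scales.
    order = sorted(range(len(centroids)),
                   key=lambda j: sum(centroids[j]),
                   reverse=True)
    level_of = {cluster: rank + 1 for rank, cluster in enumerate(order)}
    return [level_of[label] for label in labels]


# Three clusters and four nodes, with made-up centroid values.
levels = vulnerability_levels([[0.1, 0.1], [0.9, 0.9], [0.5, 0.5]], [0, 1, 2, 1])
print(levels)  # [3, 1, 2, 1]
```

If CSSVC and NVSI have very different ranges, normalizing each dimension before ranking would be a natural refinement.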
First, we inject false data into the measurements of some nodes and calculate the NVSI of each reactive power compensation leading node. Second, we get the PID output of these nodes at moment t. Then, all reactive power compensation leading nodes are clustered into three classes, and their vulnerability levels are determined by the improved algorithm proposed above. The simulation results for the two standard systems are shown in Fig.7 and Fig.8. To make them easier to read, different colors distinguish the node vulnerability levels: red represents the most vulnerable level, Level.1; blue represents Level.2; and green represents Level.3. After the vulnerable nodes are found, the administrator can treat them as suspected targets of false data injection attacks and should immediately apply a precise detection measure.

V. DETECTING FDIA

In Section IV, the most vulnerable nodes are recognized. This is expected to greatly decrease the cost of detecting FDIA, since the recognized vulnerable nodes, which are the most suspicious targets of FDIA, are checked first. Next, we propose an FDIA detection method named State Forecasting Detection (SFD), which consists of two steps.

Step 1. Measurement forecasting
This step obtains both the real measurement and the forecasted measurement at a specific moment. The forecasted measurement is obtained with short-term forecasting technology, which has been used in many fields [35,36]: based on the historical measurements, the next measurement can be forecasted. Short-term prediction methods include the statistical regression method, the time series method, the neural network method, etc. In our scheme, the auto-regressive model is adopted for FDIA detection. Consider two consecutive moments $t-1$ and $t$. Using the measurement $z_{t-1}$ at moment $t-1$, we can estimate the forecasted measurement $\hat{z}_t$ at moment $t$; we can also get the real measurement $z_t$ at moment $t$. The details of Step 1 are as follows.

$$v_t' = G_{t-1} v_{t-1} + Q_{t-1} \tag{22}$$

where $G_{t-1}$ is the state transition matrix, $v_{t-1}$ is the estimated state value and $Q_{t-1}$ is the nonzero diagonal matrix at moment $t-1$. Considering the two sampling times $t-1$ and $t$, we calculate the forecasted measurements at time $t$ as:

$$\hat{z}_t = H v_t' \tag{23}$$

At time $t$, we obtain the measurement vector $z_t$ and detect FDIA by verifying the consistency of the measurements with the forecasted measurements. The premise of the prediction is the assumption that the system measurements at moment $t-1$ are not under any attack, that is, the state variable $v_{t-1}$ is not under attack.

Step 2. Detecting FDIA
Using $z_t$ and $\hat{z}_t$ as the input,

$$\begin{cases} H_0 \text{ assumption: } \dfrac{(z_t - \hat{z}_t)}{R + HH^{T}} < \tau, & H_0 \text{ is true} \\[2ex] H_1 \text{ assumption: } \dfrac{(z_t - \hat{z}_t)}{R + HH^{T}} \ge \tau, & H_0 \text{ is false} \end{cases} \tag{24}$$

where $H_0$ is the null hypothesis and $H_1$ is the alternative hypothesis, $R$ is the diagonal covariance matrix of the measurement, and $\tau$ is the detection threshold. If $H_0$ is true, no attack is found; if $H_0$ is false, FDIA exists. The proposed detection method is verified to be effective and accurate on the IEEE 39-bus and IEEE 118-bus power systems.

VI. PERFORMANCE EVALUATION

The simulation environment is set up using MATPOWER [33], from which the test data is obtained. We construct the false data vectors in a way similar to [14]. After the vulnerable nodes are recognized, we compare SFD with two traditional methods for detecting FDIA on vulnerable nodes: the Largest Normalized Residue (LNR) method and the Object Detection $J(v)$ method. These are chosen because $J(v)$ is the most widely used method for FDIA detection and the proposed SFD method is based on the LNR method.
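Returning to the two SFD steps of Section V, the forecast of Eq. (22) and (23) and the threshold test of Eq. (24) can be sketched in a few lines of plain Python. This is a hedged illustration rather than the authors' code: the function names are hypothetical, the diagonal of $Q_{t-1}$ is treated as an additive vector, and each residual is normalized by the square root of the corresponding diagonal entry of $R + HH^T$, which is only one possible reading of Eq. (24). All numbers are made up.

```python
import math


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]


def forecast_measurement(G, Q_diag, H, v_prev):
    """Eq. (22)-(23): forecast the state, then map it to measurements."""
    # Assumption: the diagonal of Q_{t-1} is added element-wise to G v_{t-1}.
    v_pred = [gv + q for gv, q in zip(matvec(G, v_prev), Q_diag)]
    return matvec(H, v_pred)


def detect_fdia(z, z_hat, H, R_diag, tau):
    """Eq. (24) under an assumed per-measurement normalization."""
    flags = []
    for i, (zi, zhi) in enumerate(zip(z, z_hat)):
        # i-th diagonal entry of R + H H^T
        scale = math.sqrt(R_diag[i] + sum(h * h for h in H[i]))
        flags.append(abs(zi - zhi) / scale >= tau)
    return any(flags), flags


# Toy system: two state variables, three measurements.
G = [[1.0, 0.0], [0.0, 1.0]]
H = [[1.0, 0.0], [0.0, 1.0], [1.0, -1.0]]
z_hat = forecast_measurement(G, [0.01, 0.01], H, [1.0, 0.98])

clean = [1.012, 0.991, 0.021]
tampered = [1.012, 1.30, 0.021]  # second measurement carries injected data
print(detect_fdia(clean, z_hat, H, [1e-4] * 3, tau=0.05)[0])     # False
print(detect_fdia(tampered, z_hat, H, [1e-4] * 3, tau=0.05)[0])  # True
```

The per-measurement flags returned alongside the overall decision indicate which measurements deviate from the forecast, which is useful when only the vulnerable nodes are being checked.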

Fig.9 The detecting results in IEEE 39-bus system

Fig.10 The detecting results in IEEE 118-bus system

Fig.9 and Fig.10 show the detection results in the IEEE 39-bus and IEEE 118-bus systems. The detection rate increases gradually with the false alarm probability. Fig.9 shows that in the IEEE 39-bus system, when the percentage of attacked measurements is low, none of the three methods is effective, and no matter how large the percentage is, none of them detects all the attacks. This is because the system is not large enough for the changes that FDIA causes in the system to be obvious. Among the three methods, the proposed SFD has the highest detection rate at the same attack percentage. Fig.10 shows that in the IEEE 118-bus system, when the percentage of attacked measurements is low, all three methods detect only a small part of the attacks. As the attack percentage increases, the detection rates of the three methods increase. When all the measurements in the system are under attack, the detection rate of SFD reaches 90%. From Fig.9 and Fig.10, we find that SFD has higher accuracy and a higher detection rate. Comparing the two systems, we also find that the detection rate and accuracy are higher in the larger system. In short, the simulation verifies the effectiveness of SFD.

VII. CONCLUSION

To deal with the problem of FDIA in smart grid, which may lead to wrong decisions in power dispatch or electricity market operations, we propose an efficient FDIA detection scheme based on power system physical parameters. First, we analyze the power system and investigate the CSSVC and NVSI to identify the vulnerability level of nodes in the power system. As a result, we define three levels and cluster the nodes into three classes. For the clustering, we use an improved clustering algorithm, which helps us to find the suspected false data injection nodes easily. Then we use the state forecasting method to obtain the states of the power system. In addition, the results of the state forecasting detection method are used to find the sensitive

measurement vectors. In the simulation, we built different types of attack vectors, which give us sufficient experimental results. Finally, the simulation results demonstrate that the proposed mechanism can effectively detect FDIA in smart grid. In future work, we will verify the proposed scheme in larger systems, such as the IEEE 300-bus and IEEE 1354-bus systems.

ACKNOWLEDGMENT

This work is partially supported by the Natural Science Foundation of China under grants 61402171 and 61471328, and the Fundamental Research Funds for the Central Universities under grant 2016MS29, as well as by the US National Science Foundation under grant CNS-1564128.

REFERENCES

[1] X. Fang, S. Misra, G. Xue, and D. Yang, "Smart grid - The new and improved power grid: A survey", IEEE Communications Surveys & Tutorials, vol.14, no.4, pp.944-980, Dec. 2012.
[2] N. Kayastha, D. Niyato, E. Hossain, and Z. Han, "Smart grid sensor data collection, communication, and networking: a tutorial", Wireless Communications and Mobile Computing, vol.14, no.11, pp.1055-1087, Aug. 2014.
[3] X. Hei and X. Du, "Biometric-based two-level secure access control for Implantable Medical Devices during emergencies", in Proc. IEEE INFOCOM, 2011, pp.346-350.
[4] T. Liu, Y. Sun, Y. Liu, Y. Gui, Y. Zhao, D. Wang, and C. Shen, "Abnormal traffic-indexed state estimation: A cyber-physical fusion approach for Smart Grid attack detection", Future Generation Computer Systems, vol.49, no.48, pp.94-103, Aug. 2015.
[5] F. Pasqualetti, F. Dorfler, and F. Bullo, "Cyber-physical attacks in power networks: Models, fundamental limitations and monitor design", in Proc. Decision & Control & European Control Conference, 2011, pp.2195-2201.
[6] S. Liang and X. Du, "Permission-Combination-based Scheme for Android Mobile Malware Detection", in Proc. IEEE ICC, 2014, pp.2301-2306.
[7] X. Hei, X. Du, S. Lin, I. Lee, and O. Sokolsky, "Patient Infusion Pattern based Access Control Schemes for Wireless Insulin Pump System", IEEE Transactions on Parallel & Distributed Systems, vol.26, no.11, pp.3108-3121, Nov. 2015.
[8] Y. Liu, P. Ning, and M. K. Reiter, "False data injection attacks against state estimation in electric power grids", ACM Transactions on Information and System Security, vol.14, no.1, pp.1-33, May 2011.
[9] Z. Guan, P. An, and T. Yang, "Matrix partition-based detection scheme for false data injection in smart grid", International Journal of Wireless and Mobile Computing, vol.9, no.3, pp.250-256, Jan. 2015.
[10] Z. Guan, N. Sun, Y. Xu, and T. Yang, "A comprehensive survey of false data injection in smart grid", International Journal of Wireless and Mobile Computing, vol.8, no.1, pp.27-33, 2015.
[11] E. N. Asada, A. V. Garcia, and R. Romero, "Identifying multiple interacting bad data in power system state estimation", in Proc. IEEE Power Engineering Society General Meeting, 2015, pp.571-577.
[12] T. Zabaiou, L. A. Dessaint, and I. Kamwa, "Preventive control approach for voltage stability improvement using voltage stability constrained optimal power flow based on static line voltage stability indices", IET Generation, Transmission & Distribution, vol.8, no.5, pp.924-934, May 2014.
[13] S. Cui, Z. Han, S. Kar, and T. T. Kim, "Coordinated data-injection attack and detection in the smart grid: A detailed look at enriching detection solutions", IEEE Signal Processing Magazine, vol.29, no.5, pp.106-115, Sept. 2012.
[14] O. Kosut, L. Jia, R. J. Thomas, and L. Tong, "Malicious data attacks on the smart grid", IEEE Transactions on Smart Grid, vol.2, no.4, pp.645-658, Oct. 2011.
[15] S. Li, Y. Yılmaz, and X. Wang, "Quickest detection of false data injection attack in wide-area smart grids", IEEE Transactions on Smart Grid, vol.6, no.6, pp.2725-2735, Dec. 2014.
[16] L. Liu, M. Esmalifalak, Q. Ding, and V. A. Emesih, "Detecting false data injection attacks on power grid by sparse optimization", IEEE Transactions on Smart Grid, vol.5, no.2, pp.612-621, Feb. 2014.
[17] Y. Xiao, H. H. Chen, X. Du, and M. Guizani, "Stream-based Cipher Feedback Mode in Wireless Error Channel", IEEE Transactions on Wireless Communications, vol.8, no.2, pp.662-666, Feb. 2009.
[18] A. Anwar, A. N. Mahmood, and Z. Tari, "Identification of vulnerable node clusters against false data injection attack in an AMI based Smart Grid", Information Systems, vol.53, pp.201-212, Oct. 2015.
[19] R. Deng, G. Xiao, and R. Lu, "Defending against false data injection attacks on power system state estimation", IEEE Transactions on Industrial Informatics, vol.13, no.1, pp.198-207, Aug. 2015.
[20] S. Bi and Y. Zhang, "Graphical methods for defense against false-data injection attacks on power system state estimation", IEEE Transactions on Smart Grid, vol.5, no.3, pp.1216-1227, Apr. 2014.
[21] A. Giani, E. Bitar, M. Garcia, and M. Mcqueen, "Smart grid data integrity attacks", IEEE Transactions on Smart Grid, vol.4, no.3, pp.1244-1253, Apr. 2013.
[22] T. T. Kim and H. V. Poor, "Strategic protection against data injection attacks on power grids", IEEE Transactions on Smart Grid, vol.2, no.2, pp.326-333, Apr. 2011.
[23] V. Kekatos and G. B. Giannakis, "Distributed robust power system state estimation", IEEE Transactions on Power Systems, vol.28, no.2, pp.1617-1626, Oct. 2012.
[24] M. Ozay, I. Esnaola, F. Vural, S. Kulkarni, and H. Poor, "Distributed models for sparse attack construction and state vector estimation in the smart grid", in Proc. IEEE Third International Conference on Smart Grid Communications, 2012, pp.306-311.
[25] Y. Gu, T. Liu, D. Wang, and X. Guan, "Bad data detection method for smart grids based on distributed state estimation", in Proc. IEEE International Conference on Communications (ICC), 2013, pp.4483-4487.
[26] M. Cramer, P. Goergens, and A. Schnettler, "Bad data detection and handling in distribution grid state estimation using artificial neural networks", in Proc. PowerTech, IEEE Eindhoven, 2015, pp.1-6.
[27] O. Kosut, L. Jia, R. J. Thomas, and L. Tong, "Limiting false data attacks on power system state estimation", in Proc. Information Sciences and Systems, 2010, pp.1-6.
[28] D. Wang, X. Guan, T. Gu, C. Chen, and Z. Xu, "Extended distributed state estimation: a detection method against tolerable false data injection attacks in smart grids", Energies, vol.7, no.3, pp.1517-1538, Mar. 2014.
[29] O. Kosut, L. Jia, R. J. Thomas, and L. Tong, "On malicious data attacks on power system state estimation", in Proc. Universities Power Engineering Conference (UPEC), 2010, pp.1-6.
[30] K. Manandhar, X. Cao, F. Hu, and Y. Liu, "Detection of faults and attacks including false data injection attack in smart grid using kalman filter", IEEE Transactions on Control of Network Systems, vol.1, no.4, pp.370-379, Sep. 2014.
[31] Y. Huang, H. Yang, and C. Huang, "Solving the capacitor placement problem in a radial distribution system using Tabu Search approach", IEEE Transactions on Power Systems, vol.11, no.4, pp.1868-1873, Nov. 1996.
[32] G. B. Jasmon and L. Lee, "New contingency ranking technique incorporating a voltage stability criterion", IEE Proceedings-Generation, Transmission and Distribution, vol.140, no.2, pp.87-90, Mar. 1993.
[33] R. Zimmerman and C. Murillo, "MATPOWER - A MATLAB Power System Simulation Package", http://www.pserc.cornell.edu//matpower/, Dec. 2016.
[34] R. C. Eberhart and Y. Shi, "Comparing inertia weights and constriction factors in particle swarm optimization", in Proc. Congress on Evolutionary Computation, 2000, pp.84-88.
[35] Z. Wang and Z. Huang, "An Analysis and Discussion on Short-Term Traffic Flow Forecasting", Systems Engineering, vol.21, no.6, pp.97-100, Jun. 2003.
[36] A. Ding, X. Zhao, and L. Jiao, "Traffic flow time series prediction based on statistics learning theory", in Proc. Intelligent Transportation Systems, Sept. 2002, pp.727-730.
