Anomaly-Based Intrusion Detection for SCADA Systems

Dayu Yang, Alexander Usynin, and J. Wesley Hines
Department of Nuclear Engineering, University of Tennessee, Knoxville, TN 37996
[email protected], [email protected], [email protected]

Abstract – Most critical infrastructure, such as chemical processing plants, electrical generation and distribution networks, and gas distribution systems, is monitored and controlled by Supervisory Control and Data Acquisition (SCADA) systems. These systems have been the focus of increased security concerns, including fears that they could be the target of international terrorists. With the constantly growing number of internet-related computer attacks, there is evidence that our critical infrastructure may also be vulnerable. Researchers estimated that malicious online actions could cause $75 billion in damages in 2007. One promising countermeasure for enhancing information system security is intrusion detection. This paper briefly reviews the history of research in intrusion detection techniques and introduces the two basic detection approaches: signature detection and anomaly detection. Finally, it presents the application of techniques developed for monitoring critical process systems, such as nuclear power plants, to anomaly-based intrusion detection. The method uses an autoassociative kernel regression (AAKR) model coupled with the sequential probability ratio test (SPRT), applied to a simulated SCADA system. The results show that these methods can be used to detect a variety of common attacks.

I. BACKGROUND

Any action that a user is not legally allowed to take against an information system is called an intrusion, and intrusion detection is the process of detecting and tracing inappropriate, incorrect, or anomalous activity targeted at computing and networking resources. The idea of intrusion detection appeared in 1980 [1], and an early abstract intrusion detection model was proposed in 1987 by Denning [3]. With the widespread use of the Internet and the growing number of intrusions, intrusion detection systems (IDSs) have recently gained considerable interest. IDSs "are based on the beliefs that an intruder's behavior will be noticeably different from that of a legitimate user and that many unauthorized actions are detectable" [2]. IDSs detect suspicious activities that may compromise system security and alert the system administrator to respond to the threat. An IDS is therefore not a preventive security system but rather an alarm system that works together with other passive information assurance processes as an important element of a layered defense. IDSs are classified as either signature detection systems or anomaly detection systems. Signature detection recognizes an intrusion based on known intrusion or attack characteristics, or signatures. Anomaly detection identifies an intrusion by calculating a deviation from normal system behavior. Signature detection attempts to identify intruders who are trying to break into an information system with known techniques. The detection decision is made based

on the knowledge of the models of intrusive processes and what traces the detector should find in the observed system. Signature intrusion detection systems (SIDSs) detect the evidence of intrusive activities without considering what the background traffic looks like. They only need to look for patterns or clues that the designer assumes may be associated with an intrusion. "First generation SIDSs used rules to describe what security administrators looked for within the system. Large number of rules accumulated and proved to be very difficult to interpret and modify because they were not necessarily grouped by intrusion scenario" [4]. Thus, second generation SIDSs introduced rule-based and state-based intrusion scenario representations that lend themselves to intrusion anticipation [4]. Anomaly detection methods assume that anything abnormal is suspicious, so anomaly intrusion detection systems (AIDSs) monitor for abnormalities in the network traffic. An AIDS must therefore define for itself what should be considered an intrusion. SIDSs work well in identifying intrusions that are known to them, but they have high false negative rates when a novel or disguised intrusion occurs. An AIDS tracks behavior and updates its base profile by "learning" from continuous monitoring. An experienced intruder can take advantage of this and gradually shift the baseline: an attacker can craft network traffic that trains the IDS to accept intrusion behavior. AIDSs are also known to produce many false positives that burden the system administrators if the

system sensitivities are set too high, while a low threshold could let an actual attack pass.

Industry networks are an important part of the rapidly expanding computer networks around the world. They are the instrumentation, control, and automation networks that exist within three industrial domains: Chemical Processing, Utilities, and Discrete Manufacturing [5]. There are three traditional areas of industry network security: physical security, personnel security, and cybersecurity. Here, industry network security refers to the cybersecurity of industry networks, which "covers prevention, detection, and mitigation of accidental or malicious acts on or involving computers and networks" [5]. An important task of industry network security identified by DHS is to protect Supervisory Control and Data Acquisition (SCADA) systems. SCADA "is the technology that enables a user to collect data from one or more distant facilities and to send control instructions to those facilities" [10]. SCADA systems allow a processing center to control a widely distributed process such as an oil pipeline system or a power grid. SCADA systems are prime targets of intruders and are vulnerable to many attacks, including attempted break-ins, penetration by legitimate users, leakage by legitimate users, Trojan horses, viruses, logic bombs, denial-of-service attacks, and so on. "It is believed that the characterization of the hardware, software, as well as the user behavior is necessary in order to monitor system integrity" [12]. By defining what the normal characterization should be, the intrusion detection problem becomes a fault detection problem. In this way, several on-line monitoring techniques developed for performance assessment can be applied to anomaly intrusion detection. In this paper, an anomaly-based intrusion detection technique is used to monitor a simulated SCADA system. This AID technique should be well suited to SCADA systems because their normal operations are fairly stationary and repetitive, unlike most internet applications, which have fairly chaotic network traffic and usage.

II. CURRENT RESEARCH

Condition monitoring techniques that were developed to monitor nuclear reactors and other complex systems were applied to the IDS problem. The technique currently under study makes use of a nonparametric technique to model normal computer operations and identify abnormal operations.

The current research is funded by Idaho National Laboratory (INL), and we have teamed with researchers at Sun Microsystems who have applied condition monitoring techniques to computer anomaly detection, although not for intrusion detection. An experimental setup has been constructed to simulate a SCADA system consisting of several Sun servers and workstations on a local network. Data and actions at locations of interest were recorded using the Continuous System Telemetry Harness (CSTH) developed by the Dynamics, Characterization and Control group at Sun Microsystems [13, 14]. This technology was used to monitor the server activity and build an initial base profile of its normal working status. An initial IDS was developed using this database and a MATLAB-based Process and Equipment Monitoring (PEM) toolbox [6] that was developed for monitoring complex systems. These tools are used to monitor future operations and detect anomalous or suspicious activity.

II.A Methodology

Server and network management tools provide information about network traffic and hardware operating statistics. Information from the network devices and system hardware can be passively monitored and used to characterize server and network behavior. By comparing current behavior with previously characterized normal behavior, anomaly detection can be achieved. The method used in this project falls into one of the most common classes of network anomaly detection, pattern matching. Pattern matching detects anomalies by analyzing deviations from normal behavior. First, an online training process is necessary to build a traffic and usage profile for a given network: the system's normal traffic is used to train a model that identifies normal behavior. Traffic profiles are created from symptom-specific feature vectors, called system indicators, such as link utilization, CPU usage, and login failures. These profiles are then classified by time of day, day of week, and special days, such as weekends and holidays. When new traffic data fails to fit within a predetermined confidence interval of the stored profiles, an alarm is triggered (a simple sketch of this profile-and-threshold scheme is given after the setup description below). These anomaly detection methods seem especially applicable to SCADA system security, since SCADA systems are characterized by routine and repetitious activities.

II.A.1. Setup

An experimental setup consisting of several Sun servers and workstations has been constructed. A number of break-in methods have been used to attempt to compromise this small network.
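To make the profile-and-threshold idea concrete, the following is a minimal sketch, not the PEM toolbox or the CSTH, of how hourly profiles of a few system indicators might be built from normal data and how new observations that fall outside a simple three-sigma band could be flagged. The indicator values, the hourly grouping, and the three-sigma band are illustrative assumptions.

```python
import numpy as np

def build_profiles(timestamps_hour, observations):
    """Group normal observations by hour of day and store per-indicator mean and std."""
    profiles = {}
    for hour in range(24):
        rows = observations[timestamps_hour == hour]
        if len(rows) > 1:
            profiles[hour] = (rows.mean(axis=0), rows.std(axis=0, ddof=1))
    return profiles

def is_anomalous(hour, x, profiles, n_sigma=3.0):
    """Flag an observation whose indicators leave the +/- n_sigma band for that hour."""
    if hour not in profiles:
        return False                      # no baseline for this hour; do not alarm
    mean, std = profiles[hour]
    return bool(np.any(np.abs(x - mean) > n_sigma * np.maximum(std, 1e-9)))

# Illustrative usage with synthetic "CPU usage" and "link utilization" indicators
rng = np.random.default_rng(0)
hours = rng.integers(0, 24, size=1000)
normal = rng.normal([0.30, 0.20], [0.05, 0.04], size=(1000, 2))
profiles = build_profiles(hours, normal)
print(is_anomalous(10, np.array([0.31, 0.22]), profiles))  # typical observation
print(is_anomalous(10, np.array([0.95, 0.90]), profiles))  # far outside the band
```

In the system studied here, this simple banding is replaced by the AAKR model and the SPRT described in Section II.A.4.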

[Figure 1 diagram: internet connection to subnet 1 and subnet 2, with a server and three clients on the local network]

Figure 1 Experimental network diagram

II.A.2. Data collection

The data for the initial analysis were obtained from existing auditing systems within the Sun server, which provide the server's I/O flows and hardware working statistics. In future research, a free program called NETTOP that measures end-to-end statistics may be used to monitor network traffic. This software is generally used in conjunction with a 'mirror' port (a port that allows one to see all traffic from all hosts, legitimately) in a switched network and would provide accurate and detailed information on traffic received on a regular interface. The most important source of network traffic statistics is the Simple Network Management Protocol (SNMP). The SNMP server maintains management information base (MIB) variables that provide information specific to the individual network devices.

II.A.3. Intrusion methods

To demonstrate how the proposed intrusion detection system works, we briefly describe a scenario of denial-of-service (DoS) attacks against the server. The denial-of-service attacks include:
1) ping flood
2) jolt2 attacks
3) bubonic attacks
4) simultaneous jolt2 and bubonic attacks

II.A.4. Modeling methods: detection and isolation

Figure 2 presents a simple diagram of the anomaly detection system. This model is based on the hypothesis that security violations should change the system usage and that these changes can be detected. The input vector x consists of predetermined features that represent network behavior. These observations are used by an autoassociative kernel regression (AAKR) model to predict the "correct" versions of the input. Corrected versions are constructed by comparing the current observations with past observations that denote normal behavior. Prediction residuals are formed by comparing the observations with the model predictions. These residuals contain deviations from normality that may be indicative of anomalous behavior. A binary hypothesis technique called the sequential probability ratio test (SPRT) [9] is applied to the residuals to "determine if the residual sequence is more probably generated from a normal or an anomalous distribution" [8].
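As an illustration of this modeling step, the following is a minimal sketch of an AAKR model: it stores exemplar observations of normal behavior, weights them with a Gaussian kernel according to their distance from a new observation, and returns a weighted-average "corrected" prediction whose residual is then passed to the SPRT. The NumPy implementation and the bandwidth value are assumptions for illustration, not the authors' code.

```python
import numpy as np

class AAKR:
    """Autoassociative kernel regression over exemplar observations of normal behavior."""

    def __init__(self, exemplars, bandwidth=1.0):
        self.X = np.asarray(exemplars, dtype=float)   # memory matrix of normal exemplars
        self.h = bandwidth                            # Gaussian kernel bandwidth

    def predict(self, x):
        """Return the 'corrected' version x' of observation x."""
        x = np.asarray(x, dtype=float)
        d = np.linalg.norm(self.X - x, axis=1)        # step 1: distance to each exemplar
        w = np.exp(-(d ** 2) / (2.0 * self.h ** 2))   # step 2: Gaussian kernel similarity
        w = w / (w.sum() + 1e-12)                     # normalize the weights
        return w @ self.X                             # step 3: weighted average of exemplars

    def residual(self, x):
        """Prediction residual r = x - x', the quantity fed to the SPRT module."""
        return np.asarray(x, dtype=float) - self.predict(x)
```

The bandwidth and the choice of exemplars control how tightly the corrected predictions track normal behavior; here the exemplars would come from the normal-condition training observations described in the results section.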

[Figure 2 block diagram: New Observations x → AAKR Model → Predictions of Inputs x'; the observations and predictions feed a Comparison Module that produces Prediction Residuals r, which the SPRT Module turns into Detection Decisions]

Figure 2 Anomaly detection system diagram

AAKR is a nonparametric, empirical modeling technique that uses historical, exemplar observations to make predictions (corrections) for new observations. In Figure 2, the inputs x of the AAKR model are the current features, while the outputs x' are the corrected features developed from historically allowed features. There are three major steps in the AAKR prediction process [11]: distance calculation, similarity quantification, and output estimation. In distance calculation, the distance between a new observation and each exemplar input is measured to determine how far the new input is from the exemplar inputs. These distances are then converted to weights using a kernel function that represents the similarity of the new observation to each of the input exemplars; the most commonly used kernel function is the Gaussian kernel. In the final step, the similarities are combined with the input exemplars to give an estimate of the input. The SPRT was first introduced by Wald [9]. The SPRT tests whether a new observation is more likely to be in a normal mode H_0 or in an abnormal mode H_1. The general procedure for the SPRT is to first calculate the likelihood ratio, given by the following equation, where {x_n} is a sequence of n consecutive observations of x [8]:

$$ L_n = \frac{P(\{x_n\} \mid H_1)}{P(\{x_n\} \mid H_0)} $$
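As an illustration only, the sketch below accumulates the log-likelihood ratio for a residual sequence under the common assumption that the residuals are Gaussian with zero mean under H_0 and a shifted mean m under H_1 (shared variance), and compares it against Wald's stopping bounds derived from the false alarm probability α and missed alarm probability β discussed next. The parameter values are assumptions, not values taken from the paper's implementation.

```python
import math

def sprt(residuals, m=1.0, sigma=1.0, alpha=0.01, beta=0.10):
    """Sequential probability ratio test on a residual sequence.

    H0: residuals ~ N(0, sigma^2); H1: residuals ~ N(m, sigma^2).
    Returns 'H1' (anomalous), 'H0' (normal), or 'undecided'.
    """
    upper = math.log((1.0 - beta) / alpha)   # accept H1 above this bound
    lower = math.log(beta / (1.0 - alpha))   # accept H0 below this bound
    log_lr = 0.0
    for r in residuals:
        # log of N(r; m, sigma) / N(r; 0, sigma) simplifies to (m/sigma^2)(r - m/2)
        log_lr += (m / sigma**2) * (r - m / 2.0)
        if log_lr >= upper:
            return "H1"
        if log_lr <= lower:
            return "H0"
    return "undecided"
```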

Then a threshold that takes the false alarm probability (α) and missed alarm probability (β) into consideration is set to make the final decision.

II.B Results

II.B.1. Data collected

The system currently collects 62 variables representing kernel statistics and I/O throughput. These variables are listed in the Appendix. After initial intrusion testing, most

of the monitored variables were found to be unrelated to the current methods of attack. Only five variables from the server audit record, SB0-PROC-1-system, SB0-PROC-1-idle, SB0-PROC-0-system, SB0-PROC-0-idle, and Loadavg_1min, were chosen as anomaly indicators because their states are highly correlated with the actual attack. The remaining parameters may be related to other attack types and will therefore still be monitored. Indicator 1 is the usage of processor 1 on system board zero and Indicator 2 is the idle time for the same processor. Indicators 3 and 4 are the corresponding pair for processor 0 on the same system board, while Indicator 5 is the one-minute load average.

Figure 3 Training data of five indicators.

Figure 4 Testing data of five indicators.

The system training set consists of 1000 observations selected from normal conditions. The test set consists of 300 observations made under both normal conditions and during the attacks.

II.B.2 Modeling results

The Process and Equipment Monitoring (PEM) toolbox [6], a MATLAB-based set of tools developed at the University of Tennessee that provides a generalized set of functions for process and equipment monitoring applications, specifically online monitoring (OLM) systems, was used to detect deviations from normality. Figures 3 and 4 show the training and testing data. The training data are used for model development and the test data are used to evaluate the model's performance. Figures 5-9 present the residual between the model input and its prediction in the upper plot and the SPRT detection output in the lower plot. The residual plot shows the deviation from normality, while the SPRT plot determines whether that deviation is statistically significant. The SPRT false alarm probability (α) is set to 1% and the missed alarm probability (β) to 10%. The "o" markers at level 1 denote alarms corresponding to abnormal input vectors. This series of plots clearly shows that the chosen indicators capture the trends in the abnormality when the attack starts.
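Assuming the AAKR class and sprt function sketched earlier, the following hypothetical snippet shows how the monitoring loop described here might be wired together: train on a set of normal observations of the five indicators, compute residuals for the test observations, and run the SPRT on each indicator's residual sequence with α = 1% and β = 10%. The data are synthetic stand-ins; in the actual study the indicators come from the server audit records.

```python
import numpy as np

# Assumes the AAKR class and sprt() function from the earlier sketches are in scope.
rng = np.random.default_rng(1)

# Synthetic stand-ins for the five indicators: 1000 normal training observations
train = rng.normal(0.0, 1.0, size=(1000, 5))
model = AAKR(train, bandwidth=2.0)

# 300 test observations: normal at first, then a simulated attack shifting two indicators
test = rng.normal(0.0, 1.0, size=(300, 5))
test[200:, [0, 4]] += 4.0

residuals = np.array([model.residual(x) for x in test])

# Run the SPRT per indicator, separately on the pre-attack and attack segments
for i in range(residuals.shape[1]):
    pre = sprt(residuals[:200, i], m=2.0, sigma=1.0, alpha=0.01, beta=0.10)
    post = sprt(residuals[200:, i], m=2.0, sigma=1.0, alpha=0.01, beta=0.10)
    print(f"indicator {i + 1}: before attack -> {pre}, during attack -> {post}")
```

With these assumed values, the shifted indicators should drive the log-likelihood ratio through the upper bound during the attack segment, while the unaffected indicators should remain in the normal region.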

Figure 5 DoS attack. Abnormality indicator 1 SB0-PROC-1-system.

Figure 6 DoS attack. Abnormality indicator 2 SB0-PROC-1-idle.

Figure 7 DoS attack. Abnormality indicator 3 SB0-PROC-0-system.



Figure 8 DoS attack. Abnormality indicator 4 SB0-PROC-0-idle.

Figure 9 DoS attack. Abnormality indicator 5 Loadavg_1min.

II.B.3 A further result: insider attack

So far, only intrusions made by outsiders have been considered. However, "more than 70% of all corporate hacking is from the inside the firewall" [7]. "Traditional IT security techniques focus on threats from outside the organization. As we noted earlier, this may not be the primary risk for process control security" [7]. Network security research shows that insider attacks have a much higher success rate than outsider attacks, so a more important potential security risk may be intrusions from insiders. People who have access to the facilities and/or the SCADA systems may take illegal actions intentionally or by mistake. Insider attacks are different from external attacks. The intruder has already been granted access to the network and may have some knowledge of the network architecture, including the location of targeted files or system vulnerabilities. The intruder should have no problem obtaining the privileges needed to mount an attack, and is likely skilled enough to mount a credible one. Thus, intrusion from the inside is even harder to detect. Some possible security violations from the inside are:
1) Penetration by a legitimate user - a user might try to access unauthorized files or programs to take control of a system to which he is not entitled.
2) Leakage by a legitimate user - a user might try to steal sensitive documents and send them out of the internal network by email, disc, or remote printer.
3) Sabotage by a legitimate user - a disgruntled employee might place a logic bomb or virus to erase the production software programs.

The empirical monitoring technique should be well suited to detecting insider attacks. Its method of identifying normal states can be applied to different network users, who have different access levels and require different amounts of access to different services, servers, and systems for their work. By developing the monitoring system with data generated from normal day-to-day work activities, abnormalities, such as workers accessing services outside their normal levels, will be detected. To illustrate how the fault detection model can detect an insider attack, an unusual process was added to a SCADA system to simulate a malicious inside action while several regular processes were running. In a real attack, this unusual process could be a logic bomb, a tool to sabotage the file system, or an illegal operation against the operating system. In this simulation, the attack was launched from 10:11 am to 10:40 am and the test data were taken from 7:20 am to 10:40 am. Figures 10 through 13 show that around 10:10 am the indicators begin to give alarms of abnormal system status.

Figure 10 Insider attack. Abnormality indicator 6 SB0-PROC-1-user.

Figure 11 Insider attack. Abnormality indicator 9 SB0-PROC-1-idle.

Figure 12 Insider attack. Abnormality indicator 10 SB0-PROC-0-user.

Figure 13 Insider attack. Abnormality indicator 13 SB0-PROC-0-idle.

Variable 6, SB0-PROC-1-user, is the usage of CPU No. 1, and Variable 9, SB0-PROC-1-idle, is the percentage of time CPU No. 1 is idle. Variable 10, SB0-PROC-0-user, is the usage of CPU No. 0, and Variable 13, SB0-PROC-0-idle, is the percentage of time CPU No. 0 is idle. These results clearly show that the fault detection model is capable of detecting a wide range of intrusions, from outside network attacks to inside host misuse.

Many of the initial experiments demonstrated areas in which our research methodology was lacking. In some cases the intrusion was very apparent because the computer system load and processes were near steady state, so any change was easy to identify visually. These experimental shortcomings have been iteratively removed, and we are still improving the SCADA system simulator to produce more realistic and challenging scenarios.

III. CONCLUSIONS

This paper has introduced a method that applies autoassociative kernel regression (AAKR) empirical modeling and the SPRT to SCADA system intrusion detection. The experiments demonstrate that this methodology can quickly detect anomalous behavior. Different intrusion methods will require different indicators, so an important system requirement is that it monitor a large number of potentially valuable variables. Therefore, the next step of this research is to identify an optimal set of indicators associated with known and potential abnormalities. The many insider attack cases studied in [5] make it clear that although insider attacks are more difficult to detect, there is an increasing need to detect them. Future research in this project will focus on developing methods to detect changes in SCADA system computer performance that may be attributed to insider intrusion.

ACKNOWLEDGMENTS

This work is funded by Idaho National Laboratory (INL), Contract No./Award No. 00050837. We also acknowledge Sun Microsystems for their in-kind contributions.

REFERENCES

[1] J.P. Anderson, Computer Security Threat Monitoring and Surveillance, Technical Report, James P. Anderson Co. (1980).
[2] B. Mukherjee, L.T. Heberlein, and K.N. Levitt, "Network Intrusion Detection", IEEE Network, Vol. 8, Issue 3, May-June 1994, pp. 26-41.
[3] D.E. Denning, "An Intrusion Detection Model", IEEE Transactions on Software Engineering, Vol. SE-13, pp. 222-232, February 1987.
[4] A.K. Jones and R.S. Sielken, "Computer System Intrusion Detection: A Survey", http://www.cs.virginia.edu/~jones/IDSresearch/Documents/jones-sielken-survey-v11.pdf
[5] D.J. Teumim, Industrial Network Security, ISA-The Instrumentation, Systems, and Automation Society (December 24, 2004).
[6] J.W. Hines and D. Garvey, "The Development of a Process and Equipment Monitoring (PEM) Toolbox and its Application to Sensor Calibration Monitoring", The Fourth International Conference on Quality and Reliability, 9-11 August 2005, Beijing, P.R. China.
[7] E.J. Byres, "Can't Happen at Your Site? Network Security on the Plant Floor", InTech Magazine, Instrumentation Systems and Automation Society, Research Triangle Park, NC, pp. 20-22, February 2002.
[8] J.W. Hines and D. Garvey, "Development and Application of Fault Detectability Performance Metrics for Instrument Calibration Verification and Anomaly Detection", Journal of Pattern Recognition Research, Vol. 1, 2006.
[9] A. Wald, Sequential Analysis, John Wiley & Sons, New York, NY, 1947.
[10] S.A. Boyer, SCADA: Supervisory Control and Data Acquisition, 3rd edition, ISA-The Instrumentation, Systems, and Automation Society (June 2004).

[11] J.W. Hines, D. Garvey, R. Seibert, A. Usynin, and S. Arndt, "Technical Review of On-line Monitoring Techniques for Performance Assessment: Part II Theoretical Issues", NUREG/CR-6895, Submitted for Review to the U.S. Nuclear Regulatory Commission (NRC) (2005).
[12] B. Yu, C. Howey, and E.J. Byres, "Monitoring Controller's DNA Sequence for System Security", ISA Emerging Technologies Conference, Instrumentation Systems and Automation Society, Houston, September 2001.
[13] K. Whisnant, K.C. Gross, and N. Lingurovska, "Proactive Fault Monitoring in Enterprise Servers", in Proceedings of the 2005 International Conference on Computer Design, pp. 3-10, June 2005.
[14] K. Gross, "Continuous System Telemetry Harness", Tech. Rep., [Online] Available: research.sun.com/sunlabsday/docs.2004/talks/1.03_Gross.pdf, 2005.

APPENDIX: List of Monitored Parameters

Physical Parameters (2)
------------------------------------
CPU0_DIE_TEMPERATURE_SENSOR, CPU1_DIE_TEMPERATURE_SENSOR

Kernel Statistics (14)
--------------------------
FreeMem, SwapAlloc, SwapAvail, SwapFree, SwapResv, SB0-PROC-1-user, SB0-PROC-1-system, SB0-PROC-1-wait, SB0-PROC-1-idle, SB0-PROC-0-user, SB0-PROC-0-system, SB0-PROC-0-wait, SB0-PROC-0-idle, Loadavg_1min

IO Statistics (46)
----------------------------
cpu-user, cpu-system, cpu-wait, cpu-idle, c0t6d0-reads_per_sec, c0t6d0-writes_per_sec, c0t6d0-kb_reads_per_sec, c0t6d0-kb_writes_per_sec, c0t6d0-wait_length, c0t6d0-active_length,

c0t6d0-avg_wait_time, c0t6d0-avg_svc_time, c0t6d0-percent_wait, c0t6d0-percent_busy, c0t6d0-soft_errors, c0t6d0-hard_errors, c0t6d0-transport_errors, c0t6d0-total_errors, c1t1d0-reads_per_sec, c1t1d0-writes_per_sec, c1t1d0-kb_reads_per_sec, c1t1d0-kb_writes_per_sec, c1t1d0-wait_length, c1t1d0-active_length, c1t1d0-avg_wait_time, c1t1d0-avg_svc_time, c1t1d0-percent_wait, c1t1d0-percent_busy, c1t1d0-soft_errors, c1t1d0-hard_errors, c1t1d0-transport_errors, c1t1d0-total_errors, c1t0d0-reads_per_sec, c1t0d0-writes_per_sec, c1t0d0-kb_reads_per_sec, c1t0d0-kb_writes_per_sec, c1t0d0-wait_length, c1t0d0-active_length, c1t0d0-avg_wait_time, c1t0d0-avg_svc_time, c1t0d0-percent_wait, c1t0d0-percent_busy, c1t0d0-soft_errors, c1t0d0-hard_errors, c1t0d0-transport_errors, c1t0d0-total_errors
