Limiting Uncertainty in Intrusion Response

Curtis A. Carver, John M.D. Hill, Member, IEEE, and Udo W. Pooch, Senior Member, IEEE

Abstract-- This paper explores techniques for limiting uncertainty in adaptive intrusion response systems and specifically in the Adaptive, Agent-based Intrusion Response System (AAIRS). Research by Cohen has explored the inadequacy of manual intrusion response and the need for automatic intrusion response. There is uncertainty in automatic intrusion response. Intrusion detection systems generate false alarms. The success or failure of a response is often not clear. Attackers attempt to mask their attacks so as to confuse the response system until it is too late. Automatic response systems must limit the effect of uncertainty in their internal decision-making and adapt over time to make better decisions. This paper addresses these issues by examining the AAIRS system and its techniques for limiting uncertainty.

Index Terms-- Intrusion Response, Adaptive Systems, Computer Security, Intelligent Agents

I. INTRODUCTION

The number of information warfare attacks is increasing, and the attacks themselves are becoming more sophisticated. Annual reports from the Computer Emergency Response Team (CERT) indicate a significant increase in the number of computer security incidents each year. Figure 1 depicts the rise in computer security incidents, with six incidents reported in 1988 and 8,268 in 1999 [1]. Not only are these attacks becoming more numerous, they are also becoming more sophisticated. The 1998 CERT Annual Report notes the growing use of "widespread attacks using scripted tools to control a collection of information-gathering and exploitation tools" [2]. The 1999 CERT Distributed Denial of Service Workshop likewise reports the growing use of automated scripts that launch and control tens of thousands of attacks against one or more targets. Each attacking computer has limited information on who is initiating the attack and from where [3]. The threat of sophisticated computer attacks is growing.

Unfortunately, intrusion detection and response systems have not kept up with this increasing threat. Current intrusion detection systems (IDSs) have limited response mechanisms that are inadequate given the current threat. While IDS research has focused on better techniques for intrusion detection, intrusion response remains principally a manual process. The IDS notifies the system administrator that an intrusion has occurred or is occurring, and the system administrator must respond to the intrusion.

All authors are with the Department of Computer Science, Texas A&M University, College Station, Texas 77843 (email: {carverc, hillj, pooch}@cs.tamu.edu).

Regardless of the notification mechanism employed, there is a delay between detection of a possible intrusion and response to that intrusion. This delay in notification and response, ranging from minutes to months, provides a window of opportunity for attackers to exploit. Cohen explored the effect of reaction time on the success rate of attacks using simulations [4]. The results indicate that if skilled attackers are given ten hours after they are detected before a response, they will be successful 80% of the time. If they are given twenty hours, they will succeed 95% of the time. At thirty hours, the attacker almost never fails. The simulation results were also correlated against the skill of the defending system administrator. The results indicate that if a skilled attacker is given more than thirty hours, the skill of the system administrator becomes irrelevant - the attacker will succeed. On the other hand, if the response is instantaneous, the probability of a successful attack against a skilled system administrator is almost zero. Response time is a fundamental factor in whether or not an attack is successful.

For the response to be successful against a skilled attacker, the response system must adapt its tactics so that it does not always respond with a static defense. Attackers would otherwise simply adapt their approach so as to circumvent the defense. An adaptive, automated intrusion response system provides the best possible defense and shortens or closes this window of opportunity until the system administrator can take an active role in defending against the attack.

Adaptive, automated intrusion response systems must resolve uncertainties in intrusion detection and response in order to be effective. Intrusion detection systems generate false alarms, and no current intrusion detection system is perfect. This introduces uncertainty into the formulation of a response. Is the system really under attack or is this a false alarm? The response system must temper the response according to the system's trust in the detection system. The response system itself also generates uncertainty that must be addressed. The success or failure of a response is often not clear. Did the response stop the attack or is the attacker continuing the attack? Is this new incident report part of an ongoing attack or is it a new attack? The response system must adapt over time and measure the success of its responses so as to limit uncertainty. This paper addresses these issues.

Figure 1: CERT Reported Incidents per Year (incidents vs. year, 1988-1999)

II. RELATED WORK

In the past seventeen years, there have been a number of intrusion detection and intrusion response tools developed (see Table 1). The response systems can be categorized as notification systems, manual response systems, or automatic response systems. The majority of intrusion detection and response systems are notification systems, i.e., systems that only generate reports and alarms. Some systems provide the additional capability for the system administrator to initiate a manual response from a limited preprogrammed set of responses. While this capability is more useful than notification alone, there is still a time gap between when the intrusion is detected and when a response is initiated. Automatic response systems immediately respond to an intrusion through pre-programmed responses. With three exceptions, all of these automatic response systems use a simple decision table where a particular response is associated with a particular attack. If the attack occurs, the preprogrammed response executes. This preprogrammed response was predominantly the execution of a single command or action instead of the invocation of a series of actions to limit the effectiveness of the attacker.

Table 1: Classification of Intrusion Response Systems

Intrusion Response Classification   # of Systems
Notification                        31
Manual Response                     8
Automatic Response                  17
Total                               56

Notification, manual response, and automatic response systems with a pre-programmed response do not address uncertainty in intrusion response. As such, this paper focuses on adaptive automatic intrusion response systems.

Cooperating Security Managers (CSM) is a distributed, host-based intrusion detection and response system. CSM proactively detects intrusions without using a central director. CSM reactively responds to intrusive behavior using the Fisch DC&A taxonomy to classify the attack as well as the suspicion level assigned to the user by the intrusion detection system. As the suspicion level changes, CSM employs one of eight different response sets, each of which consists of one or more of fourteen different response actions. CSM continues to respond to intruder actions until the intruder leaves the system, at which point the suspicion level is reset to zero [5, 6]. CSM's suspicion level is the principal means of addressing intrusion response uncertainty. As it increases, the probability that the system is under attack and this is not a false alarm increases. CSM has no long-term technique for using the false alarm rate of the intrusion detection system(s) to temper the response. CSM also has no mechanisms for measuring the success of the response or determining if an incident report is part of an ongoing attack or is a new attack.

Event Monitoring Enabling Responses to Anomalous Live Disturbances (EMERALD) is a distributed misuse and anomaly intrusion detection system. It is intended for large-scale heterogeneous computing environments. The EMERALD architecture consists of hierarchical collections of monitors. Monitors contain an expert system that receives reports from the analysis components and invokes various response handlers.

Figure 2: AAIRS Methodology (components: Monitored System, Intrusion Detection System, Interface, Master Analysis, Analysis, Response Taxonomy, Policy Specification, Response Toolkit, System Admin Tool)

The possible responses are defined in the resource object with two associated metrics that delimit their usage: a threshold metric and a severity metric. The threshold metric defines the degree of intrusive evidence necessary to use the response. The severity metric defines how harsh a particular response is [7, 8]. The threshold and severity metrics are EMERALD's primary mechanisms for addressing intrusion response uncertainty. As the threshold metric increases, there is greater evidence that the system is under attack and more severe response tactics can be employed. EMERALD has no long-term technique for using the false alarm rate of the intrusion detection system(s) to temper the response. EMERALD also has no mechanisms for measuring the success of the response or determining if an incident report is part of an ongoing attack or is a new attack.

This research addresses these open issues in resolving uncertainty in intrusion response. The Adaptive, Agent-based Intrusion Response System (AAIRS) uses the false alarm rate of an intrusion detection system to temper the response according to the belief that the system is really under attack. It has explicit methods for measuring the success of a response and adapting the response plan based on the success or failure of the plan. Finally, AAIRS uses several metrics to determine if an incident report is part of an ongoing attack or is a new attack. Each of these techniques for limiting uncertainty is discussed below.

III. METHODOLOGY

A. Overview

The AAIRS methodology is summarized in Figure 2. Multiple intrusion detection systems monitor a computer system and generate intrusion alarms. Interface agents translate IDS detection messages into a common message format and maintain a model of each IDS based on the number of false positives and false negatives it has previously generated. Each Interface agent uses this model to generate an attack confidence metric and passes this metric, along with the intrusion alarm, to the Master Analysis agent. The Master Analysis agent classifies whether the incident is a continuation of an existing incident or is a new attack. If it is a new attack, the Master Analysis agent creates a new Analysis agent to develop a response plan to the new attack. If the incident is a continuation of an existing attack, the Master Analysis agent passes the attack confidence metric and intrusion alarm to the existing Analysis agent handling the attack. The Analysis agent analyzes an incident until it is resolved and generates a course of action to resolve the incident. To generate this course of action, the Analysis agent invokes the Response Taxonomy agent to classify the attack and the Policy Specification agent to limit the response based on legal, ethical, institutional, or resource constraints. The Analysis agent also decomposes the abstract course of action into very specific actions and then invokes the appropriate components of the Response Toolkit. The Analysis agent employs adaptive decision-making based on the success of previous responses.
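As a rough illustration of the Interface agent's role, the sketch below normalizes an IDS-specific report into a common format and attaches a confidence value derived from the IDS's alarm history. The field names, the dictionary keys of the raw report, and the neutral 0.5 prior used before any history exists are assumptions made for this sketch, not details taken from AAIRS.

```python
from dataclasses import dataclass

@dataclass
class GenericAlert:
    """Common alert format emitted by an Interface agent (hypothetical fields)."""
    ids_name: str
    timestamp: float
    source_ip: str
    user_name: str
    attack_process: str
    confidence: float

@dataclass
class InterfaceAgent:
    """Wraps one IDS and tracks how trustworthy its alarms have been."""
    ids_name: str
    true_detections: int = 0   # alarms the administrator later confirmed as real attacks
    false_alarms: int = 0      # alarms the administrator later marked as false

    def confidence(self) -> float:
        """Fraction of this IDS's past alarms that turned out to be real attacks."""
        total = self.true_detections + self.false_alarms
        return self.true_detections / total if total else 0.5  # assumed neutral prior

    def normalize(self, raw: dict) -> GenericAlert:
        """Translate an IDS-specific report into the common message format."""
        return GenericAlert(
            ids_name=self.ids_name,
            timestamp=raw["time"],
            source_ip=raw["src_ip"],
            user_name=raw.get("user", "unknown"),
            attack_process=raw.get("process", "unknown"),
            confidence=self.confidence(),
        )
```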

B. Uncertainty in Classifying Attacks

The Master Analysis agent classifies events as either part of an ongoing attack or as a new attack. To make this determination, the Master Analysis agent maintains an event list history for each Analysis agent and uses three internal metrics: a time metric, a session identifier metric, and an attack type metric.

1) Event List History

The Master Analysis agent maintains an event history list for each Analysis agent. While it was initially envisioned that this functionality would be provided by each Analysis agent, it became apparent that the Master Analysis agent had to have that functionality to complete its task. As such, the Master Analysis agent adds and deletes events from the event lists. Events are added if the Master Analysis agent determines that the received report is a continuation of an ongoing attack. Events are removed when they are older than the incident longevity limit. The incident longevity limit is set by the system administrator and can be adjusted through the AAIRS GUI.

2) Time Metric

The time metric evaluates the amount of time between the last received incident report for each Analysis agent and the current report. Time is classified as short, medium, or long. If the difference between the two times is less than 10 minutes, the time metric is set to short. If the difference is more than 10 minutes but less than 60 minutes, the time metric is set to medium. If the difference is longer than 60 minutes, the time metric is set to long.

3) Session Identifier Metric

The session identifier metric looks at the IP address and user name to determine if the session information supports classifying the new report as either the continuation of an old attack or a new attack. The session identifier is classified as low, medium, or high and is a combination of the IP address and user name metrics. The IP address metric returns high if the IP address is the same. It returns medium if the IP addresses are different but part of the same subnet. It returns low if the IP addresses are from completely different networks. The user name metric returns high if the user name is the same on the two reports or low if the user name is different. The decision table for the session identifier metric is listed in Table 2.

Table 2: Session Identifier Decision Table

IP Address   User Name   Session Identifier
High         High        High
High         Low         Medium
Medium       High        High
Medium       Low         Medium
Low          High        Medium
Low          Low         Low
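To make these definitions concrete, the sketch below encodes the time thresholds and the Table 2 lookup as small functions. The function names, the use of seconds for timestamps, the treatment of the exact 10- and 60-minute boundaries, and the /24 interpretation of "same subnet" are assumptions made for this sketch.

```python
def time_metric(previous_time: float, current_time: float) -> str:
    """Classify the gap between two incident reports (timestamps in seconds)."""
    gap_minutes = (current_time - previous_time) / 60.0
    if gap_minutes < 10:
        return "short"
    if gap_minutes < 60:
        return "medium"
    return "long"

def ip_metric(previous_ip: str, current_ip: str) -> str:
    """Same address -> high, same (assumed /24) subnet -> medium, otherwise low."""
    if previous_ip == current_ip:
        return "high"
    if previous_ip.rsplit(".", 1)[0] == current_ip.rsplit(".", 1)[0]:
        return "medium"
    return "low"

def user_metric(previous_user: str, current_user: str) -> str:
    """Same user name on both reports -> high, otherwise low."""
    return "high" if previous_user == current_user else "low"

# Table 2: Session Identifier Decision Table, keyed by (IP metric, user metric).
SESSION_TABLE = {
    ("high", "high"): "high",     ("high", "low"): "medium",
    ("medium", "high"): "high",   ("medium", "low"): "medium",
    ("low", "high"): "medium",    ("low", "low"): "low",
}

def session_identifier(prev_ip: str, cur_ip: str, prev_user: str, cur_user: str) -> str:
    """Combine the IP address and user name metrics via Table 2."""
    return SESSION_TABLE[(ip_metric(prev_ip, cur_ip), user_metric(prev_user, cur_user))]
```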

4) Attack Type Metric

The attack type metric looks at the process initiating the attack and returns high if the attacking process is the same in the two incident reports or low if it is not. Using the event history and these metrics, the Master Analysis agent classifies the attack as either part of an ongoing attack or as a new attack. The decision table is listed in Table 3.

C. Uncertainty in Detection

Intrusion detection systems generate false alarms, and no current intrusion detection system is perfect. This introduces uncertainty into the formulation of a response. Is the system really under attack or is this a false alarm? The response system must temper the response according to the system's trust in the detection system. AAIRS addresses this uncertainty through an IDS confidence metric. The IDS confidence metric is the ratio of the number of attacks the IDS has previously detected to the total number of attacks. It provides a measurement of the false alarm rate of the IDS. Instead of believing all intrusion detection systems equally, those systems with a low false alarm rate are believed more than those with a high false alarm rate, and the response is tempered appropriately. Determining whether an alarm is a false alarm or an actual attack is beyond the scope of AAIRS or any automated system. As such, the system administrator must adjust the IDS confidence metric after reviewing each attack and determining if it was a false alarm or a real attack. Once set, AAIRS uses the confidence metric as a component in building a response plan.

Table 3: Master Analysis Agent Decision Table

Time     Session Identifier   Attack Type   Result
Short    High                 High          Same
Short    High                 Low           Same
Short    Medium               High          Same
Short    Medium               Low           Same
Short    Low                  High          Same
Short    Low                  Low           Different
Medium   High                 High          Same
Medium   High                 Low           Same
Medium   Medium               High          Same
Medium   Medium               Low           Same
Medium   Low                  High          Same
Medium   Low                  Low           Different
Long     High                 High          Same
Long     High                 Low           Different
Long     Medium               High          Same
Long     Medium               Low           Different
Long     Low                  High          Different
Long     Low                  Low           Different
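Assuming the metric functions sketched earlier, the Table 3 lookup that decides whether a report continues an existing incident can be written as a simple dictionary. The names are illustrative; the real Master Analysis agent also consults the per-agent event history list.

```python
# Table 3: Master Analysis Agent Decision Table,
# keyed by (time metric, session identifier, attack type metric).
MASTER_TABLE = {
    ("short",  "high",   "high"): "same",      ("short",  "high",   "low"): "same",
    ("short",  "medium", "high"): "same",      ("short",  "medium", "low"): "same",
    ("short",  "low",    "high"): "same",      ("short",  "low",    "low"): "different",
    ("medium", "high",   "high"): "same",      ("medium", "high",   "low"): "same",
    ("medium", "medium", "high"): "same",      ("medium", "medium", "low"): "same",
    ("medium", "low",    "high"): "same",      ("medium", "low",    "low"): "different",
    ("long",   "high",   "high"): "same",      ("long",   "high",   "low"): "different",
    ("long",   "medium", "high"): "same",      ("long",   "medium", "low"): "different",
    ("long",   "low",    "high"): "different", ("long",   "low",    "low"): "different",
}

def classify_report(time_m: str, session_m: str, attack_m: str) -> str:
    """Return 'same' (continuation of an ongoing attack) or 'different' (new attack)."""
    return MASTER_TABLE[(time_m, session_m, attack_m)]
```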

D. Uncertainty in Response

The response system itself also generates uncertainty that must be addressed. The success or failure of a response is often not clear. Did the response stop the attack or is the attacker continuing the attack? The response system must adapt over time and measure the success of its responses so as to limit uncertainty. To address this uncertainty, AAIRS builds a response plan and adapts that plan as additional incident reports are received.

1) Plan Generation

The Analysis agent provides long-term analysis of an incident and determines a plan to respond to an intrusion. This plan consists of a response goal, two or more plan steps, and associated tactics and implementations for accomplishing the plan steps. The response goal is specified by the system administrator and provides a general response approach. Examples of response goals include: catch the attacker, analyze the attack, mask the attack from users, sustain service, maximize data integrity, maximize data confidentiality, or minimize cost. Plan steps are techniques for accomplishing a response goal. Examples of plan steps include: gather evidence, preserve evidence, communicate with the attacker, slow the attack, identify compromised files, notify the system administrator, or counterattack the attacking system. Tactics are methods to carry out a plan step. For example, given a plan step of gather evidence, there are a variety of tactics for accomplishing it, such as enabling additional logging, enabling remote logging, enabling logging to an unchangeable medium, enabling process accounting, tracing the connection, communicating with the attacker, or enabling additional IDSs. The tactics can be further decomposed into a number of implementations that are environment dependent. As an example, consider a subnet consisting of the machines Limbo, Saint Peter, and Heaven. If Saint Peter is attacked, the tactic of remote logging could be implemented by logging to computer system Limbo, to Heaven, or to both.

The Analysis agent determines which plan steps, tactics, and implementations are appropriate. Each plan step, tactic, and implementation has a success factor. The success factor is the ratio of the number of times the plan step, tactic, or implementation has been successfully deployed to the number of times it has been deployed. This weights plan components so that those plan steps, tactics, and implementations that have been most successful are used more frequently than those that have been less successful. Plan generation consists of the following steps:

- Applying policy constraints: removes any plan steps, tactics, and implementations constrained due to environmental, institutional, or legal constraints.

- Setting response taxonomy weights: weights the remaining plan steps, tactics, and implementations according to their suitability to the situation. The Response Taxonomy agent uses a variety of factors, including the time of attack, type of attack, type of attacker, strength of suspicion, and implications of the attack, to make this determination.

- Determining system response goal weights: weights the plan steps, tactics, and implementations according to their support for the system's response goal.

- Building a tentative plan: builds an initial plan that employs the weights and constraints generated in the previous steps and contains all of the plan steps, tactics, and implementations that are viable. For each plan step, tactic, and implementation, the following calculation is performed: TentPlan[i] = ((ResponseTaxonomyWeight[i] + SystemResponseWeight[i]) / 2) * SuccessFactor[i]. This generates a number between 0 and 1 for each plan step, tactic, and implementation. A random number is then rolled for each plan step, tactic, and implementation, and if it is lower than TentPlan[i], the plan step, tactic, or implementation is considered viable for the situation (see the sketch after this list).

- Building a final plan: cleans up the tentative plan to enforce relationships between plan steps, tactics, and implementations and applies additional rules so that the final plan is feasible.

The result is a viable, feasible, nondeterministic response plan.
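A minimal sketch of the tentative-plan roll described above, assuming a simple container for plan components and 0-to-1 weight ranges; the class and function names are hypothetical, and the final-plan cleanup pass is not shown.

```python
import random
from dataclasses import dataclass

@dataclass
class PlanComponent:
    """A plan step, tactic, or implementation together with its bookkeeping."""
    name: str
    taxonomy_weight: float   # suitability assigned by the Response Taxonomy agent (0..1)
    goal_weight: float       # support for the system response goal (0..1)
    successes: int = 0       # times this component was deployed successfully
    deployments: int = 0     # times this component was deployed at all

    @property
    def success_factor(self) -> float:
        # Ratio of successful deployments; assumed to start at 1.0 before any history exists.
        return self.successes / self.deployments if self.deployments else 1.0

def build_tentative_plan(components, forbidden_names):
    """Keep a component when a random roll falls below its combined weight."""
    viable = []
    for c in components:
        if c.name in forbidden_names:     # policy constraints remove components outright
            continue
        tent = ((c.taxonomy_weight + c.goal_weight) / 2.0) * c.success_factor
        if random.random() < tent:        # nondeterministic selection
            viable.append(c)
    return viable
```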

2) Implementation Success or Failure

The Analysis agent and Response Toolkit are responsible for limiting response uncertainty due to the success or failure of implementations. As AAIRS receives incident reports, the Analysis agent invokes the Response Toolkit to check the success of the implementations being deployed. Each implementation has different success metrics that are implementation specific. For example, the success metrics associated with the implementation "unchangeable logging to write-once CD-ROM" are quite different from the success metrics for the implementation "counterattack using the Tribe Flood denial of service attack." If there is a failure in the existing plan, then the Analysis agent attempts to adapt the plan.

3) Plan Adaptation

Plan adaptation starts at the implementation level and works up to the plan step level. Each failed implementation is checked to determine whether there is an alternate implementation that has not previously failed and is not already in the plan. If an alternate implementation is already in the plan, the failed implementation is simply removed from the plan. If there is a viable alternate implementation, it is added to the plan and the failed implementation is removed from the plan. If there is no viable alternative, the failed implementation and its tactic are removed from the plan. If there is a failure at the tactics level, each of the plan steps is checked to ensure that it is still viable. If there has been a failure at the plan step level, a significant change has taken place and the Analysis agent attempts to substitute an appropriate tactic for the failed tactic. If all other tactics have failed, the Analysis agent instructs the Response Toolkit to shut down the host until the system administrator can take an active role in the defense of the system.
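A condensed sketch of the implementation-level adaptation rule just described; the plan representation (lists of implementation names) is hypothetical, and the tactic- and plan-step-level handling would follow the same pattern.

```python
def adapt_implementations(plan, failed, alternatives, previously_failed):
    """Apply the implementation-level adaptation rules.

    plan:              implementation names currently in the plan
    failed:            the implementation reported as failed
    alternatives:      other implementations that realize the same tactic
    previously_failed: implementations that have already failed before
    Returns the updated plan and whether the tactic remains viable.
    """
    plan = [impl for impl in plan if impl != failed]   # the failed implementation is removed

    # If an alternate implementation is already in the plan, removal is enough.
    if any(alt in plan for alt in alternatives):
        return plan, True

    # Otherwise add a viable alternative that has not previously failed.
    for alt in alternatives:
        if alt not in previously_failed and alt not in plan:
            plan.append(alt)
            return plan, True

    # No viable alternative: the failed implementation's tactic must also be removed.
    return plan, False
```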

In summary, the Analysis agent nondeterministically generates a feasible, viable response plan. It monitors the current plan and, with the Response Toolkit, detects intrusion response failures. These failures lead to plan adaptation that limits uncertainty in response.

IV. CONCLUSIONS

This paper has discussed techniques for limiting uncertainty in intrusion response, and specifically the techniques employed by AAIRS. There is uncertainty in automatic intrusion response due to imperfect intrusion detection systems, imperfect intrusion responses, and clever attackers who can defeat most static defenses. Automatic adaptation to a rapidly changing situation is key to limiting this uncertainty and effectively responding to an intrusion.

V. REFERENCES

[1] CERT Coordination Center, "CERT/CC Statistics for 1988 through 1998," available at http://www.cert.org/stats/cert_stats.html, January 2000.
[2] CERT Coordination Center, "CERT Coordination Center 1998," available at http://www.cert.org/annual_rpts/cert_rpt_98.html, January 2000.
[3] CERT Coordination Center, "Results of the Distributed-Systems Intruder Tools Workshop," available at http://www.cert.org/reports/dsit_workshop.pdf, March 2, 2001.
[4] F. B. Cohen, "Simulating Cyber Attacks, Defenses, and Consequences," available at http://all.net/journal/ntb/simulate/simulate.html, May 13, 1999.
[5] E. A. Fisch, "Intrusion Damage Control and Assessment: A Taxonomy and Implementation of Automated Responses to Intrusive Behavior," Ph.D. dissertation, Department of Computer Science, Texas A&M University, College Station, TX, 1996.
[6] G. B. White, E. A. Fisch, and U. W. Pooch, "Cooperating Security Managers: A Peer-based Intrusion Detection System," IEEE Network, vol. 10, no. 1, pp. 20-23, January/February 1996.
[7] P. A. Porras and P. G. Neumann, "EMERALD: Event Monitoring Enabling Responses to Anomalous Live Disturbances," in Proc. 20th National Information Systems Security Conference, Baltimore, MD, October 7-10, 1997, pp. 353-365.
[8] P. G. Neumann and P. A. Porras, "Experience with EMERALD to Date," in Proc. 1st USENIX Workshop on Intrusion Detection and Network Monitoring, Santa Clara, CA, April 11-12, 1999, available at http://www2.csl.sri.com/emerald/downloads.html.
