An Efficient Forensic Evidence Collection Scheme of Host Infringement at the Occurrence Time

Yoon-Ho Choi1, Jong-Ho Park1, Sang-Kon Kim1, Seung-Woo Seo1, Yu Kang2, Jin-Gi Choe2, Ho-Kun Moon2, and Myung-Soo Rhee2

1 School of Electrical and Computer Engineering, Seoul National University, Seoul, Korea, 151-744
2 KT Information Security Center, Seoul, Korea
Abstract. Computer Forensics is a research area that identifies malicious users by collecting and analyzing the evidence of intrusion or infringement in computer crimes such as hacking. Much research on Computer Forensics has been done so far, but it has focused on how to collect the forensic evidence for both analysis and proofs after intrusion or infringement reports on hosts have been received from computer users or network administrators. In this paper, we describe how to selectively collect forensic evidence of good quality from observable and protective hosts at the time an infringement by malicious users occurs. By periodically correlating the event logs of Intrusion Detection Systems (IDSes) and hosts with the configuration information of the hosts, we calculate a value of infringement severity that indicates the real infringement possibility of the hosts. Based on this severity value, we selectively collect the evidence for proofs at the time of infringement occurrence. As a result, we show that we can minimize the damage to the evidence for both analysis and proofs and reduce the amount of data used to analyze the degree of infringement severity.
1 Introduction
With the advent of transactions over the Internet, infringement of personal information and information leakage causing serious damage have been reported. However, the evidence for both analysis and proofs can be modified by malicious or naive behavior, so it is not easy to investigate these crimes effectively when they happen. Therefore, Computer Forensics, or simply Forensics, has become an important security area that covers the collection of undamaged forensic evidence, its analysis, the inference of the behavior, and the trace-back of the malicious user. In traditional Forensics, the gathering, saving and analysis of the evidence are done after an infringement such as system hacking has occurred, as shown in Fig. 1-(a). That is, based on the time of the infringement report
This research was supported by the University IT Research Center Project.
M.S. Rhee and B. Lee (Eds.): ICISC 2006, LNCS 4296, pp. 206–221, 2006. © Springer-Verlag Berlin Heidelberg 2006
[Figure 1: two timelines of log generation over time t. (a) Infringement (analysis) data is gathered and analyzed at the time the user or administrator reports. (b) Collection and analysis (col./anal.) are performed at the time of infringement occurrence.]

Fig. 1. Comparison of infringement report time: (a) previous (gathering), (b) proposed (collection), where col. means 'collection' and anal. means 'analysis'
from users or administrators, all the evidence for analysis and proofs has been gathered from the damaged system and analyzed in its entirety. However, the amount of data to analyze may become too large, because every set of gathered evidence data has to be investigated. Also, the evidence for analysis and proofs may have been damaged by an attacker or changed by the activity of a normal user. In particular, when an intentional infringement against the system happens, the damage to the evidence for analysis and proofs becomes more serious.

1.1 Previous Approaches for Correlation Analysis
The previous approaches based on correlation analysis aim at low False Positive (FP) and low False Negative (FN) rates, and at log reduction by combining a refining algorithm with a fast search algorithm. Correlation approaches are classified into those that do not require specific knowledge and those that rely on it. The proposed approach belongs to the latter category. Several knowledge-based approaches have been proposed: the Advanced Security Audit-trail Analysis on UniX (ASAX) [AS1][AS2], based on rules of the form (prerequisite, actions); JIGSAW [SK1], based on attack scenarios; Chronicles [BH1], based on temporal reasoning; and others such as the EMERALD Mission-Impact Intrusion Report Correlation System (M-Correlator) [Em1] and M2D2 [BM1].

ASAX aims at supporting intelligent analysis of audit trails. It carries out its analysis with a general rule-based language named RUSSEL (Rule-baSed Sequence Evaluation Language), which aims at recognizing particular patterns in files and triggering appropriate actions. The audit trail is analyzed sequentially, record by record, by means of a collection of rules. The active rules encapsulate all the relevant knowledge about the past of the analysis and are applied to the current record by executing them for that record. Executing them generates new rules, and the process is initiated by a set of rules activated for the first record. ASAX has some limitations in that there is no real rules database and the data types available within RUSSEL are limited.

JIGSAW is based on the preconditions and consequences of individual attacks. It correlates alerts if the preconditions of some later alerts are satisfied by
the consequences of some earlier alerts. However, it does not correlate an alert that does not prepare for other alerts. For example, if an attacker tries several variations of the same attack in a short period of time, JIGSAW will treat them separately and only correlate those that prepare for other alerts.

Chronicles aims at providing an interpretation of the system evolution given dated events. It is a time series reasoning system that relies on the reified temporal logic formalism. It predicts forthcoming events relevant to its task, focuses its attention on them, and maintains their temporal windows of relevance. It is efficient in recognizing stereotyped attack scenarios such as the ones launched by automatic intrusion tools.

EMERALD M-Correlator was designed to consolidate and rank a stream of alerts relative to the needs of the analyst, given the topology and operational objectives of the protected network. It uses a relevance score produced by comparing the alert target's known model against the known vulnerability requirements of the incident type.

M2D2 is a formal information model for security information representation and correlation that includes four types of information: information system characteristics, vulnerabilities, security tools, and events/alerts. M2D2 reduces and conceptually interprets multiple alarms, i.e., it models an alert aggregation method by utilizing the relations between vulnerabilities and topology, between topology and security tools, and between security tools and vulnerabilities.

However, these approaches focus on the efficient detection of intrusion attempts.

1.2 Proposed Approach
Before we describe the proposed approach, we first define two terms that are used throughout this paper.
– Gathering means that we collect the evidence entirely for the investigation of host infringement at the report time from administrators or users;
– Collection means that we collect the evidence selectively for the investigation of host infringement at the occurrence time.
As shown in Fig. 1-(b), unlike the previous approaches that gather the evidence for analysis and proofs entirely as in Fig. 1-(a), the proposed approach focuses on detecting real infringement against the observable and protective hosts and on collecting the forensic evidence at the time an intentional infringement against them occurs. Noting that an infringement against a host usually takes the form of a multi-step (or multi-stage) attack, we describe how to collect the evidence for both analysis and proofs from the early stages of infringement, such as Host Scanning (HS), Port Scanning (PS) and Vulnerability Exploit (VE), to the final stage, such as Distributed Denial of Service (DDoS). After classifying intrusions into intrusion attempts and infringements, where an infringement means real damage to the system, we calculate the value of infringement severity, which represents the real infringement possibility of the hosts in a multi-step attack. Based on the value of infringement severity of the system for each attack step, we determine the time instant for the forensic
evidence collection. To minimize the possibility of a wrong decision on the time of the forensic evidence collection, we correlate the event logs of an Intrusion Detection System (IDS) such as SNORT [Sn1] with the event logs and the security configuration of the host while calculating the value of infringement severity. We store the evidence for both analysis and proofs at each step of a multi-step attack as status information for each stage. We summarize the contributions of this paper in Table 1 by comparing them with the previous approaches.

This paper is organized as follows. In Section 2, we describe the proposed approach, which calculates the value of infringement severity of the hosts and collects the forensic evidence from the hosts based on that value. We describe the verification result of the proposed approach in Section 3. In Section 4, we summarize the paper.

Table 1. Comparison with the previous log correlation and analysis methods

Target
  Previous approaches: detection of intrusion attempt; forensic evidence gathering at the time of the administrator (user) report
  Proposed approach: detection of host infringement; forensic evidence collection at the time of infringement occurrence
Analysis method
  Previous approaches: correlation analysis among host event logs; correlation analysis among IDS event logs
  Proposed approach: correlation analysis among event logs of both IDS and host; calculation of the infringement severity of hosts
Effect
  Previous approaches: accurate intrusion attempt detection, i.e., low FP and FN
  Proposed approach: measurement of the degree of real infringement of hosts; low loss of the forensic evidence; reduction of evidence for analysis
Limitation
  Previous approaches: loss of evidence for both analysis and proofs; dummy analysis for intrusion attempts that are not infringements; no way to collect any volatile data
  Proposed approach: no way to collect some volatile data
2 Infringement Decision and Forensic Evidence Collection
We now describe how to calculate the value of infringement severity of the hosts for each attack step and which information should be collected. We assume the general multi-step attack shown in Fig. 2, for example a DDoS attack whose final step is a Denial of Service (DoS).

2.1 Overall Description of the Analysis Objects and Its Procedure
As shown in Fig. 3, the evidence for analysis is collected at a collaborative analysis server, which monitors the IDSes and the observable and protective hosts. The
[Figure 2: attack flow diagram with the nodes Client, host scan, port scan, VE, Install, Infection, and DoS attack.]

Fig. 2. Infringement represented by stages and the possible attack flow, where the solid line implies the beginning stage of infringement; the dotted line implies the infringement path selection on networks; the dashed line implies a final attack; Client is an attacker; DoS is assumed to be an example of a multi-step attack. For example, Host scan → Port scan → NETBIOS SMB NT Trans NT CREATE oversized Security Descriptor attempt (the solid line up to this part) → Trin00 daemon install (the dotted line) → DoS attack (the dashed line).
Table 2. Definition of the parameters and terms of the proposed approach

Parameter (term) : Description
state(s, IP) : State for infringement decision at stage s for the host with IP address IP, where {the evidence for analysis at stage s, the forensic evidence for stage s, infringed host IP(port), attacker IP(port), patterns for event analysis}, s = 1∼5 or HS(:1), PS(:2), VE(:3), D(:4), DoS(:5)
state(s, ss, IP) : Sub-state ss for infringement decision at stage s of host IP when there is different evidence for analysis that shows the same symptom of infringement at stage s, where same as state(s, IP)
v(s, IP) : The value of infringement severity of host IP at stage s
v(s, ss, IP) : The value of infringement severity of host IP for sub-state ss at stage s
pre-condition : Conditions under which the host infringement succeeds
post-condition : Evidence for analysis of the infringement
c_k(s) : kth condition for the infringement decision at stage s, where (pre-condition, post-condition)
c_m(s, ss) : mth condition for the analysis of sub-state ss at stage s, where (pre-condition, post-condition) and s, m = {1, 2, ...}
C(s) : The set of pre-conditions at stage s composed of c_k(s) and c_m(s, ss), where {c_k(s), c_m(s, ss)}
|C(s)| : Number of elements in C(s)
wt(s) : Weighting factor for stage s, which is tunable
wt(s, k) : Weighting factor for the kth condition at stage s, which is tunable
wt(s, ss, m) : Weighting factor for the mth condition of a sub-state at stage s, which is tunable
LoN : Number of files for analysis of infringement
LiN : Number of lines in each log file
S(s) : State information at stage s for all the hosts, where {analysis time, v(s, IP), total value of infringement severity at stage s for all the hosts}
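As a reading aid, the state(s, IP) entry of Table 2 can be pictured as a small record type. The following is only a minimal sketch under stated assumptions: the class names `Stage` and `State` and all field names are hypothetical and do not come from the paper, and the evidence fields are simplified to lists of strings.

```python
from dataclasses import dataclass, field
from enum import IntEnum
from typing import List, Optional


class Stage(IntEnum):
    """Stages s = 1..5 with the abbreviations used in Table 2."""
    HS = 1   # host scanning
    PS = 2   # port scanning
    VE = 3   # vulnerability exploit
    D = 4    # stage "D"; Table 2 gives only the abbreviation
    DOS = 5  # denial of service


@dataclass
class State:
    """Hypothetical container for state(s, IP) or state(s, ss, IP)."""
    stage: Stage
    host_ip: str          # infringed host IP
    host_port: int        # infringed host port
    attacker_ip: str
    attacker_port: int
    analysis_evidence: List[str] = field(default_factory=list)
    forensic_evidence: List[str] = field(default_factory=list)
    patterns: List[str] = field(default_factory=list)  # patterns for event analysis
    sub_state: Optional[int] = None  # ss; None means a plain state(s, IP)


# Example with made-up addresses from documentation ranges.
example = State(Stage.PS, "10.0.0.5", 445, "192.0.2.7", 40112)
example.analysis_evidence.append("IDS alert: port scan")
```

Keeping the sub-state index as an optional field reflects the table's remark that state(s, ss, IP) carries the same contents as state(s, IP).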
[Figure 3: block diagram. IDS1 ... IDSm and Host1 ... Hostn feed the evidence collection for infringement analysis; the decisions "Intrusion attempt" and "Infringement" lead to forensic evidence collection, analysis and inference of forensic evidence, and traceback for the attacker.]

Fig. 3. Diagram of the proposed approach for infringement decision and forensic evidence collection
proposed approach analyzes the event logs of the IDSes to detect intrusion attempts. The event logs and configuration information of the hosts are used to analyze the degree of infringement severity. From the beginning stage to the end stage of a multi-step attack as shown in Fig. 2, the approach identifies the infringement at stage s of the hosts and stores the evidence for analysis as the state information. The state information includes the following: who performed the malicious behavior such as hacking (attacker IP(port)), what was done (the forensic evidence at stage s), how it was done (the evidence for analysis), where it was done (infringed host IP(port)), and when it was done (time information of the log). Then, if the value of infringement severity exceeds a criterion for forensic evidence collection at a stage, the server requests the hosts to send the forensic evidence selectively and stores it in the data storage.

2.2 Calculation of the Value of Infringement Severity and Collection of the Forensic Evidence
Based on the parameters and terms defined in Table 2, we now describe how to calculate the value of infringement severity of the hosts and how to collect the forensic evidence. We assume that the timers of the IDSes and hosts are synchronized and that the time stamps of the logs from the IDSes and hosts are reliable.
– Step 1. Preprocessing. With time period T, we periodically collect the state information for analysis, i.e., state(s, IP) and sub-state(s, ss, IP), from the IDSes and the hosts, and classify it into the corresponding information for each stage.
– Step 2. Calculation of the degree of infringement severity. Considering the time sequence of the evidence for analysis and the state information at each stage, the Severity Management Server (SMS) calculates the degree of infringement severity of each host at each stage.
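The preprocessing of Step 1 and the collection criterion described in Section 2.1 can be pictured with the following minimal sketch. It is not the authors' implementation: the record layout (a dictionary with the keys `time`, `stage`, `ip` and `msg`), the function names, and the per-stage `criterion` mapping are all assumptions made for illustration. The severity values v(s, IP) are taken as given inputs; how they are calculated is not sketched here.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Set, Tuple

# Hypothetical record layout: {"time": float, "stage": int, "ip": str, "msg": str}
Record = dict
Key = Tuple[int, str]  # (stage s, host IP)


def preprocess(records: Iterable[Record], t_start: float,
               period: float) -> Dict[Key, List[Record]]:
    """Step 1 sketch: classify the records of one period T by (stage, host IP)."""
    buckets: Dict[Key, List[Record]] = defaultdict(list)
    for rec in records:
        # Keep only records that fall inside the current collection period.
        if t_start <= rec["time"] < t_start + period:
            buckets[(rec["stage"], rec["ip"])].append(rec)
    # Order each bucket by time, since Step 2 considers the time sequence.
    for recs in buckets.values():
        recs.sort(key=lambda r: r["time"])
    return dict(buckets)


def hosts_to_collect(severity: Dict[Key, float],
                     criterion: Dict[int, float]) -> Set[Key]:
    """Select the (stage, IP) pairs whose v(s, IP) exceeds the stage criterion."""
    return {(s, ip) for (s, ip), v in severity.items() if v > criterion[s]}
```

A server loop would call `preprocess` once per period and then request forensic evidence only from the hosts returned by `hosts_to_collect`, which is what makes the collection selective rather than exhaustive.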
[Flowchart: Start; inputs C(s), state(s, IP); decision "Is the state information given from IP new?" (YES/NO); "Update the time stamp of the overlapped logs"; s=1, f=1, l=1; ...]