A Study on Efficient Log Visualization Using D3 Component Against APT
How to visualize security logs efficiently?
Jaehee Lee
Jinhyeok Jeon
Department of Information Security Korea University Seoul, Korea
[email protected]
Department of Information Security Sungkyunkwan University Seoul, Korea
[email protected]
Changyeob Lee
Junbeom Lee
Department of Computer Science and Engineering Sogang University Seoul, Korea
[email protected]
Department of Information and Telecommunications Engineering University of Suwon Seoul, Korea
[email protected]
Jaebin Cho
Kyungho Lee
Department of Convergence Security Kyonggi University Seoul, Korea
[email protected]
Department of Information Security Korea University Seoul, Korea
[email protected]
Abstract— APT attacks have caused chaos in society since 2006. In Korea in particular, the rapid development of IT infrastructure has left many infrastructure vulnerabilities exposed to the outside. In addition, APT attacks targeting companies' major confidential information are increasing every year. An APT attack causes negative publicity and financial damage for the victim company. APT is completely different from the problems that most organizations have dealt with so far. In the past, cyber-attack threats were visible, but current APT attacks are invisible and focus on confidential data. Therefore, we need a new approach to this problem: we have to find traces of an attack in circumstances where everything seems normal. If we perform a correlation analysis of the logs acquired from all devices, systems and applications, we can more easily understand the problems that occur in our information systems. Current commercial SIEM products can perform correlation analysis and visualize logs, but security officers need a lot of time to understand the visualized security logs. Moreover, because SIEM solutions are expensive, small companies have difficulty introducing them. For these reasons, we have developed a SIEM solution based on open-source programs such as D3 components, which reduces its cost. In addition, we analyzed the D3 components that can visualize security logs and matched them with the security logs. In this paper, we propose visualization methods that use D3 components to analyze security logs efficiently.
Keywords—SIEM, D3 component, Log visualization, Log correlation analysis, Bigdata Visualization, APT
I. INTRODUCTION
Research on visualizing text-based logs has been conducted continuously. Because the volume and variety of logs have increased rapidly with the development of IT, visualizing logs is essential for efficient analysis. This is equally important in the information security field. Owing to the diversification of security equipment and the development of mass storage, there is a limit to how many security logs limited human resources can analyze. Therefore, SIEM solutions that analyze and visualize various security logs have been developed. A SIEM (Security Information and Event Management) solution is an effective defense for detecting APT attacks. An APT (Advanced Persistent Threat) is a set of stealthy and continuous computer hacking processes, often orchestrated by humans targeting a specific entity [1]. Mandiant gave the name APT1 to a Chinese hacking group that tried to leak United States defense technology [2]. Since then, APT attacks on global companies have increased. In December 2014, KHNP (Korea Hydro & Nuclear Power Co. Ltd), which manages nuclear power plants, suffered an incident in which valuable data was leaked. The cyber terrorist threatened KHNP through Twitter and disclosed the data. Citizens in Korea expressed their frustration that the information assurance of nuclear power plants had been violated, and public interest increased as a result. Most victim companies were not aware of the APT attack by themselves. According to the M-Trends 2015 report, 69 percent of the victims were notified by an external entity, and only 31 percent discovered the breach internally [3]. This means that it can take six or seven months to become aware of the severity of the problem. Thus, we need to concentrate on detecting APT attacks more precisely. If we detect an APT attack within two weeks, we can prevent much of the financial damage to the company and the leakage of critical information. Because the SIEM solution we developed is based on open-source programs, it is cost-effective and consumes few resources. In addition, it can analyze an APT attack efficiently in phases, because we apply various visualization functions from D3 components. We developed this SIEM solution so that small enterprises and research groups can analyze security logs without worrying about cost. We will upload the solution to SourceForge, where users can use it for free. By providing it for free, we expect that small enterprises can improve their security capability and that research groups can use it to study cyber threats.
This research is sponsored by Best of the Best education program in KITRI.
978-1-4673-8685-2/16/$31.00 ©2016 IEEE
II. RELATED WORK
We surveyed recent work on APT attacks in order to analyze and detect them, and we also investigated research on security-log visualization.
A. Research on APT attack process
APT penetration typically follows several attacks conducted over many years; even a short step usually takes about three months and is precisely planned. According to a Mandiant report, it takes 205 days from the first evidence of an infringement until the security administrator finds the infringement, and the longest period was about three thousand days [4]. The definition of the APT attack steps differs slightly between research papers. First, a researcher in Korea defined a wide range of the APT attack process [5], dividing it into four steps: incursion, discovery, capture, and exfiltration. Symantec research also divides the process into the same four steps [6]. A Mandiant research paper in 2015 [7] divides the APT attack process into seven steps and explains each step in detail. De Vries divided the APT attack process into eight steps [8]. We summarize the APT attack process from the research mentioned above. First, in the penetration stage, the hacker conducts reconnaissance to find out the company's information, personnel, organization, etc. as a feasibility study. Using this information, the attacker exploits a vulnerability or sends phishing mail to employees to induce them to access a malicious webpage. Through these methods, the attacker intrudes into the network. In the discovery step, the attacker investigates the local network to escalate access rights, because the targeted asset cannot be accessed from a staff member's PC. In the capture step, the attacker locates the targeted asset and investigates the system that contains it. In this step, the attacker injects malware that infects the targeted system. In the exfiltration step, the attacker sends the information to a C&C server and erases the logs. In addition, the attacker installs a backdoor for persistent access [1].
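To make the four-stage summary above concrete, the following minimal sketch encodes the stages as a small data structure. Only the four stage names follow the text; the event labels (for example `phishing_mail` or `c2_transfer`) are hypothetical names chosen for illustration and do not come from the surveyed papers or from any product.

```python
from enum import Enum


class AptStage(Enum):
    """Four-stage APT model summarized in the text above."""
    INCURSION = 1
    DISCOVERY = 2
    CAPTURE = 3
    EXFILTRATION = 4


# Hypothetical event labels; a real deployment would derive
# these from its own log sources.
STAGE_INDICATORS = {
    AptStage.INCURSION: {"reconnaissance", "phishing_mail", "exploit_attempt"},
    AptStage.DISCOVERY: {"internal_scan", "privilege_escalation"},
    AptStage.CAPTURE: {"asset_lookup", "malware_injection"},
    AptStage.EXFILTRATION: {"c2_transfer", "log_erasure", "backdoor_install"},
}


def stage_of(event_label):
    """Return the APT stage an event label belongs to, or None if unknown."""
    for stage, labels in STAGE_INDICATORS.items():
        if event_label in labels:
            return stage
    return None


def stages_seen(event_labels):
    """List the distinct stages observed, sorted in attack order."""
    found = {stage_of(label) for label in event_labels}
    found.discard(None)
    return sorted(found, key=lambda stage: stage.value)
```

Calling `stages_seen` on the labels collected for one host gives a rough view of how far along the four-stage process the observed activity reaches.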
B. Research on detecting APT attack
Recently, research on symptom detection through correlation analysis of security logs has been in progress. In this approach, the system collects logs from several pieces of security equipment and conducts a cross-correlation analysis [9]. For this analysis, researchers use attack graphs and attack trees and build several detection scenarios [10], [11], [12], [13], [14]. In particular, B. Schneier presented the attack tree in 1999 [13]. More recently, some researchers proposed the Attack-Defense Tree method, which considers the defense system as well as the attack [14]. Furthermore, Dain et al. proposed a detection method using heuristics and data mining in 2001 [15], and Wang et al. proposed a detection method using a clustering algorithm in 2005 [16]. However, previous research concentrates on theory. Therefore, research is needed on using security logs efficiently on a big data platform in industry. As mentioned earlier, correlation analysis of several security logs is needed to detect and analyze a complicated APT attack that lasts six months or more. To address this, Han et al. proposed an implementation model for a security log analysis system using a Big Data platform [17]. However, empirical research on systematic security log analysis and rules did not follow. In addition, the APT detection methods in this research were limited to basic network anomaly-detection algorithms, such as detecting suspected infected IPs and scanning attempts. Moreover, this research does not consider log visualization for the administrator. In this paper, we propose several models that visualize security logs for detecting APT attacks effectively.
C. Research on security-log visualization
We have studied security-log-based visualization tools such as SnortView, Tudumi and ELVIS.
SnortView is a visualization system for Snort logs, designed to help administrators judge false detections [18]. The best feature of SnortView is its real-time monitoring. Its problem is the amount of information displayed on the screen: currently, only four hours of logs are visualized. If the attack period is longer than four hours, it is difficult to check all the steps of the attack. Tudumi is a log visualization system [19] that monitors and audits computer logs. It combines three different logs (syslog, wtmp, sulog) and visualizes them in one image. "Syslog" contains logs about network access from other computers. "Wtmp" contains logs about the times when a user logs in to and out of the server. "Sulog" contains logs about user substitutions. Tudumi assists security administrators in monitoring and auditing the logs and makes detecting anomalous activities easier. However, it does not provide a real-time monitoring function. In contrast with Tudumi, our solution does not have audit functions but does have real-time monitoring functions. Like the authors of Tudumi, we agree that security tools are essential. However, it is not easy to use security tools in practice. Therefore, our solution can complement them.
Tools such as SnortView and Tudumi follow a different approach from our solution. SnortView handles and visualizes only Snort log data and has no way to handle huge amounts of log data. Tudumi integrates only three kinds of computer logs. In contrast, our solution is designed to provide relevant visualizations for APT attacks by using many types of security-related log files. New log formats can be added to our solution by using regular expressions. ELVIS is the visualization tool closest to our solution [20]. ELVIS, which stands for Extensible Log VISualization, is a security-oriented log visualization tool. With it, security administrators can easily check and analyze a large amount of log data. To analyze logs automatically, ELVIS defines information about the fields of the logs. As a result, the logs can be represented more concisely. It is also effective because the user can examine the part of interest in more detail. New types of logs can be added, and the information about their fields defined, by using regular expressions. ELVIS has two objectives and advantages. First, it makes it easier for security administrators to analyze large amounts of log data. Second, the security administrator needs to know only the format and type of a log, and a visualization can be produced immediately. On the other hand, there are some limitations. First, for further analysis, security administrators must examine logs one by one. Second, it is impossible to produce further representations because it uses only a summary of the log data. Third, the log data sets are independent of each other, so it is impossible to observe several logs in combination. Finally, it needs optimization: log analysis and correlation analysis take a lot of time. ELVIS is quite close to our solution in the sense that it accepts many types of data files and can generate many different representations. However, it does not clearly distinguish visually between normal and abnormal activities.
From our review of the above tools, we consider that a useful tool for resolving these problems is an interactive, dynamic, and user-friendly visualization that assists security administrators in monitoring a large number of logs. We believe that the dynamic and user-friendly representations built with D3.js components make our solution a real help for security administrators who observe security-oriented log data, because they greatly reduce the time needed to explore logs. Because it is security-oriented, our solution provides security experts with more relevant visualizations of APT attacks.
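As an illustration of how a regular expression can describe a new log format, as mentioned above for both ELVIS and our solution, consider the following minimal Python sketch. It is not the implementation of either tool: the format names, field names, and sample log lines are assumptions made up for this example.

```python
import re

# Hypothetical formats: the field names and sample lines are illustrative only.
LOG_FORMATS = {
    "firewall": re.compile(
        r"(?P<time>\S+) (?P<action>ALLOW|DENY) "
        r"(?P<src_ip>\d+\.\d+\.\d+\.\d+):(?P<src_port>\d+) -> "
        r"(?P<dst_ip>\d+\.\d+\.\d+\.\d+):(?P<dst_port>\d+)"
    ),
}


def register_format(name, pattern):
    """Add a new log format described by a regex with named groups."""
    LOG_FORMATS[name] = re.compile(pattern)


def parse_line(name, line):
    """Return the named fields of a log line, or None if it does not match."""
    match = LOG_FORMATS[name].search(line)
    return match.groupdict() if match else None


# Registering a second (also hypothetical) format needs only a pattern.
register_format(
    "ssh",
    r"(?P<time>\S+) login (?P<result>ok|failed) "
    r"user=(?P<user>\w+) from (?P<src_ip>\S+)",
)

fw_record = parse_line(
    "firewall", "2015-11-06T03:12:44 DENY 10.0.0.5:51234 -> 192.0.2.10:22"
)
ssh_record = parse_line(
    "ssh", "2015-11-07T01:00:00 login failed user=root from 203.0.113.7"
)
```

Because every format yields a dictionary of named fields, later analysis and visualization steps can work on fields such as `src_ip` without knowing which device produced the line.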
III. LOG ANALYSIS PLATFORM
A. System Architecture
The architecture of our SIEM solution is based on open-source programs. It collects various logs from security equipment with fluentd and analyzes them with a MySQL plugin. Finally, it visualizes them on a dashboard based on D3.js. The detailed architecture is shown in Fig. 1.
Figure 1. System Architecture
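The collect, analyze, visualize flow described above can be sketched as follows. This is only an illustration under stated assumptions, not the actual implementation: Python's built-in sqlite3 module stands in for the MySQL back end, the records are hypothetical and assumed to be already parsed (fluentd is not involved), and the table and field names are invented for the example. The JSON string at the end is the kind of payload a D3.js dashboard could load.

```python
import json
import sqlite3

# Hypothetical, already-parsed records: (timestamp, device, source IP, destination IP).
RECORDS = [
    ("2015-11-06T03:12:44", "firewall", "198.51.100.7", "10.0.0.20"),
    ("2015-11-06T03:12:45", "ids", "198.51.100.7", "10.0.0.20"),
    ("2015-11-06T03:12:47", "ids", "198.51.100.7", "10.0.0.20"),
    ("2015-11-06T09:30:00", "firewall", "203.0.113.9", "10.0.0.31"),
]


def build_dashboard_payload(records):
    """Store records, correlate them per source IP, and return JSON text."""
    conn = sqlite3.connect(":memory:")  # stand-in for the MySQL store
    conn.execute(
        "CREATE TABLE logs (ts TEXT, device TEXT, src_ip TEXT, dst_ip TEXT)"
    )
    conn.executemany("INSERT INTO logs VALUES (?, ?, ?, ?)", records)
    # Simple correlation: how many events, and on how many distinct
    # devices, each source IP produced.
    rows = conn.execute(
        "SELECT src_ip, COUNT(*) AS events, COUNT(DISTINCT device) AS devices "
        "FROM logs GROUP BY src_ip ORDER BY events DESC, src_ip"
    ).fetchall()
    conn.close()
    return json.dumps(
        [
            {"src_ip": ip, "events": events, "devices": devices}
            for ip, events, devices in rows
        ]
    )


payload = build_dashboard_payload(RECORDS)
```

Grouping by source IP across devices is one simple form of the log correlation discussed in this paper; a source that appears on several devices within a short time is a natural candidate for closer inspection on the dashboard.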
B. Detailed functions
1) Risk management monitoring
Figure 2. Log clustering graph
This D3 component makes risk management convenient by grouping suspects separately, in preparation against leaks of confidential documents and trade secrets. It is useful for detecting malicious behavior from the step of collecting information about the target to the step of extracting key enterprise assets in the APT attack process. It is very important to be able to detect the attacker at the first moment of occurrence. The special characteristic of this D3 component is that it makes it simple to classify and visualize risk management targets. According to recent trends, APT intruders usually attack from outside; however, many domestic and foreign cases show that insiders must also be considered a threat. Insider threats include prospective retirees, information leak suspects, key personnel and employees disciplined by HR. The biggest circle represents the risk management subjects. The inner circles classify these subjects into groups such as prospective retirees and HR-disciplined employees. Each group circle contains the individuals who belong to the group, and each individual's area contains the employee's activities, such as searching and writing documents, obtained by examining logs and network packets during office hours. The size of a circle represents the number of times an activity is performed and repeated. The circles can be examined conveniently with the zoom-in and zoom-out functions. In summary, managers can monitor malicious behavior by classifying threat targets and observing people's activity. Prospective retirees, often senior personnel, usually hold several privileged accounts in their company. They are considered top-priority targets for attackers, who use social engineering to win them over. Therefore, prospective retirees have to be managed and regarded as suspects who may proceed with the theft of major property or other confidential documents. If unusual activity occurs, such as transferring files through FTP from a prospective retiree's host to an external network, or VPN connections that do not follow a fixed schedule but suddenly grow to a large number, it is displayed as a caution or warning, so administrators can detect it and take warning and disciplinary measures against the actor.
If, upon detection of suspected malicious behavior, the system is not limited to detection and can cut off the host's network with a single click, it is expected to be significantly more efficient in blocking risk.
2) Regional traffic monitoring
Figure 3. GeoIP graph
From the administrator's viewpoint, this D3 component makes it possible to track the location of the source of IP traffic by regional group and to measure the amount of packet traffic. It is useful for detection from the penetration stage, with its attempts to acquire access rights and external reconnaissance, through the discovery, capture and exfiltration stages. In other words, it can be applied to the entire APT attack process. Intruders do not exist only inside the country; APT is a cyber-attack that threatens the whole world. According to the Advanced Threat Report 2013 published by FireEye, South Korea was the second most targeted location in the world for APT attacks in 2013 [7]. Thus, enterprises need to widen their detection range. In line with this trend, this D3 component is suited to detecting the location of the attack source. It can check in real time at scales ranging from the continental and national level down to small units such as cities and towns within the country. Beforehand, the average inbound and outbound packet volumes must be recorded and stored, classified by week, month and year. The administrator refers to this data to establish a warning standard. If outbound packets exceed the standard, the component can be set to show a visual change. For example, if outbound traffic larger than the monthly average is produced, it can be inferred to be a kind of APT activity, such as communication with a Command and Control server or use of a transmission path (VPN, FTP) that sends confidential documents and data to the attacker's home.
In this case, if the system can also send an alarm by message or e-mail to the manager in addition to the visual alarm, for times when the administrator is absent, it will work more effectively in the company.
3) Server based monitoring
Figure 4. Packet flow graph
The security log is important for measuring how often an event occurred on different devices, keyed on IP and port. In addition, the security log data can be analyzed by tracking an IP and checking whether the security devices detected it in the regular sequence. This D3 component expresses numerically, stage by stage, the IP packets that have passed through security equipment such as the firewall, IDS and IPS in the enterprise. After the penetration step of the APT attack process, the attacker starts full-fledged navigation to expand access and conduct internal reconnaissance. From the incursion stage, the first step of the APT attack process, to the discovery stage, the second step, malicious behavior in this period can easily be detected visually. There are several elements in this D3 component. The innermost circle symbolizes the outermost security equipment of the enterprise, such as the firewall; inbound packets usually pass through this device first. The following rings represent the order in which a packet passes through the security equipment: Firewall – IDS – IPS – etc. – End node (Host). The outermost thin ring is often defined as a personnel PC, but we assume that it is the terminal where intellectual property and assets are located. Fundamentally, inbound packets pass through several security devices to reach the last node. In the case shown above, 0.113% of the inbound packet traffic reached the target terminal. However, unlike the normal case, in which packets pass through the gateways in the order Firewall – IDS – IPS – etc. – End node, the graph also shows the proportion of packets that skip this process and reach the End node directly from the Firewall. If the log information collected at the firewall and at the End node is considered separately, such a packet looks like a plain packet, no different from any other. However, once a direction from Firewall to End node is attached to it, it takes on a different meaning. It suggests a likely act of penetrating straight to where the intruder wants to go, bypassing the detection of the security equipment in between. From this control position, the manager can recognize immediately that this is malicious behavior.
4) Malicious acts monitoring
Figure 5. Multi-level monitoring graph
This D3 component makes it easy to see the packets inside and outside the enterprise network. From the manager's viewpoint, it helps detection from incursion, the first step of the APT attack process, to the capture stage. To use this D3 component, several parts must be defined beforehand. To measure the severity of malicious behavior, risk levels such as safe, low, moderate, high and unacceptable must be defined by scoring the degree of damage and the possibility of leakage of intellectual property and assets. Each risk level must also be matched with a color. For example, unacceptable risk, shown in red, means that the image of the enterprise falls because of leakage, alteration or destruction of confidential information, or denial of service. High risk, shown in orange, is the situation in which an information spill has already occurred and is likely to develop into a large-scale disaster for the enterprise, and so on. In this way, the component supports visual analysis, by IP, of the bad acts in which a computer user reached internal equipment through abnormal traffic, and it becomes possible to detect the current dangerous condition. The elements of this D3 component are defined as follows.
Definition:
• A node is a piece of security equipment in the company.
• An end node is a personnel PC in the company.
• The width of a link edge is the amount of network traffic.
• The width of a time edge is the number of times network traffic activity occurred.
• The color is the degree of malicious behavior.
Expressed in this way, the graph shows visually at which stage of equipment a malicious act starts and whether the malicious traffic is flowing in real time. Additionally, the source location of doubtful packets can be tracked. For instance, at six on the horizontal axis of the above figure, traffic flowed through the devices on the left side without any symptom or interruption. However, malicious packets of the unacceptable risk level are observed on the right side, in the middle of the nodes. In this case, it appears that the intruder bypassed the defense and security equipment in the previous stage. In addition, it is suspected that a backdoor had been installed and executed on the targeted device in advance.
5) Time based monitoring
Figure 6. Time-based graph
In contrast with the other components, this D3 component includes the concept of time, so it is useful for analyzing the flow of log data in detail. It can detect activity across the whole APT process. As with the D3 components introduced previously, detecting the intruder's attack in real time is important. However, when malicious acts are detected, the administrator also needs to analyze the packets or the log visualization. The main advantage of this D3 component is that it shows the influx and the movement route of packets per hour. The component consists of a left side and a right side. The left side shows the flow of network packets and user logs in the enterprise. On the right side, the vertical axis shows the timeline and the horizontal axis represents the nodes shown on the left side. On an ordinary day, administrators detect attacks by observing the D3 components proposed earlier, whereas this component helps with linkage analysis of suspicious packet flows along the timeline. One advantage of the graph is that administrators can note and analyze the traffic patterns they suspect. This can be leveraged to detect many threats. Most companies work from 8:00 to 18:00, and the inbound and outbound packets of the enterprise are active in this period. A malicious attacker usually begins to gather information about the target or to extract secret, confidential documents at dawn, after the cyber security personnel have left the office. When all employees have left the office and there is no sign of anybody present, some traffic may exist that has nothing to do with this working timeline. The administrator must then suspect that malicious actions are being carried out at the source and consider the possibility that an APT attack process, such as penetration, discovery, capture or exfiltration, is in progress. In this case, this D3 component makes it convenient for the administrator to examine the intruder's threat exhaustively by visualizing the graph.
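The off-hours reasoning above can be sketched in a few lines of Python. This is a minimal illustration, not the component's actual logic: the 8:00 to 18:00 window comes from the text, while the event tuples, node names, and the choice to count events per node and hour are assumptions made for the example.

```python
from collections import Counter
from datetime import datetime

WORK_START_HOUR = 8   # from the text: most companies work from 8:00 ...
WORK_END_HOUR = 18    # ... to 18:00


def is_off_hours(timestamp):
    """True when the timestamp falls outside the working window."""
    return not (WORK_START_HOUR <= timestamp.hour < WORK_END_HOUR)


def off_hours_counts(events):
    """Count off-hours events per (node, hour).

    events: iterable of (ISO 8601 timestamp string, node name) pairs.
    """
    counts = Counter()
    for ts_text, node in events:
        ts = datetime.fromisoformat(ts_text)
        if is_off_hours(ts):
            counts[(node, ts.hour)] += 1
    return counts


# Hypothetical events for illustration.
EVENTS = [
    ("2015-11-06T10:15:00", "firewall"),
    ("2015-11-07T03:05:00", "end-node"),
    ("2015-11-07T03:40:00", "end-node"),
    ("2015-11-07T18:00:00", "ids"),
]
suspicious = off_hours_counts(EVENTS)
```

The resulting (node, hour) counts correspond to cells on a node-by-time grid like the right side of the component, so off-hours activity can be highlighted for the administrator.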
IV. EXPERIMENTAL RESULTS
A. Dataset
We built a honeynet system composed of our SIEM solution (our log analysis and visualization platform) and Honeydrive, which includes several honeypots. We exposed the honeypot to a DMZ and collected logs from November 6th to 8th, 2015. Through this test, we can check the performance of our solution.
B. Implementation
We analyzed the security logs while Kippo (SSH honeypot) and Conpot (SCADA honeypot) were running. In addition, we installed the Suricata IDS to analyze various attack logs. The public IP address used was 125.131.189.xxx. The system worked well, so we collected meaningful data.
C. Results
Even though we ran the system for only three days, we collected a lot of security logs, and our SIEM solution analyzed and visualized them efficiently. The visualized logs are shown in Fig. 7.
Figure 7. Dashboard Design
V. CONCLUSION
Not only security companies but also academia have to research the visualization of security logs, because recent APT attacks target national organizations and critical infrastructure. To analyze those APT attacks, we need to analyze correlated security logs and visualize them to understand their connections. Therefore, security-log visualization needs to be researched continuously. This research also focuses on visualizing security logs efficiently. Our SIEM solution will be uploaded to SourceForge. Users can use our solution and give us feedback to improve it. We will develop a Cyber-Threat Profiling System composed of our SIEM solution and Honeydrive. Research on cyber-profiling systems is also conducted by various institutes, but not much research on SCADA honeypots is conducted in Korea. We will release a cyber threat report every six months by using our cyber-profiling system. This research will be helpful for Korean SCADA systems.
ACKNOWLEDGMENT
This research was supported by the MSIP (Ministry of Science, ICT and Future Planning), Korea, under the ITRC (Information Technology Research Center) support program (IITP-2015-R0992-15-1006) supervised by the IITP (Institute for Information & communications Technology Promotion)
REFERENCES
[1] Binde, Beth, Russ McRee, and Terrence J. O'Connor. "Assessing outbound traffic to uncover advanced persistent threat." SANS Institute. Whitepaper (2011).
[2] Center, Mandiant Intelligence. "APT1: Exposing one of China's cyber espionage units." Mandiant.com (2013).
[3] Center, Mandiant Intelligence. "M-Trends 2015: A VIEW FROM THE FRONT LINES." Mandiant.com (2015).
[4] Center, Mandiant Intelligence. "APT1: Exposing one of China's cyber espionage units." Mandiant.com (2013).
[5] Youseok Lim. "Review on the Cyber Attack by Advanced Persistent Threat." The Korean Association for Terrorism Studies. 6.2 (2013): 158-178.
[6] Symantec. "Advanced Persistent Threats: A Symantec Perspective." (2011).
[7] FireEye. "FIREEYE ADVANCED THREAT REPORT: 2013" (2013).
[8] De Vries, J. A. Towards a roadmap for development of intelligent data analysis based cyber attack detection systems. Diss. TU Delft, Delft University of Technology, 2012.
[9] Kyungho Son, Taijin Lee, and Dongho Won. "Design for Zombie PCs and APT Attack Detection based on traffic analysis." Journal of Korea Institute of Information Security and Cryptology. 24.3 (2014): 491-498.
[10] Abad, Cristina, et al. "Log correlation for intrusion detection: A proof of concept." Computer Security Applications Conference, 2003. Proceedings. 19th Annual. IEEE, 2003.
[11] Noel, Steven, Eric Robertson, and Sushil Jajodia. "Correlating intrusion events and building attack scenarios through attack graph distances." Computer Security Applications Conference, 2004. 20th Annual. IEEE, 2004.
[12] Wang, Lingyu, Anyi Liu, and Sushil Jajodia. "Using attack graphs for correlating, hypothesizing, and predicting intrusion alerts." Computer communications 29.15 (2006): 2917-2933.
[13] Schneier, Bruce. "Attack trees." Dr. Dobb's journal 24.12 (1999): 21-29.
[14] Kordy, Barbara, et al. "Foundations of attack–defense trees." Formal Aspects of Security and Trust. Springer Berlin Heidelberg, 2011. 80-95.
[15] Dain, Oliver, and Robert K. Cunningham. "Fusing a heterogeneous alert stream into scenarios." Proceedings of the 2001 ACM workshop on Data Mining for Security Applications. Vol. 13. 2001.
[16] Wang, Qiang, and Vasileios Megalooikonomou. "A clustering algorithm for intrusion detection." Defense and Security. International Society for Optics and Photonics, 2005.
[17] Ki-hyoung Han, et al. "A Study on implementation model for security log analysis system using Big Data platform." JOURNAL OF DIGITAL CONVERGENCE 12.8 (2014): 351-359.
[18] Hideki Koike, Kazuhiro Ohno, SnortView: Visualization System of Snort Logs
[19] Tetsuji Takada, Hideki Koike, Tudumi: Information Visualization System for Monitoring and Auditing Computer Logs, Proc on Information Visualization (IV2002), IEEE/CS, pp.570–576, 2002.
[20] Christopher Humphries, Nicolas Prigent, ELVIS: Extensible Log VISualization