Capturing keystroke logs in the cloud environment for Digital Forensic Readiness Purposes
by
MARK SHEUNESU MAKURA
13090012
A mini project submitted in partial fulfilment of requirements for the degree BACHELOR OF SCIENCE IN COMPUTER SCIENCE HONOURS in the FACULTY OF ENGINEERING, BUILT ENVIRONMENT AND INFORMATION TECHNOLOGY
University of Pretoria
May, 2014
Supervisor: Professor Hein Venter
Abstract. Cloud computing has gained popularity over recent years due to its ease of use, flexibility, availability and ability to work in a virtualised environment. This, in turn, has provided a platform for cybercriminals, who have found ways of exploiting these services through the deployment of malicious attacks in the cloud environment. Digital Forensic Readiness (DFR) is a proactive measure used in preparing and planning for an incident, so that digital evidence can be detected and collected for use in Digital Forensic Investigations (DFIs). The problem that this project addresses is premised on the notion that there is no easy way of attaining DFR in the cloud without modifying the cloud architecture. It presents a method for harvesting digital information in a cloud by making use of a non-malicious cloud-based keylogger. The approach will assist digital forensic investigators in acquiring potential digital evidence, and thereby in attaining forensic preparedness for a DFI.
1
Introduction
The use of cloud computing by many organisations has risen over the past decade, as advancements in technology and web design have been applied to many cloud services. Because of the various services it offers, cloud computing has provided a platform for cybercriminals to perform malicious attacks within the cloud (Anthes, 2010). Many clients of Cloud Service Providers (CSPs) have raised security concerns regarding external attacks from malicious software (Clark et al, 2011). There has also been concern about the way DFIs are performed to counter the threats and attacks in the cloud environment, chief among them the lack of DFR. Tan (2001) defines DFR as the capability of a DFI body to maximise the use of digital evidence data whilst minimising the expense of a forensic investigation during an incident response.
The main objective of this research is to provide a method by which digital evidence can be gathered from a virtualised environment and used to improve digital forensic preparedness. Hence this research project tries to address the following question: How can we extract potential digital forensic information in a cloud environment that can be used in DFR?
The remainder of this research project is structured as follows. Section 2 presents the background of the study that brought about the research question posed above. Section 3 outlines work done by other researchers in this field. Section 4 presents the experimentation performed to describe how the keylogger can be implemented. Section 5 discusses the findings of the research and Section 6 presents the conclusion and future work.
2
Background
The following sections discuss cloud computing, its origin and the issues surrounding it. They also outline the various types of malicious software currently in circulation, which are major security threats in cloud environments. The section then goes on to discuss DFR, what it entails and its components, and lastly the legal perspective involved in a DFI.
2.1
Cloud Computing
Cloud computing is defined by the National Institute of Standards and Technology (NIST, 2001) as “a model for enabling ubiquitous, convenient, on-demand network access to a shared pool of configurable computing resources that can be rapidly provisioned and released with minimal management effort or service provider interaction.” The computing resources include storage, networks and applications. Cloud computing consists of three service models and four deployment models, which are briefly discussed below.
2.1.1 Service Models
(a) Software as a Service (SaaS) A cloud service model that provides the consumer with use of the Cloud Service Provider’s (CSP) applications running on the cloud infrastructure (NIST, 2001). The consumer has no rights to modify or control the applications provided by the CSP.
(b) Platform as a Service (PaaS) A cloud service model that provides the consumer with the capability to deploy user-created applications written in programming languages and tools supported by the CSP (NIST, 2001). The user has control of the user applications only, not of the underlying cloud infrastructure. Many organisations make use of this cloud service model since it gives them the right to use their own applications.
(c) Infrastructure as a Service (IaaS) A cloud service model where the user is provided with processing and storage capabilities and the ability to run provided software (NIST, 2001). As in PaaS, the user has no control of the cloud infrastructure and its underlying features, but does control deployed applications and storage (NIST, 2001).
2.1.2 Deployment Models
(a) Private cloud A model where exclusive rights to cloud infrastructure use are given to a single organisation (NIST, 2001).
(b) Public cloud A model that opens cloud infrastructure use to the general public (NIST, 2001). An example of an organisation which provides such services is Google.
(c) Community cloud A model where exclusive rights to cloud infrastructure use are given to communities of organisations that have something in common, be it the network technologies, resources or applications they use (NIST, 2001).
(d) Hybrid cloud A combination of two (or, in some cases, more) of the cloud models mentioned above (NIST, 2001). This can be implemented by an organisation that wants to provide various services to its consumers.
The following section discusses the various types of malicious software.
2.2
Malicious Software
Malicious software (malware) can be defined as any piece of software that is used to obtain confidential information or to disrupt the normal services of a workstation (Tamassia, 2001). Cybercriminals make use of malware in many of their activities, for example in attempts to gain access to private systems.
2.2.1 Malware types
There are various types of malware, and it is necessary to know them and their effects in the cloud computing environment since this research focuses on the virtual environment. Fig. 1 is a diagram that classifies malware according to its propagation mechanisms. The classification tree shows the various malware types and, by means of the arrow, their different levels of threat. It can be observed that worms pose a higher threat than trojans, mainly due to the propagation mechanisms of the worm. The various types of malware are described below.
(a) Viruses A computer virus is a piece of malicious software that is capable of self-replication: it modifies the files or programs it infects by inserting its own code so that it replicates further (Tamassia, 2010). A computer virus needs a host in order to propagate. Once they have infected a host, viruses perform various activities, for example using up hard disk space, corrupting critical data and boot sectors, and displaying messages on a user’s screen. In the cloud environment, viruses are capable of corrupting data in the same way. Types of viruses include boot sector viruses, file viruses and macro viruses (Tamassia, 2010).
Fig. 1. Malware tree (Kaspersky, 2014)
(b) Trojan Horses A trojan horse is another type of malicious software, one that is disguised as legitimate software when in actual fact it is not (Tamassia, 2010). While executing, a trojan horse appears to be performing a useful task but is at the same time performing a malicious one. Users often install trojan horses unknowingly because they appear to be legitimate software. Once installed, they can remain dormant for some time and then unleash a malicious attack that can have disastrous consequences for systems and workstations (Tamassia, 2010). Below is a description of malicious software that many trojan horses make use of.
(i) Keylogger A keylogger is a piece of malicious software that is capable of recording computer activities, for example keystrokes made on a keyboard, in a log file (Rouse, 2010). Trojan horses make use of keyloggers to obtain confidential information from an unsuspecting user by logging keystrokes made on the keyboard, or by capturing screenshots, and sending them via e-mail to the trojan’s creator.
(c) Root Kits A root kit is a type of malicious software that conceals itself within an operating system by modifying the operating system so that it goes undetected (Tamassia, 2010). Rootkits are very hard to detect because the operating system does not record or reveal their presence while they execute.
(d) Worms Worms are malicious software capable of replicating themselves and infecting systems through the network (Tamassia, 2010). They differ from computer viruses in that they do not need to infect programs by injecting themselves into them, as viruses do. The first worm to make its mark on the internet was the Morris worm (Tamassia, 2010). Worms propagate through the internet by making use of known vulnerabilities in hosts.
2.2.2 Uses of Malware
Malware is used for various malicious activities, including the following (Tamassia, 2010):
(i) Distributed Denial of Service (DDoS) attacks, as in the case of botnets.
(ii) Click fraud, a fraudulent practice in which a user may be forced to repeatedly click on a “pay per click” advertisement on a website with the intention of generating revenue.
(iii) Adware distribution, which involves software that automatically pops up advertising material on a website without user intervention, with the intent of luring a user to click on the advertisement in order to download the software.
(iv) Identity theft, another fraudulent practice, which involves making use of someone else’s identity or personal data, or pretending to be someone else, with the intention of obtaining confidential information such as credit card details, PINs and passwords.
Having discussed the various malware types, the following section gives a brief background on digital forensics, DFR and the related international standards.
2.3
Digital Forensic Science
The Digital Forensic Science Research Workshop (DFRWS) defines Digital Forensic Science (DFS) as “the use of scientifically derived and proven methods toward the preservation, collection, validation, identification, analysis, interpretation, documentation and presentation of digital evidence derived from digital sources for the purpose of facilitating or furthering the reconstruction of events found to be criminal, or helping to anticipate unauthorised actions shown to be disruptive to planned operations” (DFRWS, 2001). Deriving from this definition, DFS entails the use of scientific methods in a criminal investigation that (possibly) contains digital evidence. DFRWS is an annual workshop where digital forensics experts meet to share their knowledge and ideas about digital forensics research. Due to the ever-increasing threat of cybercrime, DFS is necessary in providing means by which these crimes can be solved.
2.4
Digital Forensic Readiness
DFR is defined by Tan (2001) as the capability of an organisation to maximise the use of digital evidence data whilst minimising the expense of a forensic investigation during an incident response. It is therefore necessary for any organisation to be prepared before an incident occurs. Tan (2001) identified the following factors that affect DFR: (i) logging methods, (ii) evidence handling, (iii) forensic acquisition and (iv) intrusion detection methods. These factors are critical for any DFR procedure. For instance, logging methods determine how potential evidence is logged. If there are flaws in how the digital information is stored, they will inadvertently affect the strength of the evidence and its admissibility in court. Like any process, DFR has to conform to certain standards; the following subsection discusses an international standard relevant to DFR.
2.4.1 ISO 27043
ISO 27043 is currently a draft international standard, proposed by Valjarevic and Venter (2012), which aims to provide a “harmonised digital forensic investigation process model that can be used as a standardised set of guidelines for digital forensic investigation.” ISO 27043 is important as it provides a standard digital investigation process model that can be used for any digital investigation. At the time of writing this project, the ISO 27043 standard was in its final stages before publication.
2.4.1.1 Classes of Digital Investigation Processes
ISO 27043 consists of five classes of digital investigation processes, namely (ISO, 2012):
1. Readiness Processes
2. Initialisation Processes
3. Acquisitive Processes
4. Investigative Processes
5. Concurrent Processes
Fig. 2 below describes the relationship between the classes stated above.
Fig. 2. Classes of Digital Investigation Process (ISO, 2012)
Fig. 2 shows that the process classes form a multi-layered architecture beginning with the readiness processes. The concurrent processes class, as the name implies, runs concurrently with all the other process classes. The concurrent processes are (i) managing information flow, (ii) documentation, (iii) obtaining authorisation, (iv) preserving chain of custody and (v) preserving digital evidence (ISO, 2012). This project focuses specifically on DFR and hence falls under the readiness processes class. The readiness processes class is divided into three groups, namely (i) the planning processes group, (ii) the implementation processes group and (iii) the assessment processes group (ISO, 2012). Fig. 3 below shows the complete readiness processes class.
Fig. 3. Readiness Processes Class (ISO, 2012)
Various activities are performed in each processes group. In the planning processes group, planning activities such as identification of potential digital evidence, pre-incident collection planning and storage planning are performed (ISO, 2012). The implementation processes group follows the planning processes group and includes activities such as system architecture implementation, pre-incident collection implementation and pre-incident analysis implementation (ISO, 2012). The last readiness processes group is assessment, which includes activities such as implementation process assessment and assessment output procedure implementation (ISO, 2012). After the readiness processes class, the incident detection process in the initialisation processes class receives input from the readiness processes class. The following section discusses the legal aspects of a DFI.
2.5
Legal Perspective
This section focuses on the legal perspective on the admissibility of digital evidence, which is governed by different acts that vary across jurisdictions. This project involves harvesting digital information from a cloud user on a cloud instance. It is therefore important to know the various acts that govern a person’s personal information so that a user’s privacy rights are not compromised.
2.5.1 US Acts
(a) Electronic Communications Privacy Act (ECP) This is a United States Act that regulates the transmission of electronic data and restricts access to stored electronic communications. The US ECP Act restricts the ability of law enforcement and investigators to intercept transmitted communications, which the law regards as a privacy violation (Weis, 1992).
(b) Stored Communications Act (SCA) The SCA is a United States law that deals with the protection of a person’s “stored wire and electronic communications and transactional records” held by an Internet Service Provider (ISP) (Kerr, 2004). It prohibits an ISP from disclosing a user’s personal information, records and transactions to a third party.
2.5.2 UK Acts
(a) Data Protection Act (DPA) This is a United Kingdom law that, as the name implies, deals with the protection of personal data. The act regulates a person’s privacy with respect to the processing of that person’s data (Steinke, 2002).
(b) Computer Misuse Act (CMA) This is a UK Act that governs access to computer material. The act makes unauthorised access to, or modification of, computer material a punishable criminal offence (Taylor et al, 2010).
2.5.3 South African Acts
This section discusses current South African legislation focusing on an individual’s privacy rights and on digital evidence gathering.
(a) Protection of Personal Information Act (POPI) This is a South African Act that, as the name implies, seeks to protect a person’s privacy. In South Africa, every individual has the right to privacy, which includes protection from unlawful use of personal information (POPI Act, 2013).
(b) Electronic Communications and Transactions Act (ECT) This is another South African Act. Its major role is the regulation of the electronic communications and transactions of individuals (ECT Act, 2002).
(c) Regulation of Interception of Communications and Provision of Communication related Information Act (RICA) This is a South African Act responsible for regulating the interception of communications, the monitoring of signals and various other matters that pertain to communications (RICA, 2002).
It is evident that the legal aspects of digital forensics must be taken into account when considering the admissibility of digital evidence in court. The next section discusses work related to this research study.
3
Related Work
This section focuses on work done by other researchers that is related to the current research.
Research by Trenwith and Venter (2013) on DFR focused on how to speed up the digital forensic process by making use of DFR in a cloud environment. Traditional methods of digital forensic investigation are often slow and pose risks such as the loss or manipulation of volatile data. They proposed a model for DFR that focused on centralised logging of all operations within the cloud before an investigation is initialised. Their work complements that of Tan (2001), who stated the importance of centralised logging in developing efficient digital forensic methods. Tan (2001) identified four major origins of incident data, namely:
“(a) Random Access Memory (RAM), registers and raw hard drive of the victim(s).
(b) Random Access Memory (RAM), registers and raw hard drive of the attacking system(s).
(c) Logs (from the victim(s) and attacking systems, also intermediary systems).
(d) The physical security at the attacking system for example a CCTV camera”.
It can be observed that logs are of importance as they are the ones that carry potential sources of incident data. The logs can include logs of programs in execution on the system, keystrokes on the keyboard, system or application errors etc.
A study by Van Staden and Venter (2012) made use of performance monitoring tools in the implementation of DFR. They deployed a Learning Management System (LMS) as Software as a Service (SaaS) in a cloud by hosting it outside an organisation. The results revealed that it was possible to collect live digital forensic data while users access services. Cohen (2013) stated that faults can occur during the process of gathering digital forensic evidence. Furthermore, he identifies three cloud computing features linked to the issues that can occur during the digital forensic process, namely: (i) because of distributed computing, the evidence may be present on, and perform various tasks on, many workstations; (ii) the workstations can be at several different sites; and (iii) when workstations are issued, ownership of the content present on them may differ. These three features affect how a digital forensic investigation is performed.
4
Experimentation
This section describes the experiments performed to demonstrate how potential digital information is captured. This was done by capturing keystrokes typed on the keyboard after infecting a virtual instance with non-malicious code. The computer used had the following specifications: (i) Intel Core i3 processor and (ii) 4GB RAM. The computer was running the Windows 7 operating system. The non-malicious code was developed in the Visual C++ programming language. The non-malicious keylogging code was used as a tool to harvest potential digital data that could be used for DFR. The non-malicious keylogger is meant to accomplish the following objectives:
(i) Monitor cloud activities
(ii) Gather incident information as evidence
(iii) Record data into a log file
It has to be emphasised that the keylogger in this research is used for evidence gathering in support of incident detection. The potential digital evidence gathered can then be used for DFR. The non-malicious keylogger operates in an instance of a virtualised environment. The cloud-based keylogger described in this section is presented using the workflow diagram shown in Fig. 4.
Fig. 4. Work flow diagram
Fig. 4 shows two possible cloud instances within a virtual environment. The non-malicious code is injected into the cloud instances and operates in stealth mode for the purpose of capturing keystrokes made by a user within the virtual instances. The non-malicious keylogger operates in stealth mode so that a user does not tamper with it and disrupt its normal operation. The following steps describe the workflow diagram shown in Fig. 4 in detail.
1. The non-malicious keylogging code is injected into two virtual instances and operates in stealth mode.
2. The non-malicious keylogger captures keystrokes typed on the keyboard.
3. The captured keystrokes (potential digital information) are stored in a text file.
4. The text file is hashed using MD5 or SHA1 so that its integrity can later be verified.
5. The text file is sent to a database for storage.
The stored text file containing potential digital information can then be used for DFR. The following section describes how the non-malicious keylogger captures keystrokes.
4.1
Capturing key strokes
The non-malicious keylogger captures keystrokes by making use of the American Standard Code for Information Interchange (ASCII) numbers of the keyboard characters, both special characters and alphabetical characters. The keylogger operates in stealth mode, which is important so that it is not terminated and so that a user cannot compromise the way it operates.
Keylogger in Operation
The non-malicious keylogger operates in stealth mode, and hence its operations cannot be viewed by a user in the cloud instance. Fig. 5 below shows the non-malicious keylogger in “non-stealth” mode to illustrate how it captures keystrokes. A pangram was used to test whether it can capture all the letters of the alphabet. A famous pangram, “a large fawn jumped quickly over white zinc boxes” (Marinoski, 2009), was used. When a user strikes a key on the keyboard, the keylogger captures its ASCII number. For instance, if a user types the letter “A”, the ASCII value for the letter “A” (which is 65) is captured, as shown in Fig. 5. The ASCII number is then converted to its corresponding character and the character is logged in a text file. Fig. 6 shows the log file of the keystrokes that were typed on the keyboard. They are stored in the sequence in which they were typed. The following section discusses the experiments performed.
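The conversion step described above can be illustrated with a short sketch. It is written in Python for brevity rather than in the Visual C++ of the project, it does not capture any keyboard input, and the function names `decode_key_codes` and `missing_letters` are hypothetical. It only shows, under those assumptions, how a sequence of already recorded ASCII numbers maps back to characters and how the pangram test confirms that every letter of the alphabet was logged.

```python
import string


def decode_key_codes(codes):
    """Convert a sequence of ASCII numbers into the text they spell."""
    return "".join(chr(code) for code in codes)


def missing_letters(text):
    """Return the letters of the alphabet that do not occur in text."""
    return sorted(set(string.ascii_lowercase) - set(text.lower()))


# Stand-in for recorded input: the ASCII numbers of the test pangram.
PANGRAM = "a large fawn jumped quickly over white zinc boxes"
sample_codes = [ord(character) for character in PANGRAM]

# Decoding restores the characters in the order they were recorded.
decoded = decode_key_codes(sample_codes)
uncovered = missing_letters(decoded)
```

An empty `uncovered` list means that the decoded text contains all 26 letters, which is the property the pangram was chosen to test.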
Fig. 5. Keystroke capturing
Fig. 6. Log file of captured key strokes
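Step 4 of the workflow hashes the log file with MD5 or SHA1. The following Python sketch shows one way such a digest could be computed and later rechecked. It is an illustration under stated assumptions and not the project's Visual C++ implementation: the helper names `hash_log_file` and `verify_log_file` and the file name `keystrokes.log` are hypothetical, and the database transfer of step 5 is left out.

```python
import hashlib
import os
import tempfile


def hash_log_file(path, algorithm="sha1", chunk_size=65536):
    """Return the hex digest of the file at path, read in chunks."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as log_file:
        for chunk in iter(lambda: log_file.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_log_file(path, expected_digest, algorithm="sha1"):
    """Check that the file still matches a previously recorded digest."""
    return hash_log_file(path, algorithm) == expected_digest


# Demonstration on a temporary file standing in for the log file.
with tempfile.TemporaryDirectory() as tmp_dir:
    log_path = os.path.join(tmp_dir, "keystrokes.log")  # hypothetical name
    with open(log_path, "w", encoding="ascii") as log_file:
        log_file.write("abc")

    recorded_digest = hash_log_file(log_path)  # digest stored with the log
    intact = verify_log_file(log_path, recorded_digest)

    # Any later change to the file changes its digest.
    with open(log_path, "a", encoding="ascii") as log_file:
        log_file.write("x")
    tamper_detected = not verify_log_file(log_path, recorded_digest)
```

The `algorithm` parameter accepts any name known to `hashlib`, so `"md5"` matches the other option named in the workflow. Both MD5 and SHA1 have known collision weaknesses, so a stronger function such as `"sha256"` may be worth considering where the evidence must withstand deliberate tampering.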
5
Discussion
The experiments conducted illustrate a way in which potential digital information can be harvested in a cloud for DFR. DFR is important in any DFI as it allows incident detection investigations to be performed quickly, before data is lost or the integrity of digital information is disrupted. One of the most important points this project focused on was the legal perspective on the admissibility of digital evidence. This research focused on DFR in the cloud, and a problem therefore arises when the cloud infrastructure is spread across several different countries (which is most often the case), each jurisdiction having different legal requirements. Various acts that affect the acquisition of digital evidence and users’ rights to privacy have been addressed, examples being the US ECP Act and the UK DP Act. It is evident that there are differences across jurisdictions regarding a person’s right to privacy. The US ECP Act, for example, restricts the ability of law enforcement and investigators to intercept transmitted communications, which the law regards as a privacy violation (Casey, 2011). The ECP Act also restricts the way a company monitors the network activities of its employees. Hence there are still legal issues in the way digital information can be acquired from an individual.
Finally, the non-malicious keylogger can be improved further by including a time stamp of the time each keystroke was made. This would greatly assist the DFI by providing an accurate time at which a specific event occurred. The major disadvantage of the non-malicious keylogger is that it cannot identify the person in front of the workstation. Anyone can be seated in front of the computer even though a legitimate user is logged in on the workstation. This is one of the problems faced in a DFI (that is, finding the perpetrator). The next section concludes the research and discusses its future direction.
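The time stamp improvement suggested in this discussion could take the form of one log entry per keystroke. The sketch below is a hypothetical Python illustration and not part of the implemented keylogger: the entry layout (an ISO 8601 UTC time stamp, a tab, then the ASCII number) and the names `format_log_entry` and `parse_log_entry` are assumptions made for this example.

```python
from datetime import datetime, timezone


def format_log_entry(character, when=None):
    """Build one log line: ISO 8601 UTC time stamp, tab, ASCII number."""
    if when is None:
        when = datetime.now(timezone.utc)
    return f"{when.isoformat()}\t{ord(character)}"


def parse_log_entry(line):
    """Recover the time stamp and character from a log line."""
    stamp, code = line.split("\t")
    return datetime.fromisoformat(stamp), chr(int(code))


# Fixed time used only to make the example reproducible.
example_time = datetime(2014, 5, 28, 12, 0, 0, tzinfo=timezone.utc)
example_entry = format_log_entry("A", example_time)
parsed_time, parsed_character = parse_log_entry(example_entry)
```

Recording the time in UTC is a design choice made here because cloud instances may run in different time zones, which would otherwise make event ordering across instances ambiguous.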
6
Conclusion
DFR in the cloud faces many challenges, which have been outlined above, one being that it makes use of old, traditional methods of harvesting digital information. Due to the rapid development of technology and the ever-increasing threat of cybercrime, it is important to keep up to date with the latest security tools and packages needed to prevent intrusion by, for example, malicious code. The keylogger described in this research is one of the many ways in which digital evidence can be gathered for DFR in the cloud. It is important to note that although keyloggers are considered a threat to security, they can also be used to assist in harvesting potential digital information for DFR in cloud environments. The harvesting of digital information has its legal issues, which have been discussed in this research. It is therefore important to ensure that the method implemented for harvesting potential digital information conforms to the applicable legislation in order to maintain the admissibility of the evidence in court.
Acknowledgments The author would like to thank, firstly, the Almighty God for giving the wisdom, strength and courage to do this project. I would also like to thank Professor Hein Venter, my supervisor, for the dedication and guidance he provided me in doing this project. I would further like to thank Victor Kebande, a DFR expert who assisted in the development of the keylogger and the overall architecture of the model. Lastly, I would like to thank my Dad, Mum and siblings for their encouragement and support throughout the development of this project.
References
Anthes, G., 2010, ‘Security in the cloud’, Communications of the ACM, Volume 53, Issue 11, pp 16-18.
Casey, E., 2011, Digital Evidence and Computer Crime, 3rd Edition, Academic Press, London, UK.
Clark, K., et al, 2011, ‘Bot-clouds, the future of cloud based botnets’, CLOSER, SciTePress, pp 597-603.
Cohen, F., 2013, ‘Challenges to Digital Forensic Evidence in the Cloud’, in CyberCrime and Cloud Forensics: applications for investigation processes, Chapter 3, pp 59-78.
DFRWS Technical Report, 2001, ‘A Road Map for Digital Forensic Research’, viewed 28 May 2014, from www.dfrws.org/2001/dfrws-rm-final.pdf
Electronic Communications and Transactions Act (ECT), 2002, South African government, viewed 27 May 2014, from: http://www.doc.gov.za/documentspublications/acts.html?download=33:electronic-communications-andtransactions-act-2002
ISO/IEC 27043, 2012, ‘Investigation principles and Processes’, unpublished draft international standard.
Kaspersky, 2014, Types of Malware, viewed 28 May 2014, from http://usa.kaspersky.com/internet-security-center/threats/malwareclassifications#.U4hTh3ZPYZ8
Kerr, O., 2004, ‘A User’s Guide to the Stored Communications Act, and a Legislator’s Guide to Amending it’, George Washington Law Review.
Marinoski, G.E., 2009, Alphabet Sentences, viewed 26 May 2014, from: http://www.dancingpencalligraphy.com/howto/AlphabetSentences.html
Mell, P., & Grance, T., 2009, ‘The NIST definition of cloud computing’, National Institute of Standards and Technology, 53(6), 50.
Protection of Personal Information Act (POPI), 2013, South African government, viewed 27 May 2014, from: http://www.gov.za/documents/download.php?f=204368
Regulation of Interception of Communications and Provision of Communication related Information Act (RICA), 2002, South African government, viewed 27 May 2014, from: http://www.gov.za/documents/detail.php?cid=371532
Rouse, M., 2010, ‘Keylogger (keystroke logger, keylogger, or system monitor)’, viewed 23 May 2014, from http://searchmidmarketsecurity.techtarget.com/definition/keylogger
Steinke, G., 2002, ‘Data privacy approaches from US and EU perspectives’, Telematics and Informatics, 19(2), 193-200.
Tamassia, R., 2001, ‘Malware: Malicious Software’, Lecture notes, viewed 28 May 2014, from: http://cs.brown.edu/cgc/net.secbook/se01/handouts/Ch04Malware.pdf
Tan, J., 2001, ‘Forensic Readiness’, Cambridge, MA: @Stake.
Trenwith, P., & Venter, H.S., 2013, ‘Digital Forensics in the cloud’, in Proceedings of Information Security for South Africa, South Africa.
Taylor, M., Haggerty, J., Gresty, D., & Hegarty, R., 2010, ‘Digital evidence in cloud computing systems’, Computer Law & Security Review, 26(3), 304-308.
Valjarevic, A., & Venter, H.S., 2012, ‘Harmonised digital forensic investigation process model’, in Information Security for South Africa (ISSA), IEEE, pp. 1-10.
Van Staden, R.F., & Venter, H.S., 2012, ‘Using Performance Monitoring Software to Implement Digital Forensics Readiness’, 8th Annual IFIP WG 11.9 International Conference on Digital Forensics, Pretoria, Gauteng, South Africa.
Weis, A.H., 1992, ‘Commercialization of the Internet’, Internet Research, 2(3), 7-16.