executable whitelists and process authentication for protection

CRYPTOGRAPHIC AUTHENTICATION OF PROCESSES AND FILES TO PROTECT A TRUSTED BASE

by

Jeremy A. Hansen

A Thesis Submitted in Partial Fulfillment of the Requirements for the Degree of

Master of Science in Computer Science

at The University of Wisconsin-Milwaukee May 2005



ABSTRACT

CRYPTOGRAPHIC AUTHENTICATION OF PROCESSES AND FILES TO PROTECT A TRUSTED BASE

by Jeremy A. Hansen

The University of Wisconsin-Milwaukee, 2005
Under the Supervision of Professor G. I. Davida

Executables are often blindly trusted for extended periods of time, often without checking their integrity since they were first installed. Given that executables can be secretly replaced with malicious copies on disk or in memory, viruses, Trojans and worms can cause incredible damage on unprotected computers. The proposed scheme verifies the integrity of a binary at execution time and periodically while in memory using cryptographic hashes. Programs without a valid hash are not permitted to execute, or are halted if currently running. This is a “whitelist” approach, rather than the traditional “blacklist” approach taken by antivirus vendors.

Keywords: Antivirus, Authentication, Hashing, Malware, MD5, Whitelists


© Copyright by Jeremy A. Hansen, 2005 All Rights Reserved


TABLE OF CONTENTS

Introduction
Current Solutions
Pattern Matching
Heuristic Strategies
Problems with Current Schemes
Authenticating Executable Files
The Binary Whitelist
Whitelist Architectures
Software Authenticator on Untrusted Media with an Untrusted OS
Software Authenticator on Untrusted Media with a Trusted OS
Software Authenticator on Trusted Media with an Untrusted OS
Software Authenticator on Trusted Media with a Trusted OS
Hardware Authenticator with Untrusted OS
Authenticating Running Processes
Practical Issues with Whitelisting
Implementation
MD5 Hashing
Authenticating Running Processes
Performance Data
Conclusions
Future Directions
DSA Signatures
Hardware Implementations
References

LIST OF FIGURES

Figure 1: Organization of Signatures

LIST OF TABLES

Table 1: A Selection of Antivirus Vendors and Technologies
Table 2: Speed Tests

ACKNOWLEDGEMENTS

I would like to express my thanks to my thesis advisor, Professor George Davida, for the inspiration for this thesis, for many lengthy and productive discussions, and for answering my seemingly unending stream of questions. Without his insight, many of the ideas contained herein would not be as developed and polished as they are. That said, any mistakes or oversights contained in this paper should not be attributed to him; those are my fault.

More thanks go to my friend and coworker Mike Silbersack, with whom I had many discussions about this paper, security in general, UNIX, programming, and “connecting” with students. Thanks for pointing out the ugly bits in my research and showing me some things I had missed, even if it was a brick wall directly in front of my face. Thanks also for providing the pointer to FreeBSD remote kernel debugging with Firewire. Mike will be happy to know that the proposed system in this paper can stop a fork bomb.

Thanks to Ann Marie Grobarek, Bill Hogan, Ram Iyer, Rick Neff, Scott Smith, Dave Sorenson and the other coworkers who provided technical and non-technical feedback throughout the process of writing this thesis. I also appreciate all my students who had to suffer through my lectures when I constantly referred to my research. Thanks for not giving me too hard of a time when I told you about whitelists for the third time.

Thanks to Michael Williams of NetXsecure, the developer of the “TrojanXproof Anti-Trojan and Trojan Detection” kernel patches, who kindly allowed me to use his work in my research. His code allowed me to avoid reinventing the wheel.

My father, my mother, my stepmother, my sisters and my brothers all deserve kudos for smiling and nodding when I just had to tell them about all the various things that I was working on at the moment.

Extra special thanks go to Cara, who, along with proofreading, has helped to keep me sane throughout the whole process of writing this thesis, and continues to provide the light at the end of every tunnel for me.


Introduction

Viruses, worms, Trojan horses and other malicious software, collectively known as malware, have become a serious problem throughout the world in network-connected computer systems. According to Virus Bulletin, there were nearly 100,000 instances of infection by the three most prevalent viruses (Netsky, Bagle and Sober) in January 2005 alone [24]. Viruses are programs designed to make copies of themselves, usually by replacing or modifying existing, legitimate programs. Trojan horses are seemingly useful programs that have virus-like effects, such as compromising the system so that attackers can get in easily, collecting confidential information, or destroying data. Internet worms, which are viruses that do not require any human interaction, can spread at alarming rates. Worms exploit vulnerabilities in software (usually network services), make copies of themselves, and then attempt to infect other hosts. Research has been done to analyze the possibility of much more virulent strains of malware that may be able to infect large numbers of computer systems without hope of remediation or human interaction [25]. The industry’s current methods of combating these threats are simply inadequate given the damage and inconvenience they can cause.

One of the major difficulties encountered when designing systems to combat malware is that in modern operating systems, the malicious application runs in the context of the current user. Anything that the user can do, the virus, worm, or Trojan can also do, which can lead to leakage of sensitive materials, escalation of the intruder’s privileges, system downtime, and further spreading of the malicious code. This paper proposes a method to ultimately prevent the damage done by viral or other malicious code by checking a program’s integrity against a digital signature or cryptographic hash every time the program is run. If the program is not on the “whitelist” of accepted applications, it will not be executed, regardless of the user running it. This is not a new idea, as research mentioning similar methods can be found dating back to 1989. Naor and Yung write, “To prevent computer viruses from modifying the programs, the users would like to authenticate the programs before their use.” [15] They then go on to describe using hash functions to perform this authentication. By exercising this level of control at the operating system level, users can be more certain that a binary that is replaced by a Trojaned or otherwise compromised version will not be executed, regardless of its similarity to the original. Several variants of this scheme are explored in this paper, including a hardware-based implementation that does not trust the host’s operating system.

Table 1: A Selection of Antivirus Vendors and Technologies

Vendor               Antivirus Title        Pattern Matching   Heuristic Checking
F-Secure             F-Secure Anti-Virus    Yes                Yes
Kaspersky Labs       Kaspersky Anti-Virus   Yes                Yes
Network Associates   McAfee VirusScan       Yes                Yes
Symantec             Norton Antivirus       Yes                Yes
Open Source          ClamAV                 Yes                No

Current Solutions

The current trend in combating malware involves one of two strategies: pattern matching and heuristic rules. Pattern matching, or signature-based detection, relies on a database of binary strings, against which potentially malicious files are compared. Heuristic strategies involve detecting patterns of activity consistent with malicious code. With this strategy, there may not be a single binary string to match in the malware, but certain activities or characteristics in a program may raise an alarm. A list of several vendors, their software products and whether or not they support pattern matching or heuristic scanning can be found in Table 1.


Pattern Matching

Almost all current anti-virus software relies on databases of signatures or footprints that identify a certain executable file as being suspicious. Freely available collections of these signatures are also offered on the Internet, and similar databases are included with commercial antivirus software [6]. These signatures are not trivial to generate, as the party generating them must make sure both that the signature works and that it does not trigger a false positive, that is, an alert on a legitimate, harmless program. For example, it took ten hours for the fastest antivirus vendor to produce a signature to detect the Sober.C virus in February of 2004, which is clearly too slow to catch a piece of fast-spreading malware [22]. Regardless of the speed at which updated signatures can reach end users, it is easy enough for virus writers to develop a dynamic strain of virus that changes its structure, and therefore its digital footprint, every time it replicates. These polymorphic viruses might be detected in several of their forms, but certain mutations may go unnoticed.
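As a rough illustration of signature-based detection and of why a polymorphic strain can evade it, the following Python sketch searches a byte string for known patterns. The signature names and byte patterns are invented for this example and do not correspond to any real virus or vendor database; real scanners use far more elaborate matching.

```python
# Hypothetical signature database: detection name -> byte pattern.
# Both names and patterns are made up for illustration.
SIGNATURES = {
    "Example.A": bytes.fromhex("deadbeef"),
    "Example.B": b"\x90\x90\xeb\xfe",
}


def scan(data, signatures=SIGNATURES):
    """Return the sorted names of all signatures whose pattern occurs in data."""
    return sorted(name for name, pattern in signatures.items() if pattern in data)


clean = b"ordinary program bytes"

# The same program with a known malicious payload appended.
infected = clean + bytes.fromhex("deadbeef")

# A trivially "mutated" copy: the payload is XORed with a one-byte key,
# so the byte pattern stored in the blacklist no longer appears.
mutated = clean + bytes(b ^ 0x5A for b in bytes.fromhex("deadbeef"))

print(scan(clean))
print(scan(infected))
print(scan(mutated))
```

The mutated copy goes undetected even though it carries the same payload in encoded form, which is the gap that a blacklist can only close by adding yet another pattern after the fact.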

Heuristic Strategies

A heuristic is an algorithm that tries to approximate a solution rather than getting it perfect. This “good enough” solution may reveal information about a system that an algorithm that runs (often inefficiently) straight for the perfect solution may miss. The heuristic approach does not try to categorize or fingerprint individuals; instead, it attempts to identify suspicious behavior [20]. A neighborhood watch operates on this same premise, as average citizens do not have access to a fingerprint database against which they can screen possible criminals; instead, residents keep an eye out for suspicious behavior and react accordingly. Some activity will certainly go unnoticed using a heuristic scanner, but again, this strategy aims for a “good enough” solution, not an exact “Top Ten” list of malicious programs.

What sort of behavior is considered suspicious in an executable program? To be considered suspicious, the program should exhibit functionality characteristic of malware, but not of legitimate programs. Some features of executables that are used primarily by malware include “self encrypted code or code that appears to have been appended to an existing program.” [20] Even so, the heuristic scanner may flag a desirable program as malicious due to some feature of the program that seems to be suspicious, but is in fact part of the expected behavior. Conversely, a virus that avoids typical virus-like behavior can slip by undetected. The higher rate of false positives (erroneous categorizations of programs as hostile) and false negatives (malicious programs that are missed by the scanner) makes the still-developing field of heuristic antivirus scanners not nearly as effective as pattern matching. Because of this, vendors package heuristic scanners with their software to supplement the basic functionality of their pattern-matching software.

Problems with Current Schemes

While the existing schemes used to combat malware are somewhat effective, they still leave much to be desired. On home computers with antivirus software installed, the end user is allowed to make a choice about what to do with a detected suspect program: delete the file, attempt to “clean” it (remove any malicious code in the hope of restoring the original file), quarantine the file, or even allow the potentially malicious code to continue running. This is a decision that most users are not prepared to make, so many antivirus vendors choose to quarantine the suspicious executable by default unless told ahead of time to do otherwise. In a corporate environment, there is a similar default setting that treats positive virus detections the same way: either delete, clean, quarantine, or ignore. The user has no assurances, though, that the file he or she is attempting to run is in fact legitimate (or malicious for that matter); they trust the antivirus vendor to make that determination for them.


Authenticating Executable Files

The Binary Whitelist

Conventional antivirus software uses a list of patterns that indicate applications that should not be run. A list of patterns used in this way is often referred to as a blacklist. Anti-spam software uses similar blacklists to automatically drop emails from known open-relay mail servers which may pass along thousands of pieces of unsolicited commercial email [19]. However, these spam and virus blacklists are a reactive countermeasure. There must be evidence of undesirable behavior in the case of a virus for the virus’ pattern to be added to the blacklist.

Authentication schemes, such as those used in popular operating systems, require an entity to provide information to a system in order to prove that the entity in question is allowed to access the system. The system then consults an access control list that describes the amount of access for which the entity is authorized. A similar scheme would be beneficial in protecting against malware, since these illegitimate applications would then be treated as unauthorized users, rather than blindly assuming that every application should be executed with the same privilege.

With this in mind, a better approach to virus protection would be to make a list of all of the applications that are already known to be legitimate and to only allow those applications to execute on the system. This method of enumerating only the authorized entities is called whitelisting. To create a whitelist, a “trusted base” of the overall system state is established, which in this case includes the currently installed applications. A point in the system’s configuration is chosen at which to take a snapshot, so that it can later be verified that the machine has not been compromised. If an application that has been tampered with is mistakenly trusted, the system checking the whitelist will assume the illegitimate binary file is one that the administrator wants to be running on the system! Thus, it is important to ensure that the machine is actually free of malware before the snapshot is taken.

An area where similar whitelists have proven useful is in spam blocking. Some anti-spam software relies on a whitelist of authorized email senders for each recipient, blocking or otherwise hampering the delivery of messages not from listed senders. Unfortunately, when a name is added to the list, any email purporting to be from that sender is allowed, so a cunning spammer might try to discover the addresses on the whitelist by searching the Internet for possible colleagues of the recipient. The spammer can then use these forged addresses to masquerade as a trusted sender, as there is no true authentication in this scheme. As email addresses frequently change and the number of possible senders tends to be quite large, whitelisting can be inconvenient for spam prevention.

Whitelisting is useful because it is proactive: the system knows ahead of time those processes that can be run, and does not require a list of the processes or patterns therein that may cause undesirable effects. In its simplest form, the whitelist contains the names of allowed applications and their locations in the file system. A trivial improvement would be to include the original files’ sizes in the list. Unfortunately, in both of these simple examples, the whitelisted applications can be overwritten with malicious copies which could still harm the system. The attacker installing the malicious version may even be so sneaky as to make the replacement’s size the same as the original authorized file.

To the end user, this compromised file would look harmless in a directory listing, but may cause havoc when executed. Software tools such as Tripwire, md5deep and md5sum are effective in tracking changes to essential system files, applications and configuration files because they generate a fingerprint (a cryptographic hash) of each file in question and monitor any changes in the files’ contents that may occur [23, 13]. These hashes are stored in a database, usually on separate or read-only media to avoid contamination by a dedicated attacker. Public databases of hashes for the binaries in many popular operating systems and the applications that run on them are available from the National Institute of Standards and Technology’s (NIST) National Software Reference Library [16]. These hashes are intended to be used in conjunction with these tools, though a user could simply generate hashes for himself as previously mentioned.

A user might also run an application from a location that he or she did not anticipate. The most common example of this occurs in UNIX, where the user’s shell checks a list of possible application locations, so a user might run a program other than the one he intended, depending on how that list is constructed.

Other tools, like DigSig (which in turn uses a program called bsign), use a similar approach in that they embed a digital signature or cryptographic hash into the executable files themselves, making a separate protected database unnecessary [1, 3]. DigSig also requires that the key used to sign the executable be protected against intruders; otherwise, if the secret key is compromised, an attacker can replace programs on the system with malicious versions and never be detected by the signature checks. Other drawbacks to this system are that it requires kernel modifications and only allows for a single digital signatory.
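The difference between a whitelist of names and sizes and a whitelist of cryptographic hashes can be shown with a short Python sketch. The function names `file_hash` and `is_authorized`, the dictionary layout of the whitelist, and the temporary file standing in for a binary are all assumptions made for this example; this is not how Tripwire, md5sum or DigSig store their data.

```python
import hashlib
import os
import tempfile


def file_hash(path, algorithm="md5"):
    """Hash a file's contents in chunks and return the hex digest."""
    h = hashlib.new(algorithm)
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            h.update(chunk)
    return h.hexdigest()


def is_authorized(path, whitelist):
    """Allow a file only if it is listed and its current hash matches."""
    expected = whitelist.get(os.path.abspath(path))
    return expected is not None and expected == file_hash(path)


# Stand-in for an installed binary (hypothetical contents).
workdir = tempfile.mkdtemp()
program = os.path.join(workdir, "program")
with open(program, "wb") as f:
    f.write(b"original code")

# Snapshot of the trusted base: absolute path -> hash.
whitelist = {os.path.abspath(program): file_hash(program)}
allowed_before = is_authorized(program, whitelist)
original_size = os.path.getsize(program)

# An attacker overwrites the file with a replacement of identical size.
with open(program, "wb") as f:
    f.write(b"tampered code")
same_size = os.path.getsize(program) == original_size
allowed_after = is_authorized(program, whitelist)

print(allowed_before, same_size, allowed_after)
```

A check based on name and size alone would still accept the replacement, since neither changed; the hash comparison rejects it. Files that were never added to the whitelist are rejected as well.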

As mentioned by Davida, allowing software vendors to create their own digital signatures for the software they distribute can be powerful, since it is computationally infeasible for an attacker to generate a fake signature after tampering with the legitimate program [7]. A provable correlation between vendors and their software products needs to be maintained for the software to be considered as trustworthy as the vendor. In research by Harn, Lin and Yang, a similar signature scheme is proposed, regarding which the authors state, “Without proper authentication one can never be sure that a received program is indeed from the original licensed vendor and therefore it has no way to resolve the dispute, if any, between a software vendor and a user.” [11] Commercial software vendors are certainly not the only beneficiaries of tamper-proof software. Open-source software distributors might have central repositories of signatures for common binaries, but since it is frequently the case that open-source software is tweaked and recompiled to individual users’ tastes and computers, it should be possible for the end user to add signatures for any programs they customize.

The proposed system requires a whitelist to be stored on each computer to be protected. The whitelist is consulted every time there is a question concerning the authenticity of an executable. The web and FTP sites of many software vendors already make MD5 hashes and digital signatures available for their products, so the amount of extra work required by users of these applications to confirm that these products originated from the proper vendor is limited. Vendors not currently providing proof of authenticity would need to make hashes or digital signatures available to users of this system in order for their software to run on protected computers. Alternatively, end users can generate their own hashes and assume that they have reliably received the software.

However, for this solution to be most effective, updating the signature database on the protected computer should not be trivial, nor should it happen frequently, since having an easily modified whitelist increases the risk of tampering.

Whitelist Architectures

At least four different software models of a whitelisting system can be imagined, while one hardware model will be discussed. The four software models pair an authenticator whose database resides on either trusted or untrusted media with either a trusted or an untrusted operating system.

Software Authenticator on Untrusted Media with an Untrusted OS

The simplest of the software models, this model assumes that the operating system will be reliable and that it will not be compromised. The whitelist itself will reside on an unprotected medium, such that malicious processes could potentially modify or delete it. Therefore, while a whitelist under this model will provide some security, it is not difficult to bypass and therefore is the least desirable of the four models. However, this model (and the next, for that matter) has the benefit of a whitelist that is extremely easy to update. A typical installation of Tripwire falls under this category.

Software Authenticator on Untrusted Media with a Trusted OS

Only moderately better than the previous model, this scenario’s whitelist is still not protected from compromise by a dedicated attacker. Since the authenticator is guaranteed that the operating system will not misbehave, the operating system itself poses no threat to the whitelist and will always use the whitelisting system when appropriate. However, processes outside of the operating system may still change the signatures, which could lead to unauthorized code being executed.

Software Authenticator on Trusted Media with an Untrusted OS

When the whitelist is put on a read-only medium such as a compact disc or a write-protected floppy disk, the likelihood of the signatures becoming compromised is nearly eliminated. If these signatures will only be updated periodically, it may be practical for a replacement floppy disk or CD-R to be written separately, possibly on another computer. There are two approaches an attacker could use to bypass the security of an authenticator of this sort. First, the attacker could attempt to modify the operating system in order to bypass the runtime check, since the authenticator assumes that the operating system will not be compromised. The other method would involve the attacker modifying the in-memory copy of the signatures after they are loaded from the trusted media but before they are validated. This is the model that I will explore and implement in this paper, since existing operating systems cannot be completely trusted, and using trusted media is quite easy.

Software Authenticator on Trusted Media with a Trusted OS

In this, the ideal of the four software scenarios, the operating system can be trusted and even legitimate processes may not modify the whitelist, since it resides on a secure medium. Unfortunately, there are currently no consumer-level operating systems that can be completely trusted to avoid exploitation.


Hardware Authenticator with Untrusted OS

This is the ideal solution to the authentication of processes, but is much more difficult to implement. This solution requires a separate hardware device to be developed that must be installed on every system requiring protection. This type of authenticator is described in “Future Directions” below.

Authenticating Running Processes

In the methods previously described, we assume that a program is checked only at the time it is executed. There is also value in authenticating the process at regular intervals during its execution, since a previously trusted running process could be compromised and therefore would no longer be trustworthy. A relatively common attack of this sort is the buffer overflow, where a process is made to execute instructions not originally found in the executable by using vulnerabilities in the program. If the signature of the process is checked periodically, modifications could be detected while it is running. The authenticator could then kill the program and hopefully stop any damage. It should be noted that a malicious program will still have a certain amount of time for destructive behavior between checks, since the authenticator is not validating the program all of the time.

This reactive system is not as effective as one that protects running processes by proactively preventing write access to any executable page in memory at the operating system level. As of version 3.3 of OpenBSD this is precisely the case, in that there are “no pages that are both [writable] and executable.” [8] This technology in OpenBSD is referred to as “W^X” (read “W xor X”), which prevents a memory page from having both write permission and execute permission set on it at the same time. The executable portions of running processes are thus immune to overwriting. While it may be redundant to check hashes of running processes with these protection systems already in place, authenticating running processes within the OpenBSD operating system is a good stepping-stone towards autonomous hardware-based authentication systems. Verifying hashes of running processes also has merit in operating systems which lack the page-level write protection of OpenBSD, such as Linux or Windows.
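The periodic check described in this section can be modeled in a few lines of Python. This is only a toy model under stated assumptions: the `MonitoredProcess` class is a hypothetical stand-in whose `text` attribute plays the role of a process’s text segment, and `check_processes` represents one pass of the authenticator. A real authenticator would have to read process memory through operating system interfaces, which this sketch does not attempt.

```python
import hashlib


class MonitoredProcess:
    """Toy stand-in for a running process; `text` models its text segment."""

    def __init__(self, name, text):
        self.name = name
        self.text = bytearray(text)
        self.alive = True


def check_processes(processes, whitelist):
    """One pass of the periodic check.

    Any live process whose text no longer hashes to the whitelisted value
    (or which has no whitelist entry at all) is halted. Returns the names
    of the processes halted during this pass.
    """
    halted = []
    for proc in processes:
        if not proc.alive:
            continue
        digest = hashlib.md5(bytes(proc.text)).hexdigest()
        if whitelist.get(proc.name) != digest:
            proc.alive = False  # models killing the process
            halted.append(proc.name)
    return halted


# Trusted base: hash of the unmodified text segment (made-up bytes).
daemon = MonitoredProcess("daemon", b"\x55\x89\xe5\x83\xec\x10\xc9\xc3")
whitelist = {"daemon": hashlib.md5(bytes(daemon.text)).hexdigest()}

first_pass = check_processes([daemon], whitelist)

# Between two checks, an attacker overwrites part of the code in memory.
daemon.text[0:4] = b"\xcc\xcc\xcc\xcc"
second_pass = check_processes([daemon], whitelist)

print(first_pass, second_pass, daemon.alive)
```

The modification is caught only on the pass after it happens, which reflects the window of opportunity between checks noted above.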

Practical Issues with Whitelisting

Though a system using whitelists can enjoy considerable security benefits, there are several drawbacks and unresolved issues with which to contend. By simply making the signature database unavailable, a denial of service attack can be performed on any of the whitelist implementations previously discussed.

Another problematic behavior of current antivirus software, especially heuristic scanners, is that the software may detect certain executables as hostile, when in fact they were installed or downloaded intentionally. Sometimes legitimate security tools, like port scanners that may moonlight as hacker tools, are detected and quarantined by default. This can cause headaches for users, especially when using enterprise antivirus software that requires the network administrator’s privileges to disable. Other software, such as distributed computation software like distributed.net, has been detected as suspicious in the past [26].

Antivirus software, like most other user-level software, will stop doing what it was programmed to do if it is halted. This makes the antivirus software “fail open”, which means that the security offered by the system is nullified when the system is turned off. Viruses can kill or deactivate the antivirus software and go about their malicious business without being caught by the scanner. The recent viruses Bagle.M and Lirva both do this, targeting several popular antivirus packages [17, 2]. This behavior is not desirable for a truly secure system. The more secure system that “fails closed” would render the entire computer inoperable in the event that the operating system could not successfully query the authenticator.

Hardware contention could also cause significant slowdown on the machine if signatures were located on a device that was slow or very busy. Placing the whitelist on a dedicated device can mitigate some of the associated slowdown, but the device could remain a performance bottleneck if the system is under an overall heavy load.

When verifying hashes of running processes, not all of the text segment pages for a given process may be available in memory at the time of the check. The authenticator may only need to use a small portion of the process’ text segment to verify the integrity of a process. The verification system may not want to trigger page faults for every process it checks, since performance can suffer. To do this partial checking, techniques like Merkle Hash Trees can be used to validate small blocks of an entire file when only a subset of those blocks is immediately available in memory [5].

These systems of authenticating processes through whitelists do not remedy the case where an attacker hijacks a legitimate user’s account and uses legitimate programs (that are part of the trusted base) on the computer to perform malicious tasks. The computer is not protected against these types of attacks; it is only protected against a subset of possible attacks involving modifying the computer’s executable programs. Further levels of access control and security are still necessary.
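The partial checking with Merkle hash trees mentioned above can be sketched as follows. This is a simplified Python illustration, not the construction from [5]: the function names are invented for this example, MD5 is used only because it is the hash used elsewhere in this paper, and an odd node at any level is simply paired with a copy of itself, a shortcut that production designs avoid. Only the root hash needs to be stored in the whitelist; a single resident page can then be verified using the hashes of its siblings along the path to the root.

```python
import hashlib


def _h(data):
    return hashlib.md5(data).digest()


def build_levels(blocks):
    """Return every level of the tree, leaf hashes first, root level last."""
    levels = [[_h(b) for b in blocks]]
    while len(levels[-1]) > 1:
        cur = levels[-1]
        if len(cur) % 2:  # odd count: pair the last node with itself
            cur = cur + [cur[-1]]
            levels[-1] = cur
        levels.append([_h(cur[i] + cur[i + 1]) for i in range(0, len(cur), 2)])
    return levels


def proof_for(levels, index):
    """Collect (sibling_hash, sibling_is_left) pairs from leaf to root."""
    proof = []
    for level in levels[:-1]:
        sibling = index ^ 1
        proof.append((level[sibling], sibling < index))
        index //= 2
    return proof


def verify_block(block, proof, root):
    """Recompute the root from one block and its proof; compare to the whitelist root."""
    digest = _h(block)
    for sibling, sibling_is_left in proof:
        digest = _h(sibling + digest) if sibling_is_left else _h(digest + sibling)
    return digest == root


# Five made-up "pages" of a text segment.
pages = [bytes([i]) * 16 for i in range(5)]
levels = build_levels(pages)
root = levels[-1][0]

# Verify page 4 alone, as if it were the only page resident in memory.
proof = proof_for(levels, 4)
print(verify_block(pages[4], proof, root))
print(verify_block(b"tampered page...", proof, root))
```

The proof for one page contains one hash per tree level, so its size grows with the logarithm of the number of pages rather than with the size of the whole text segment.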


Implementation

I decided to implement two whitelist systems as proofs-of-concept. The first system resides inside the operating system and verifies MD5 hashes of programs on disk before they are allowed to execute. The second system runs as a user process with root privileges and checks running processes to make sure they match the MD5 hashes in a separate whitelist.

The operating system that would benefit the most from virus and Trojan protection is Windows, since roughly 99 percent of known viruses target the Windows operating system [14]. Unfortunately, as Windows is a closed-source operating system, modifying the way the operating system handles certain system calls is virtually impossible. It is possible to write a program that “hooks” into the relevant system calls, bypassing or adding to existing operating system functionality, though a crafty attacker can bypass these restrictions. With these drawbacks in mind, I considered three popular open-source candidates for receiving this software authentication system: Linux, FreeBSD and OpenBSD.

Besides my own familiarity with the operating system, the feature that made OpenBSD a better choice over the others was its in-kernel support for cryptographic algorithms like MD5 and SHA-1, which are used frequently by my implementations to authenticate programs on disk and in memory. The OpenBSD operating system also contains support for hardware cryptographic accelerators, which can take over cryptography-related calculations for the operating system when necessary. A hardware accelerator would significantly reduce the time required for checking hashes or digital signatures, and may prove helpful to future versions of this system.


MD5 Hashing

This system simply compares the MD5 hash of a program on disk to a previously generated hash stored in a secure database to determine whether or not that program has been compromised or replaced. There is an existing implementation by Michael A. Williams of NetXsecure NZ Limited which patches some older OpenBSD and FreeBSD operating systems to offer precisely this scheme [27]. I used a modified version of this patch to provide support for simple binary whitelisting on a machine with OpenBSD 3.5 installed. Williams’ implementation can use either the SHA-1 or MD5 hash algorithm on the binaries and also includes a UNIX shell script that generates the database of signatures, so only a trivial change would be required for my implementation to use SHA-1 hashes instead of MD5.

Once the code was modified to work properly with the 3.5 kernel, I created an empty directory at /etc/sigs and mounted a separate hard drive at that point. I ran the included signature-generating shell script to create the trusted base as a hierarchy of files in that directory which corresponded to the locations of uncorrupted binaries on the system (see Figure 1). Once the whitelist was created, the disk was remounted as read-only to avoid any corruption of the signatures.

In the borrowed implementation, the kernel’s securelevel, an operating system variable, must be raised to 2 in order to enable the whitelisting system [18]. I left this in the derived implementation for the purposes of debugging. Since OpenBSD sets the securelevel to 1 in the regular boot sequence, if any problems were encountered with the implementation, the computer could be rebooted to return it to an unprotected state. With the securelevel set to 2, when an unauthorized or modified program is run, the user (even as the superuser!) gets the same message as if the user did not have sufficient file system permissions to execute the file in question. The computer is now safe from viruses and attackers that modify executable files.
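The check described in this section can be sketched in userland as follows. This is only an illustration of the lookup-and-compare logic, not the kernel patch itself. It assumes that each file under /etc/sigs holds the hexadecimal MD5 digest of the binary at the corresponding path, which is a guess at the on-disk format, and the function names are hypothetical.

```python
import hashlib
import os


def signature_path(binary_path, sig_root="/etc/sigs"):
    # /bin/ls maps to <sig_root>/bin/ls, mirroring the file system layout.
    return os.path.join(sig_root, os.path.abspath(binary_path).lstrip(os.sep))


def file_md5(path, chunk_size=65536):
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def may_execute(binary_path, sig_root="/etc/sigs"):
    """Return True only if the binary's MD5 hash matches its whitelist entry."""
    try:
        with open(signature_path(binary_path, sig_root)) as f:
            expected = f.read().strip().lower()  # assumed format: hex digest text
        return file_md5(binary_path) == expected
    except OSError:
        # Fail closed: a missing or unreadable entry means the program is refused.
        return False
```

Refusing execution when the whitelist entry cannot be read reflects the “fails closed” behavior discussed earlier; a real implementation would make this decision inside the kernel’s exec path rather than in a separate program.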


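Figure 1: Organization of Signatures

[Figure: two directory trees. The file system tree contains /bin/ls and /bin/sh. The signature tree under /etc/sigs mirrors it: /etc/sigs/bin/ls holds the MD5 hash of /bin/ls, and /etc/sigs/bin/sh holds the MD5 hash of /bin/sh.]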

Authenticating Running Processes

The second component of this proof-of-concept is the ability to check the cryptographic hashes of processes that are already running. This required isolating the text segment of the executables (as the data segments in running processes may frequently change) and generating a new set of hashes and signatures. I placed these signatures in /etc/sigs/running to distinguish them from the on-disk binary hashes used previously.

The Executable and Linkable Format (ELF) specification, as described in the OpenBSD ELF manual page, was used to locate the text segments of binaries on disk. The header information of an executable describes precisely where the text segment can be found within the file. I created a utility named elfText to extract just the text segment of the executable file. A modification of the previously described signature tree-generating shell script fed the output of elfText through MD5 and stored the result in the appropriate location for every executable. The source code for elfText may be found on the included CD.

Extracting the text segment from running processes required both OpenBSD’s procmap command and its support for the /proc file system, which give information about and direct memory access to the processes. After running the procmap command on every running process on the test system, I determined that OpenBSD places the text segment of processes at virtual memory address 0x1C000000. Once I had established that, it was just a matter of opening the process’ memory space via /proc, determining the text segment’s size from the binary’s header information, and retrieving the block of data. Unsurprisingly, passing this data to MD5 yielded the same result as the hash of the text segment on disk. I modified elfText to retrieve this information from a specified process ID and called the resulting program procText. This program’s source code is also available on the included CD.

I created a userland (that is, outside of the kernel) shell script named bouncer that extracts the text segment from a running process and verifies that the running signature matches the static signature generated earlier. This script requires root privileges to do its job, of course. Bouncer checks the MD5 hashes of every running process on the computer, then sleeps for 10 seconds. If the process’ hash is not valid or the whitelist is unavailable, the process is killed to prevent it from potentially damaging the system.

A difficult decision had to be made regarding the frequency of checking the running programs. Checking each process once a second would slow the machine down an undesirable amount, but the less frequently the processes are checked, the greater the chance that a compromised program has an opportunity to do something malicious. The (admittedly slow) test machine became bogged down with even a five second wait time, so I decided that a ten second window would suffice.

Table 2: Speed Tests

Test                          Authentication Used           Time Elapsed
whitelist.speed               No authentication             4.814 sec
whitelist.speed               With MD5 hash verification    5.456 sec
Kernel compilation: “make”    No authentication             7547.88 sec
Kernel compilation: “make”    With MD5 hash verification    8629.58 sec
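The on-disk half of the text segment hashing described in this section can be sketched as follows. This is not the source of elfText, which is on the included CD. It assumes a 32-bit little-endian ELF image and treats the first loadable program segment marked executable as the text segment; that choice, and the function names, are assumptions made for illustration. Reading the same bytes from a live process, as procText does, is not reproduced here.

```python
import hashlib
import struct

# ELF32 little-endian layouts: 52-byte file header, 32-byte program header.
ELF_HEADER = struct.Struct("<16sHHIIIIIHHHHHH")
PROGRAM_HEADER = struct.Struct("<IIIIIIII")
PT_LOAD = 1  # loadable segment
PF_X = 1     # segment is executable


def text_segment(image: bytes):
    """Return (virtual address, bytes) of the first executable PT_LOAD segment."""
    if len(image) < ELF_HEADER.size:
        raise ValueError("image is too short to hold an ELF header")
    fields = ELF_HEADER.unpack_from(image, 0)
    ident = fields[0]
    if ident[:4] != b"\x7fELF" or ident[4] != 1 or ident[5] != 1:
        raise ValueError("expected a 32-bit little-endian ELF image")
    e_phoff, e_phentsize, e_phnum = fields[5], fields[9], fields[10]
    for i in range(e_phnum):
        (p_type, p_offset, p_vaddr, _paddr,
         p_filesz, _memsz, p_flags, _align) = PROGRAM_HEADER.unpack_from(
            image, e_phoff + i * e_phentsize)
        if p_type == PT_LOAD and p_flags & PF_X:
            return p_vaddr, image[p_offset:p_offset + p_filesz]
    raise ValueError("no executable PT_LOAD segment found")


def text_md5(image: bytes) -> str:
    """MD5 of the text segment only, so changes to data do not alter the hash."""
    _vaddr, text = text_segment(image)
    return hashlib.md5(text).hexdigest()
```

Because only the executable segment is hashed, a change to a program’s writable data leaves the digest untouched, which is the property that lets the same whitelist entry be compared against a running process.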

Performance Data

While the hash-checking implementation worked acceptably in terms of speed on my test machine, a Pentium 90 with 16 megabytes of memory, there was a noticeable slowdown in program initialization times. This speed differential was measured using a script (available on the included CD) called whitelist.speed, which invoked 11 different (uncompromised) programs. The script was run five times with the whitelisting system disabled, and five times with the whitelisting system enabled. The average of these running times can be found in Table 2. As an additional speed test, a complete compilation (just the “make” step) of the OpenBSD 3.5 kernel was done once with the MD5 whitelisting enabled, and once with the system disabled. These results are also shown in Table 2.
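The relative overhead implied by the times in Table 2 can be computed directly. The short sketch below only restates the measured values from the table; the function and variable names are hypothetical.

```python
def overhead_percent(baseline: float, protected: float) -> float:
    """Percentage increase in running time caused by hash verification."""
    return (protected - baseline) / baseline * 100.0


# (time without authentication, time with MD5 hash verification), in seconds
TABLE_2 = {
    "whitelist.speed": (4.814, 5.456),
    'kernel compilation ("make")': (7547.88, 8629.58),
}

for test, (baseline, protected) in TABLE_2.items():
    extra = protected - baseline
    print(f"{test}: {extra:.3f} sec slower, "
          f"{overhead_percent(baseline, protected):.2f}% increase")
```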

Conclusions

The MD5 hash verification system caused a noticeable increase in the running time of the test cases. Using whitelist.speed, I recorded a 13.34% increase in execution time on average, though it totaled a mere three-fifths of a second slowdown. The kernel compilation slowdown is close to this same percentage, with a 14.33% increase in execution time. This seems to be a fairly acceptable slowdown given the extra security the in-kernel verification provides, and while the system might introduce a noticeable latency to the end user, I do not feel that it is unreasonable.

Bouncer, the program which checks the text segments of running processes, slows down the system an unacceptable amount unless the time between its checks is fairly large. This might be remedied somewhat by developing a version of bouncer in the C programming language instead of shell scripts, so that the overhead of program execution and I/O redirection could be removed. Building it into the kernel and checking hashes between context switches might also improve bouncer.

The benefit of these systems over traditional antivirus solutions is that they will catch new malware that is introduced into an operating system as an unlisted or modified executable, without first needing a pattern for it. The operating system-based solution would likely match or beat existing virus scanning systems in terms of speed and effectiveness. The primary drawback is found in the maintenance of the databases. If an existing program is updated, or if a new program is installed, the trusted base must be revised with the hashes of the new software. The average user will probably not be prepared to maintain the security of this system if it requires complex upkeep, as it does currently.


Future Directions

In the course of a day, a typical user of a computer system may have to authenticate himself to multiple services. Those services do not assume that, since the user was authenticated elsewhere, he must have the right to whatever he is requesting. However, most operating systems, like Windows and UNIX, assume that once a user has authenticated to the system, all processes invoked with his credentials should have the same amount of access as the user himself. The Principle of Least Privilege requires that a process should only be given the minimum amount of access required to perform its task [9]. If each individual process were required to authenticate itself to the operating system in order to justify its use of the user’s privileges, the Principle would be upheld.

DSA Signatures

The next logical step for the above implementations would be to add support for digital signature verification to the authenticators, which would much more closely parallel the authenticators proposed in [7]. The Digital Signature Algorithm (DSA) is a likely candidate for this implementation, as source code is readily available, and it is a government standard.

Once this is added, we could expand our database in /etc/sigs to include the subdirectory public_keys, where we could find a key ring of public keys of the entities signing the system’s programs. Any binary that was signed by an entity whose public key could not be found in this directory would not be allowed to run. This way, we would have the ability to add programs created by these trusted parties at a later time.

Another record for each executable would be added to a whitelist residing in /etc/sigs/dsa. Thus, the DSA signature for /bin/sh could be found at /etc/sigs/dsa/bin/sh. To determine which entity in the public_keys directory signed the binary, yet another file could be created for each program. We could take the original DSA signature filename and append “.signer”, so the signer for /bin/sh could be found at /etc/sigs/dsa/bin/sh.signer.

For a simple implementation, a single signer named “local” could be created with a corresponding key pair that signs every binary on the OpenBSD machine. This single signer’s public key would be placed in /etc/sigs/public_keys/local and the content of every *.signer file would be “local”. After installing this particular kernel modification, we could be sure that the programs on our system have not been tampered with since the corresponding entity signed them.
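The proposed directory layout lends itself to a simple lookup procedure, sketched below. Since this extension has not been implemented, everything here is an assumption beyond the paths named above: the function names are hypothetical, and the actual DSA check is left to a caller-supplied `verify(public_key, program, signature)` callable rather than tied to any particular cryptographic library.

```python
import os


def signature_files(binary_path, sig_root="/etc/sigs"):
    """Map /bin/sh to <sig_root>/dsa/bin/sh and <sig_root>/dsa/bin/sh.signer."""
    relative = os.path.abspath(binary_path).lstrip(os.sep)
    signature = os.path.join(sig_root, "dsa", relative)
    return signature, signature + ".signer"


def _read(path):
    with open(path, "rb") as f:
        return f.read()


def signed_binary_allowed(binary_path, verify, sig_root="/etc/sigs"):
    """Allow execution only if a known signer's key validates the signature."""
    signature_file, signer_file = signature_files(binary_path, sig_root)
    try:
        signer = _read(signer_file).decode("ascii").strip()
        # The signer name must be a plain file name inside public_keys.
        if not signer or signer in (".", "..") or "/" in signer or os.sep in signer:
            return False
        public_key = _read(os.path.join(sig_root, "public_keys", signer))
        signature = _read(signature_file)
        program = _read(binary_path)
    except (OSError, UnicodeDecodeError):
        # Unknown signer, missing signature, or unreadable binary: refuse to run.
        return False
    return bool(verify(public_key, program, signature))
```

With the single-signer arrangement, every *.signer file would contain “local” and the lookup would always resolve to /etc/sigs/public_keys/local; adding a second trusted party would only require placing its public key in the same directory.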

Hardware Implementations

The section above entitled “Hardware Authenticator with Untrusted OS” describes an authentication system that bypasses the operating system entirely. Since it is conceivable that the entire operating system could be altered or replaced to bypass the computer’s security measures, the authenticator should not rely on the operating system. A common method of bypassing operating system security is to simply reboot the computer with a bootable floppy or CD-ROM in the drive, and load an alternate operating system like Knoppix [12]. Once the new operating system has loaded, the security measures provided by the operating system on the hard disk are no longer in place. A malicious user equipped with a bootable alternate operating system has carte blanche with the existing data on the computer’s hard drives.

Similarly, rootkits can be installed to modify the behavior of the operating system, usually to the users’ detriment [21]. The effects of the rootkit may blind user programs, but the components of the rootkit and of the operating system still reside in memory, and can therefore be validated. The hardware authenticator should have complete access to the host system’s memory, so that it can verify any running process or the operating system itself. This method, like the in-kernel authentication schemes discussed earlier, requires that a trusted baseline be established before the authenticator can do its job. If the hashes of our trusted base do not match what the hardware authenticator finds, the computer can be shut down completely. That said, we would not necessarily authenticate the kernel’s boot up or the execution of other programs. Instead, this hardware authenticator would act similarly to bouncer in that it would peek into memory periodically to ensure the proper operation of the computer.

The 1394 standard, also known as Firewire, much like its cousin Universal Serial Bus (USB), is typically used by peripheral devices for video and audio capture or for fast data storage. Firewire offers a connected device direct access to a host’s memory, which is currently used in FreeBSD to debug the operating system of a computer that has crashed and has no functioning operating system [10]. On a host with 256 megabytes of memory, the entire memory image can be captured in a matter of seconds, since Firewire supports data transfer rates of up to 400 megabits per second. One hardware device named Tribble already uses comparable direct memory access over the PCI bus to take a forensic snapshot of a system [4].

This also means that a malicious Firewire device could replace the entire contents of a system’s memory within seconds, regardless of the operating system. It could replace specific parts of programs or memory-resident data. A beneficial Firewire device could similarly read any portion of the host’s memory in order to “keep an eye on things” and verify the integrity of the system.

Researchers have explored the possibilities for whitelist-based security systems in the past, but much of what is possible has not yet been implemented. Tamperproof hardware-based authenticators show the most promise, as they do not require any interaction with an operating system that might not otherwise cooperate with security software. With existing software, a minimal amount of hardware, and a bit of creative programming, these proposed systems could easily be developed to protect computer systems more effectively than current solutions.


References

[1] Apreville, Axelle, Makan Pourzandi, David Gordon, and Vincent Roy, “Stop Executable Code Execution at Kernel-Level,” Linux World vol. 2, no. 1 (Jan. 2004).
[2] “Avril, Lirva, Naith: A worm by any other name is still a worm,” Apr. 2005; http://antivirus.about.com/library/weekly/aa010803a.htm
[3] “bsign FTP Site,” Apr. 2005; ftp://ftp.buici.com/pub/bsign/
[4] Carrier, Brian D. and Joe Grand, “A Hardware-Based Memory Acquisition Procedure for Digital Investigations,” Digital Investigation, vol. 1, 2004, pp. 50-60.
[5] Chapweske, J. and G. Mohr, “Tree Hash EXchange Format (THEX),” Mar. 2003; http://www.open-content.net/specs/draft-jchapweske-thex-02.html
[6] “Clam Antivirus,” Apr. 2005; http://clamav.net/
[7] Davida, George, Yvo Desmedt, and Brian Matt, “Defending Against Viruses Through Cryptographic Authentication,” Proc. IEEE 1989 Comp. Soc. Symp. Security and Privacy, pp. 312-318.
[8] de Raadt, Theo, “i386 W^X,” OpenBSD-Misc Mailing List, Apr. 2003; http://www.sigmasoft.com/~openbsd/archive/openbsd-misc/200304/msg01127.html
[9] Ferraiolo, David and Richard Kuhn, “Role-Based Access Controls,” Proc. 15th Nat’l Comp. Security Conf., vol. II, 1992, pp. 554-563.
[10] “FreeBSD Firewire Kernel Debugging,” Apr. 2005; http://kerneltrap.org/node/145/
[11] Harn, Lein, Hung-Yu Lin, and Shoubao Yang, “A Software Authentication System for the Prevention of Computer Viruses,” Proc. 1992 ACM Ann. Conf. Communications, pp. 447-450.
[12] “KNOPPIX Homepage,” Apr. 2005; http://www.knoppix.com/
[13] “md5deep Homepage,” Apr. 2005; http://md5deep.sourceforge.net/
[14] Melia, Michael, “Computer Worms: How Schools Are Fighting a New Type of Virus,” Nov. 2003; http://www.pbs.org/newshour/extra/features/july-dec03/virus_11-24.html
[15] Naor, Moni and Moti Yung, “Universal One-Way Hash Functions and their Cryptographic Applications,” Proc. 21st Ann. ACM Symp. Theory of Comp. (STOC 1989), http://citeseer.ist.psu.edu/naor89universal.html
[16] “The National Software Reference Library (NSRL) Project,” Apr. 2005; http://www.nsrl.nist.gov/
[17] “New Bagle variant disables antivirus programs,” Apr. 2004; http://smallbusiness.itworld.com/4385/040315newbagel/page_1.html
[18] “OpenBSD Manual Pages: securelevel,” Apr. 2004; http://www.openbsd.org/cgi-bin/man.cgi?query=securelevel
[19] “Open Relay Database,” Apr. 2005; http://ordb.org/
[20] Polk, W.T. and L.E. Bassham, “A Guide to the Selection of Anti-Virus Tools and Techniques,” NIST CSRC SP 800-5, Dec. 1992; http://csrc.nist.gov/publications/nistpubs/
[21] “ROOTKIT: The Online Reverse Engineering Magazine,” Apr. 2005; http://www.rootkit.com/
[22] Seltzer, Larry J., “Why Your Antivirus Program Won’t Catch the Next Attack,” Jun. 2004; http://www.pcmag.com/article2/0,1759,1586106,00.asp
[23] “The Tripwire Open Source Project,” Apr. 2005; http://tripwire.org/
[24] “Virus Bulletin Prevalence List,” Jan. 2005; http://www.virusbtn.com/resources/malwareDirectory/prevalence/index.xml?200501
[25] Weaver, Nicholas, “Potential Strategies for High Speed Active Worms: A Worst Case Analysis,” Mar. 2002; http://www.cs.berkeley.edu/~nweaver/worms.pdf
[26] “What is Distributed.net?” Apr. 2003; http://service1.symantec.com/SUPPORT/nav.nsf/docid/2000101607584306
[27] Williams, Michael A., “Anti-Trojan and Trojan Detection with In-Kernel Digital Signature Testing of Executables,” Apr. 2002; http://trojanproof.org/sigexec.pdf
