SECDEL: Sanitization Tool for Deleting Sensitive ...

ICENCO 2006

SECDEL: Sanitization Tool for Deleting Sensitive Information

Tarek S. Sobh
Egyptian Armed Forces, Cairo, Egypt
E-mail: [email protected]

Abstract- Most computer users believe that they eradicate the data in a file when they delete it from storage media such as hard or floppy disks. They would be astonished to learn that most of the commonly used methods of deletion (e.g. delete, erase, and even format) do not physically remove data from these storage devices, and that this data can easily be recovered using readily available forensics software. This paper shows how to securely delete a file so that current software tools cannot recover it. It also touches on more advanced techniques, beyond the means of most end users, that can recover even the most securely deleted files, which shows just how difficult it can be to remove data without leaving a trace of it behind. Based on this information, this work presents a secure data deletion tool called SECDEL. Using this tool you can securely delete files from magnetic storage media. It combines the most powerful secure deletion algorithms in a single tool.

1. Introduction

Be it corporate or home security, private data is something that should remain just that, and few have any wish for “old faithful” PCs to give up data they thought securely deleted [8][16].

FAT stands for File Allocation Table. In FAT the disk is divided into clusters, the unit used for file allocation, and the FAT describes which clusters are used by which files. The FAT12/16/32 types are important because they are useful for data exchange between different operating systems, and because FAT is the file system type used by the majority of devices [16]. When a file is saved, its name, size and location are stored in the FAT. When the file is deleted, no data is touched; only the entry in the FAT is removed, which tells the OS that it can reuse that disk space.
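The effect of removing only the table entry can be illustrated with a deliberately simplified model. The sketch below is not a real FAT implementation: the class ToyFatVolume, its methods and the file name are hypothetical and exist only for this illustration. It shows that after a normal delete the cluster contents are still present for anything that reads the raw medium.

```python
class ToyFatVolume:
    """Toy model of a FAT-like volume (hypothetical, for illustration only)."""

    def __init__(self, cluster_count, cluster_size=4):
        self.cluster_size = cluster_size
        self.clusters = [bytes(cluster_size)] * cluster_count  # raw on-disk data
        self.table = {}  # file name -> list of cluster indices used by the file

    def _free_clusters(self):
        used = {index for chain in self.table.values() for index in chain}
        return [i for i in range(len(self.clusters)) if i not in used]

    def write_file(self, name, data):
        size = self.cluster_size
        chunks = [data[i:i + size] for i in range(0, len(data), size)]
        free = self._free_clusters()
        if len(chunks) > len(free):
            raise OSError("volume full")
        chain = free[:len(chunks)]
        for index, chunk in zip(chain, chunks):
            self.clusters[index] = chunk.ljust(size, b"\x00")
        self.table[name] = chain

    def delete_file(self, name):
        # Normal deletion: only the table entry is removed, the data stays.
        del self.table[name]

    def raw_scan(self):
        # What a raw disk editor would see, ignoring the table entirely.
        return b"".join(self.clusters)


volume = ToyFatVolume(cluster_count=8)
volume.write_file("secret.txt", b"PASSWORD")
volume.delete_file("secret.txt")
leftover = volume.raw_scan()  # still contains the bytes of the deleted file
```

In this model the data disappears only when a later write_file call happens to reuse the same clusters, which mirrors the reuse flag described above.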
The Windows NT file system (NTFS) provides a combination of performance, reliability, and compatibility not found in the FAT file system [7]. Formatting a volume with NTFS creates several system files and the Master File Table (MFT), which contains information about all the files and folders on the NTFS volume (see http://www.ntfs.com). The MFT can be moved if there is a bad sector in its normal location. However, if the data is corrupted, the MFT cannot be located, and Windows NT/2000 assumes that the volume has not been formatted.

Magnetic force microscopy (MFM) is a recent technique for imaging magnetization patterns with high resolution and minimal sample preparation. The technique is derived from scanning probe microscopy (SPM) and uses a sharp magnetic tip attached to a flexible cantilever placed close to the surface of the hard disk or floppy disk to be analyzed, where it interacts with the stray field
emanating from the sample. An image of the field (the stored data) at the surface is formed by moving the tip across the surface and measuring the force as a function of position [3][5][6]. Data stored on magnetic media such as a hard disk or floppy disk may therefore be recovered outside the computer system.

Published research on methods of securing (or at least safely deleting) the original plaintext form of encrypted data against sophisticated new analysis techniques is difficult to find. By deliberately under-stating the requirements for media sanitization in publicly available guides, intelligence agencies can preserve their information-gathering capabilities while at the same time protecting their own data using classified techniques [1][2][14].

One feature of the Windows operating system is that it implements object reuse protection. This means that when an application allocates file space or virtual memory, it is unable to view data that was previously stored in the resources Windows allocates for it. Windows zero-fills memory and zeroes the sectors on disk where a file is placed before it presents either type of resource to an application. However, object reuse protection does not dictate that the space a file occupies be zeroed before the file is deleted. This is because Windows is designed with the assumption that the operating system controls access to system resources. When the operating system is not active, however, it is possible to use raw disk editors and recovery tools to view and recover data that the operating system has deallocated.

This work introduces a secure delete application called “SECDEL” to protect deleted files from recovery. It overwrites a deleted file's on-disk data using techniques that will be shown to make disk data unrecoverable, even using recovery technology that can read patterns in magnetic media that reveal weakly deleted files.
SECDEL is an application that you can use to securely delete both existing files and any file data that exists in the unallocated portions of a disk, including files that you have already deleted or encrypted.

This paper is organized as follows: Section 2 briefly discusses normal deletion methods and the risks they pose to the security of data in files, and then discusses methods for recovering data from deleted files. Section 3 discusses how secure deletion is achieved and the implementation details of this work. Section 4 shows the results of the performance measurements of this work. Section 5 introduces related work and compares it with the proposed tool. Section 6 is the conclusion.

2. Magnetic deletions and overwriting

In the Windows family, the default method of deletion is to send the deleted files to the recycle bin. This is insecure by design. Any file sent to the recycle
bin is simply moved to a hidden folder and kept there. Any person with access to the recycle bin can very easily restore the file. Advanced users may go to the recycle bin and delete files from there, or use the Shift+Delete key combination to bypass the recycle bin, thus rendering the files unrecoverable by the operating system. To most users it might seem that files deleted this way are secure from recovery and that their data is lost forever, but this is not the case. There are many software packages that can bypass the operating system, access magnetic storage devices directly, scan them for data, and recover entire files. Normal deletion marks the disk space occupied by the file as free so that new files can overwrite it, but if no such overwrite takes place, the file remains in storage and can therefore be recovered using recovery (forensics) programs.

Getting useful images of a particular track requires more than a passing knowledge of disk formats, and once the correct location on the platter is found, a single image takes approximately 2-10 minutes depending on the skill of the operator and the resolution required [10][13]. Faced with techniques such as MFM, truly deleting data from magnetic media is very difficult. The problem lies in the fact that when data is written to the medium, the write head sets the polarity of most, but not all, of the magnetic domains. This is partially due to the inability of the writing device to write in exactly the same location each time, and partially due to the variations in media sensitivity and field strength over time and among devices [10][13]. Overwriting data once is usually not good enough to prevent data recovery. Instead, it is recommended that a minimum of three passes be made, writing alternating zero and one patterns over the data, followed by further passes with random data. The more passes are made, the better the chance that no data can ever be recovered.
The approach recommended by Peter Gutmann, author of “Secure Deletion of Data from Magnetic and Solid-State Memory”, is a total of thirty-five passes. It is said that this method is the minimum that will guarantee that every spot on the hard drive is written to, and that it can even prevent what are known as “ghosts” on the hard drive. Ghosts are residual traces of data, caused by the fact that like charges used to write the binary data to the hard disk repel one another. This means that (assuming 0 is a “-” charge and 1 is a “+” charge) when alternating positive and negative charges are written next to one another, the width of those bands is greater than if a band had been written next to another positive band, because opposites attract [5]. From this, computer forensic specialists can attempt to recover some data on the drive despite it having been overwritten multiple times.

A more secure deletion method is to overwrite the data stored on the disk. This ensures that even if the disk space previously occupied by the file is not used by
another file, the data that is stored on the disk is not the same as the data that was intended to be deleted [13]. There are many ways to overwrite data in a file so that recovering it becomes very difficult. In this work we use a combination of existing overwriting algorithms:

1) overwriting the data with zeroes or ones
2) overwriting the data with random characters
3) overwriting the data with special patterns

Note that these algorithms are not suitable for top-secret information if used alone; a combination of all of them has to be used. The general concept behind an overwriting scheme is to flip each magnetic domain on the disk back and forth as much as possible (this is the basic idea behind degaussing) without writing the same pattern twice in a row.

2.1 Overwriting the data with zeroes or ones

The simplest overwriting algorithm is to overwrite each addressable character with a zero or a one. If the data were encoded directly, we could simply choose the desired overwrite pattern of ones and zeroes and write it repeatedly. However, disks generally use some form of run-length limited (RLL) encoding, so that adjacent ones are not written. This encoding is used to ensure that transitions are not placed too closely together or too far apart, which would cause the drive to lose track of where it is in the data.

2.2 Overwriting the data with random characters

The US Department of Defense (DoD) algorithm, called the “DoD Cleaning and Sanitizing Matrix” [4], is one of the important data overwriting algorithms. The flowchart in Figure (1) shows the steps of its implementation in our SECDEL tool:

1) Start
2) Initialize random seed
3) Open file
4) Move file pointer to beginning of file
5) Write random character at current position
6) Write its complement at current file position
7) Increment file pointer
8) Is EOF? No: return to step 5. Yes: continue.
9) Close file
10) Stop

Figure (1) Flowchart for overwriting the data with random characters
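The steps of Figure (1) can be sketched in portable Python as follows. This is a minimal illustration under stated assumptions, not the SECDEL implementation: the function name overwrite_random_and_complement and its seed parameter are hypothetical, Python's general-purpose random module stands in for whatever generator the tool uses, and ordinary buffered file I/O replaces the Windows API calls described in Section 3.

```python
import os
import random


def overwrite_random_and_complement(path, seed=None):
    """Overwrite each byte of an existing file with a random byte, then its complement."""
    rng = random.Random(seed)            # "Initialize random seed" (not a secure RNG)
    size = os.path.getsize(path)
    with open(path, "r+b") as handle:    # "Open file" for update, without truncating it
        for position in range(size):     # loop until the end of the file is reached
            value = rng.randrange(256)
            handle.seek(position)
            handle.write(bytes([value]))          # random character
            handle.seek(position)
            handle.write(bytes([value ^ 0xFF]))   # its complement, at the same position
        handle.flush()
        os.fsync(handle.fileno())        # ask the OS to push the final data to the device
    return size
```

Because this sketch uses buffered writes, the intermediate random byte may never reach the medium and only the complement is certain to be written. A real tool would need unbuffered, write-through I/O for each write to count as a physical overwrite.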


2.3 Overwriting the data with special patterns

To erase magnetic media, we need to overwrite it many times with alternating patterns in order to expose it to a magnetic field oscillating fast enough that it does the desired flipping of the magnetic domains in a reasonable amount of time. Unfortunately, there is a complication in that we need to saturate the disk surface to the greatest depth possible, and very high frequency signals only "scratch the surface" of the magnetic medium. Disk drive manufacturers, in trying to achieve ever-higher densities, use the highest possible frequencies, whereas we really require the lowest frequency a disk drive can produce. Even this is still rather high. The best we can do is to use the lowest frequency possible for overwrites, to penetrate as deeply as possible into the recording medium. The write frequency also determines how effectively previous data can be overwritten, because the field needed to cause magnetic switching depends on the length of time the field is applied.

In order to understand the theory behind the choice of data patterns to write, it is necessary to take a brief look at the recording methods used in disk drives. The main limit on recording density is that as the bit density is increased, the peaks in the analog signal recorded on the media are read at a rate which may cause them to appear to overlap, creating intersymbol interference which leads to data errors. Traditional peak detector read channels try to reduce the possibility of intersymbol interference by coding data in such a way that the analog signal peaks are separated as far as possible. The read circuitry can then accurately detect the peaks. Since a long string of 0's makes clocking difficult, a limit also has to be set on the maximum number of consecutive 0's. The separation of peaks is implemented as some form of run-length limited, or RLL, coding.
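The run-length constraint can be made concrete with a small check. In the usual (d,k) notation, which Table (1) uses for (1,7) and (2,7) RLL, at least d and at most k zeros separate two consecutive ones in the encoded bit stream. The sketch below tests a string of encoded bits against that rule. The function name satisfies_rll is hypothetical, and only the runs between two ones are checked, so leading and trailing zeros are ignored.

```python
def satisfies_rll(bits, d, k):
    """Return True if every run of 0s between two 1s has a length from d to k."""
    ones = [index for index, bit in enumerate(bits) if bit == "1"]
    gaps = [later - earlier - 1 for earlier, later in zip(ones, ones[1:])]
    return all(d <= gap <= k for gap in gaps)


# Runs of 1, 2 and 3 zeros: allowed by (1,7) RLL, but not by (2,7) RLL.
ok_17 = satisfies_rll("1010010001", 1, 7)
ok_27 = satisfies_rll("1010010001", 2, 7)
```

Two adjacent ones (a run of zero zeros) violate the lower bound d, which corresponds to transitions placed too closely together, while a run of more than k zeros corresponds to transitions placed too far apart for reliable clocking.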
Peter Gutmann [10] introduced a set of 22 overwrite patterns that should erase everything, regardless of the raw encoding. The basic disk eraser can be improved slightly by adding random passes before and after the erase process, and by performing the deterministic passes in random order to make it more difficult to guess which of the known data passes were made at which point. To deal with all this in the overwrite process, we use the sequence of 35 consecutive writes shown in Table (1). Table (1) contains the 22 overwriting patterns, which are implemented in the proposed SECDEL tool as shown in Figure (2). The MFM-specific patterns are repeated twice because MFM drives have the lowest density and are thus particularly easy to examine. The deterministic patterns between the random writes are permuted before the write is performed, to make it more difficult for an opponent to use knowledge of the erasure data written to attempt to recover overwritten data.


1) Start
2) Initialize Gutmann’s patterns array
3) Open file in native mode
4) Get first Gutmann pattern from array
5) Move pointer to beginning of file
6) Write current Gutmann pattern
7) Increment file pointer
8) Is EOF? No: return to step 6. Yes: continue.
9) Is end of array? No: get next Gutmann pattern from array and return to step 5. Yes: continue.
10) Close file
11) Stop

Figure (2) Flowchart of Peter Gutmann patterns in SECDEL

If the device being written to supports caching or buffering of data, this should be disabled to ensure that physical disk writes are performed for each pass, instead of everything but the last pass being lost in the buffering. Another consideration which needs to be taken into account when trying to erase data through software is that drives conforming to some of the higher-level protocols, such as the various SCSI standards, are relatively free to interpret commands sent to them in whichever way they choose [9][10][13]. Thus some drives, if sent a FORMAT UNIT command, may return immediately without performing any action, may perform a read test on the entire disk, or may actually write data to the disk.

The number of passes specifies the number of times the deletion algorithm is performed on the file(s) to be deleted. The higher the number, the more secure the deletion is, but the longer it will take.
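The pass loop of Figure (2), together with the requirement that every pass be physically written, can be sketched as follows. This is an illustrative sketch under assumptions, not the SECDEL code: the function name overwrite_with_patterns and the chunk_size parameter are hypothetical, and os.fsync is used here as the nearest portable counterpart to disabling buffering. It cannot control a cache inside the drive itself.

```python
import os


def overwrite_with_patterns(path, patterns, chunk_size=64 * 1024):
    """Overwrite an existing file once per pattern, syncing after each pass."""
    size = os.path.getsize(path)
    with open(path, "r+b", buffering=0) as handle:    # no Python-level buffering
        for pattern in patterns:                       # one full pass per pattern
            handle.seek(0)                             # move to beginning of file
            written = 0
            while written < size:                      # until EOF
                length = min(chunk_size, size - written)
                # Repeat the pattern, keeping its phase aligned to the file offset.
                start = written % len(pattern)
                repeats = (start + length) // len(pattern) + 1
                block = (pattern * repeats)[start:start + length]
                written += handle.write(block)         # advance by the bytes written
            os.fsync(handle.fileno())                  # push this pass out before the next
    return len(patterns)
```

For example, overwrite_with_patterns(path, [b"\x55", b"\xAA", b"\x92\x49\x24"]) would perform three passes, the last one with a three-byte repeating pattern. Syncing after every pass is what keeps earlier passes from being merged away in a cache, at the cost of speed.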


Table (1) Overwrite data patterns [10]

Pass No.  Data Written                                   Encoding Scheme Targeted
1-4       Random
5         01010101 01010101 01010101  0x55               (1,7) RLL, MFM
6         10101010 10101010 10101010  0xAA               (1,7) RLL, MFM
7         10010010 01001001 00100100  0x92 0x49 0x24     (2,7) RLL, MFM
8         01001001 00100100 10010010  0x49 0x24 0x92     (2,7) RLL, MFM
9         00100100 10010010 01001001  0x24 0x92 0x49     (2,7) RLL, MFM
10        00000000 00000000 00000000  0x00               (1,7) RLL, (2,7) RLL
11        00010001 00010001 00010001  0x11               (1,7) RLL
12        00100010 00100010 00100010  0x22               (1,7) RLL
13        00110011 00110011 00110011  0x33               (1,7) RLL, (2,7) RLL
14        01000100 01000100 01000100  0x44               (1,7) RLL
15        01010101 01010101 01010101  0x55               (1,7) RLL, MFM
16        01100110 01100110 01100110  0x66               (1,7) RLL, (2,7) RLL
17        01110111 01110111 01110111  0x77               (1,7) RLL
18        10001000 10001000 10001000  0x88               (1,7) RLL
19        10011001 10011001 10011001  0x99               (1,7) RLL, (2,7) RLL
20        10101010 10101010 10101010  0xAA               (1,7) RLL, MFM
21        10111011 10111011 10111011  0xBB               (1,7) RLL
22        11001100 11001100 11001100  0xCC               (1,7) RLL, (2,7) RLL
23        11011101 11011101 11011101  0xDD               (1,7) RLL
24        11101110 11101110 11101110  0xEE               (1,7) RLL
25        11111111 11111111 11111111  0xFF               (1,7) RLL, (2,7) RLL
26        10010010 01001001 00100100  0x92 0x49 0x24     (2,7) RLL, MFM
27        01001001 00100100 10010010  0x49 0x24 0x92     (2,7) RLL, MFM
28        00100100 10010010 01001001  0x24 0x92 0x49     (2,7) RLL, MFM
29        01101101 10110110 11011011  0x6D 0xB6 0xDB     (2,7) RLL
30        10110110 11011011 01101101  0xB6 0xDB 0x6D     (2,7) RLL
31        11011011 01101101 10110110  0xDB 0x6D 0xB6     (2,7) RLL
32-35     Random
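The 35-pass sequence of Table (1) can be expressed as data plus a small scheduling function. The sketch below is one possible encoding, not the SECDEL source: the names DETERMINISTIC_PASSES and build_pass_schedule are hypothetical, and using None to mark a random-data pass is a convention chosen only for this example.

```python
import random

# Deterministic passes 5-31 of Table (1), as repeating byte patterns.
DETERMINISTIC_PASSES = (
    [b"\x55", b"\xAA", b"\x92\x49\x24", b"\x49\x24\x92", b"\x24\x92\x49"]
    + [bytes([0x11 * n]) for n in range(16)]             # 0x00, 0x11, ..., 0xFF
    + [b"\x92\x49\x24", b"\x49\x24\x92", b"\x24\x92\x49"]
    + [b"\x6D\xB6\xDB", b"\xB6\xDB\x6D", b"\xDB\x6D\xB6"]
)


def build_pass_schedule(rng=None):
    """Return 35 entries: None marks a random-data pass, bytes a fixed pattern."""
    if rng is None:
        rng = random.SystemRandom()      # OS-provided randomness for the ordering
    middle = list(DETERMINISTIC_PASSES)
    rng.shuffle(middle)                  # permute the deterministic passes
    return [None] * 4 + middle + [None] * 4
```

The list holds 27 deterministic passes built from 22 distinct patterns, since the MFM-specific patterns appear twice, and the four random passes at each end bring the total to 35. The shuffle corresponds to the permutation of the deterministic passes described in the text.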

3. Implementation Details

SECDEL is a graphical user interface tool that allows you to delete one or more files, folders, or directories. To obtain disk information such as the cluster size and the number of sectors, the Windows API provides a function called GetDiskFreeSpace(), which can be used as follows:

GetDiskFreeSpace(volumeRoot, &sectorsPerCluster, &bytesPerSector, &freeClusters, &totalClusters);

The cluster size can then be computed from this information using the following equation:

ClusterSize = bytesPerSector * sectorsPerCluster;
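As a worked example of the equation above, the sketch below computes the cluster size and the space a file actually occupies when the file system allocates whole clusters. The function names are hypothetical and the values 512 and 8 are assumed, typical figures rather than measurements from the paper. Rounding the file size up to a cluster boundary is shown as one plausible reason a deletion tool needs the cluster size. The paper does not state that SECDEL does exactly this.

```python
def cluster_size(bytes_per_sector, sectors_per_cluster):
    """ClusterSize = bytesPerSector * sectorsPerCluster."""
    return bytes_per_sector * sectors_per_cluster


def allocated_size(file_size, cluster_bytes):
    """Bytes occupied on disk when space is handed out in whole clusters."""
    clusters = -(-file_size // cluster_bytes)    # ceiling division
    return clusters * cluster_bytes


size = cluster_size(512, 8)                  # assumed example geometry
on_disk = allocated_size(10_000, size)       # a 10,000-byte file
slack = on_disk - 10_000                     # bytes past the end of the file data
```

With these assumed values the cluster size is 4096 bytes, the file occupies three clusters (12288 bytes), and 2288 bytes in the last cluster lie beyond the end of the file.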


Securely deleting a file that has no special attributes is relatively straightforward: the SECDEL tool simply overwrites the file with the secure delete patterns of the overwriting technique in use. To do this, SECDEL uses the CreateFile() API call to open the file and overwrite its bytes as follows:

CreateFile(FileName, GENERIC_WRITE, FILE_SHARE_READ | FILE_SHARE_WRITE, NULL, OPEN_EXISTING, FILE_FLAG_WRITE_THROUGH, NULL);

Notice that the flag FILE_FLAG_WRITE_THROUGH is used to make sure that no buffer is used to store data before it is flushed to the actual disk. This ensures that any changes are made directly on the storage medium and not in memory.

What is trickier is securely deleting Windows compressed, encrypted and sparse files. Such files are managed by NTFS in 16-cluster blocks. If a program writes to an existing portion of such a file, NTFS allocates new space on the disk to store the new data and, after the new data has been written, de-allocates the clusters previously occupied by the file. NTFS takes this conservative approach for reasons related to data integrity and, in the case of compressed and sparse files, in case a new allocation is larger than what exists (the new compressed data is bigger than the old compressed data). Thus, overwriting such a file will not succeed in deleting the file's contents from the disk.

To handle these types of files, the SECDEL tool relies on the defragmentation API. Using the defragmentation API, SECDEL can determine precisely which clusters on a disk are occupied by data belonging to compressed, sparse and encrypted files. Once SECDEL knows which clusters contain the file's data, it can open the disk for raw access and overwrite those clusters. Before using the defragmentation API, SECDEL needs to open the file exclusively to ensure data integrity. This is done using the Windows API function CreateFile() as follows:

CreateFile(FileName, GENERIC_READ, 0, NULL, OPEN_EXISTING, 0, NULL);
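Once the clusters occupied by a file are known, turning them into positions for raw overwriting is simple arithmetic. The sketch below is only an illustration of that step. The function name cluster_runs_to_byte_ranges and the (first_cluster, cluster_count) representation of a run are assumptions made for this example, as is the premise that cluster 0 begins at byte 0 of the volume.

```python
def cluster_runs_to_byte_ranges(runs, cluster_bytes):
    """Convert (first_cluster, cluster_count) runs into (offset, length) byte ranges."""
    return [(first * cluster_bytes, count * cluster_bytes) for first, count in runs]


# Two hypothetical runs on a volume with 4096-byte clusters.
ranges = cluster_runs_to_byte_ranges([(100, 16), (250, 4)], 4096)
```

Each resulting pair gives the volume offset at which to start writing and the number of bytes to overwrite for one run of clusters.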

Figure (3) High Security deletion level & number of times to repeat the process



To overwrite the name of a file that is being deleted, SECDEL renames the file 26 times, each time replacing every character of the file's name with the next letter of the alphabet. For instance, the first rename of "foo.txt" would be to "AAA.AAA". Finally, it renames the file to a randomly generated file name. Moreover, the SECDEL tool adds itself to the context menu of the file manager, as shown in Figure (4). This feature makes the SECDEL tool easy to use.
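The sequence of intermediate names can be sketched as follows. The function name rename_sequence is hypothetical, and keeping the dots in place is inferred from the "AAA.AAA" example rather than stated in the text. The final rename to a randomly generated name is not shown.

```python
import string


def rename_sequence(file_name):
    """Yield the 26 intermediate names, from all 'A's to all 'Z's, keeping dots."""
    for letter in string.ascii_uppercase:
        yield "".join("." if ch == "." else letter for ch in file_name)


names = list(rename_sequence("foo.txt"))   # "AAA.AAA", "BBB.BBB", ..., "ZZZ.ZZZ"
```

A tool would apply each name in turn with a rename call, so that the directory entry holding the original name is rewritten repeatedly before the file is removed.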

Figure (4) SECDEL tool in the context menu of operating system

One of the main arguments against the use of secure deletion utilities is that they overwrite the same part of the medium many times over. Some people argue that this can affect the lifetime of the magnetic medium. This is not the case (see http://www.cciicc.gc.ca/PID/faq_2_e.shtml#2); the lifetime of a hard disk is anywhere from 5 to 15 years, and the factors that have the greatest impact on this lifetime are temperature, dust and humidity [1]. Accidents are also a factor that should be considered, but usage of the medium ranks low among these factors.

4. Performance measurement and related work

To study the effects of SECDEL we conducted several experiments to compare the time taken to delete files of different sizes using different security levels and different numbers of passes. The results show that the higher the level of security, the longer it takes to securely delete the file. Likewise, the higher the number of passes performed with the chosen algorithm, the longer it takes to delete the file. Figure (5) shows that the time taken to delete a file increases linearly with file size, and that the higher the security level, the longer it takes.

Figure (5) Deletion times of files of different sizes using SECDEL (plot of time in ms, 0 to 3000000, against file size in million bytes, 0 to 500, for the High, Medium and Low security levels)


5. Related work

There are several products that perform secure deletion of data. Some of these products are commercial and some are either educational or free [12]. These products are either created for the sole purpose of secure deletion, or include it as an option in a much larger product that offers many other functions. The following are overviews of selected products.

5.1 Eraser for Windows

There are programs available that help perform the multiple overwrites that are desirable for secure data removal [12]. On Windows, Eraser is just such a tool. It is available from http://prdownloads.sourceforge.net/eraser/eraser53.zip. After downloading and installing it, it was a simple matter of adding the files to a list of tasks and then running that task at the level of deletion required. The levels of deletion range from one pass, through the three and seven passes recommended by the Department of Defense, to the thirty-five passes recommended by Gutmann.

5.2 WIPE

Wipe is an open-source secure delete application for Linux operating systems that is closely based on the Peter Gutmann patterns. The user interface is the command line, and many parameters can be added to change the behavior of the program (see http://sourceforge.net/projects/wipe).

5.3 Comparative Study

Table (2) compares three different secure delete programs with SECDEL.


Table (2) Secure delete programs comparison

Tool property           SRM    Wipe0.2   Wipe0.56-2a   SECDEL
Standard passes         38     35        35            15
Maximum passes          38     35        35+           380
Fewer passes            Yes    Yes       Yes           Yes
Good RNG                Yes    Yes       Yes           Yes
Rename file/directory   Yes    No        No            Yes
Recursive mode          Yes    Yes       Yes           Yes
Part of OS              No     No        No            Yes
Works on Windows        Yes    No        No            Yes
Works on *nix           Yes    Yes       Yes           No
Time: 1 file, 1 MB      25s    40s       26s           60s

In Table (2), standard passes is the default number of overwriting passes used by the program. Maximum passes is the maximum number of passes allowed. Fewer passes indicates whether the program allows the user to use fewer passes. Good RNG indicates whether a good random number generator is employed to generate the random characters used by the program. Discussion of random number generators is out of the scope of this paper. Rename file/directory indicates whether the program changes the original file or directory name as an extra means of security. Recursive mode indicates whether the program recursively deletes the contents of a folder by traversing subfolders and deleting their contents. Part of OS indicates whether the program becomes an integral part of the operating system once installed. Works on Windows indicates whether the program works on the Windows family of operating systems. Works on *nix indicates whether the program works on Unix systems and their derivatives, such as Linux. The last row shows how much time the program takes to delete one file 1 MB in size.

6. Conclusion

This work intends to show that the deletion of files cannot be left to the delete key if those files are supposed to be disposed of securely. It shows how easily files can be recovered under Windows if the security policy has not been extended to cover the deletion of sensitive data. This paper covers some of the different ways data can be deleted using sanitization tools and how recovery of this data can be made difficult, and then discusses the implementation details of the proposed SECDEL tool. Data overwritten once or twice may be recovered by subtracting what is expected to be read from a storage location from what is actually read.
Data which is overwritten an arbitrarily large number of times can still be recovered provided that the new data isn't written to the same location as the original data (for magnetic media), or that the recovery attempt is carried out fairly soon after the new data was written (for RAM). We have therefore developed a program which deletes these files in a secure way using several different methods that not only render software products useless
in recovering data from these files but also make the job of magnetic detection hardware considerably more difficult and economically unfeasible. By using the relatively simple methods presented in this paper, the task of an attacker can be made significantly more difficult, if not prohibitively expensive. This, together with the fact that the proposed SECDEL tool does not greatly affect the medium’s lifetime, makes our product both convenient to use and a good replacement for normal deletion methods in the case of secret files.

References

[1] Alexander Grau, Disk Rescue, URL: http://home.arcor.de/christian_grau/rescue/index.html
[2] Anton Chuvakin, Ph.D., Linux data hiding and recovery, 10 March 2002, URL: http://www.linuxsecurity.com/feature_stories/datahiding-forensics.html
[3] Associated Press, Sleuths probe Enron E-mails, 16 January 2002, URL: http://www.wired.com/news/print/0,1294,49774,00.html
[4] DoD, “Cleaning and Sanitization Matrix”, DoD 5220.22-M, US Department of Defense, Washington, D.C., 1995.
[5] Ghosts, URL: http://security.tao.ca/ghosts.shtml
[6] Jouni Vuorio, Regcleaner, URL: http://www.jv16.org
[7] Kurt Seifried, Multiple Windows file wiping utilities do not properly wipe data with NTFS file system, 21 January 2002, URL: http://www.seifried.org/security/advisories/kssa003.html
[8] Paul Festa and Lisa M. Bowman, Computer hinder paper shredders, 4 February 2002, URL: http://news.com.com/2100-1023-829004.html
[9] Peter Bedrosian, Scanning Tunnel Microscopy, URL: http://www.llnl.gov/str/Scan.html
[10] Peter Gutmann, “Secure Deletion of Data from Magnetic and Solid-State Memory”, Sixth USENIX Security Symposium Proceedings, July 1996, URL: http://www.cs.auckland.ac.nz/~pgut001/secure_del.html
[11] Robert Vamosi, I know what you did on your PC last summer, 16 October 2001, URL: http://zdnet.com.com/2100-1107-504091.html
[12] Sami Tolvanen, Eraser, URL: http://www.tolvanen.com/eraser/faq.shtml
[13] Simson L. Garfinkel and Abhi Shelat, “Remembrance of Data Passed: A Study of Disk Sanitization Practices”, IEEE Security & Privacy (January/February 2003) p.17.
[14] Tom Pycke, Recover, URL: http://recover.sourceforge.net/linux/recover
[15] Van Hauser, Secure Delete, URL: http://freshmeat.net/projects/securedelete/?topic_id=43
