International Journal of Applied Engineering Research ISSN 0973-4562 Volume 9, Number 23 (2014) pp. 19899-19907 © Research India Publications http://www.ripublication.com
A Survey on Assured File Deletion in Cloud Environment
B. Tirapathi Reddy, M.V.P. Chandra Sekhar, L.S.S. Reddy, V. Krishna Reddy, P. Sai Kiran
Abstract: Cloud computing has become one of the most prominent terms in the IT industry because of its model of computing as a utility. It promises increased flexibility, scalability and reliability together with lower operational and maintenance costs. As organizations accumulate ever larger volumes of data, managing that data themselves becomes difficult, so they outsource it to cloud storage to reduce the data management overhead. Cloud storage offers clients the abstraction of unlimited storage space, which they can use to outsource their data and access it in a pay-as-you-use manner. Because the outsourced data are held by storage services managed by third parties, several security issues arise concerning the privacy and integrity of the data. To ensure the privacy and correctness of data outsourced to cloud storage, we need a practically implementable and readily deployable application that secures the outsourced data and assures the data owner that the data are kept in a safe state and that, upon deletion, all copies of the data are deleted. This paper surveys several mechanisms that provide assured file deletion.
Key words: cloud storage, policy based file assured deletion, time based deletion, Vanish.
1. Introduction
Nowadays, the cloud has become the vehicle for delivering resources such as computing and storage to customers in a pay-as-you-use manner. NIST defines cloud computing as “a model for enabling ubiquitous, convenient, on-demand network access to a shared pool of configurable computing resources (e.g., networks, servers, storage, applications, and services) that can be rapidly provisioned and released with minimal management effort or service provider interaction” [1].
Paper Code: 28111 - IJAER
With the wide acceptance of cloud computing and the rapid growth of day-to-day data, every organization wants to use cloud storage to reduce its data storage and management costs. The need for efficient and secure cloud storage is therefore increasing day by day.
Cloud storage: “Cloud storage is a model of networked enterprise storage where data is stored in virtualized pools of storage servers which are generally hosted by third parties” [2]. Hosting companies operate large data centers, and people who require their data to be hosted buy or lease storage capacity from them. However, because the data are maintained by third-party cloud service providers, several security questions arise concerning the privacy and accuracy of the data.
Both traditional and cloud storage mechanisms have become highly reliable in recovering data after a disaster or failure. To ensure this reliability and availability, cloud service providers create multiple redundant copies of the data and spread them across the cloud, without the knowledge of the data owner. One specific issue is that when many copies of the data exist, it is hard to delete all of them: when the data owner requests deletion, the cloud service provider may forget to remove some copies or may intentionally retain them.
A straightforward approach to ensuring the privacy of the data is to encrypt it: cryptographic techniques and encryption keys convert the data into a secret form that is inaccessible to unauthorized users. The security of the outsourced data then depends on the strength of the cryptographic technique applied and on the key size. The encryption mechanism should not require any modification to the existing cloud architecture; it should work as an overlay on top of it.
Figure 1. Cloud Storage
2. Existing Mechanisms:
With the rapid improvement in technology, users of computer systems are heavily
dependent on web services for everything. When they store and retrieve data on the Web, personal data may be cached, copied and archived by unauthorized parties without the knowledge or control of the original user. The privacy and integrity of the archived, copied and cached data must be ensured for the data owner. If the user no longer wants to keep the data in the cloud, the data should be completely deleted from the cloud storage. Many techniques have been proposed for secure deletion; in general, they overwrite the data in order to delete it. However, overwriting techniques depend heavily on the properties of the physical storage on which the data are stored, and because cloud computing provides virtualized storage models, physical control over data storage locations is no longer possible.
(i) Time Based File Assured Deletion: Time based file assured deletion was first introduced in [3]. With this mechanism, files are securely deleted, that is, made inaccessible, after a pre-defined time. The file is encrypted with a data key, and the data key is in turn encrypted with a control key that is held by a separate key manager, which is responsible for key management. The control key is time-based, i.e. the key manager deletes it once its expiration time is reached. The expiration time is specified when the file is first created. Without the control key, the data key and hence the data file remain encrypted and inaccessible, even to an attacker who holds copies of them.
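The two-level key hierarchy described above can be illustrated with a minimal Python sketch. It is not the construction of [3]: the class `KeyManager`, the functions `upload` and `download`, and the explicit `now` argument are hypothetical names introduced here, and `keystream_xor` is a toy cipher used only to keep the example self-contained. A real system would use an authenticated cipher such as AES-GCM.

```python
import hashlib
import os


def keystream_xor(key: bytes, data: bytes) -> bytes:
    """Toy symmetric cipher (NOT secure): XOR with a SHA-256 counter keystream.

    Applying it twice with the same key returns the original data.
    """
    out = bytearray()
    for offset in range(0, len(data), 32):
        pad = hashlib.sha256(key + offset.to_bytes(8, "big")).digest()
        out.extend(b ^ p for b, p in zip(data[offset:offset + 32], pad))
    return bytes(out)


class KeyManager:
    """Hypothetical key manager that holds time-based control keys."""

    def __init__(self):
        self._keys = {}  # key_id -> (control_key, expiry_time)

    def create_control_key(self, key_id: str, expiry: float) -> None:
        self._keys[key_id] = (os.urandom(32), expiry)

    def _active_key(self, key_id: str, now: float) -> bytes:
        # Delete every control key whose expiration time has been reached.
        for expired in [k for k, (_, exp) in self._keys.items() if exp <= now]:
            del self._keys[expired]
        if key_id not in self._keys:
            raise KeyError(f"control key {key_id!r} expired or unknown")
        return self._keys[key_id][0]

    def wrap(self, key_id: str, data_key: bytes, now: float) -> bytes:
        return keystream_xor(self._active_key(key_id, now), data_key)

    def unwrap(self, key_id: str, wrapped_key: bytes, now: float) -> bytes:
        return keystream_xor(self._active_key(key_id, now), wrapped_key)


def upload(manager, key_id, plaintext, now):
    """Encrypt the file with a fresh data key, then wrap that data key."""
    data_key = os.urandom(32)
    return keystream_xor(data_key, plaintext), manager.wrap(key_id, data_key, now)


def download(manager, key_id, ciphertext, wrapped_key, now):
    """Unwrap the data key (fails after expiry) and decrypt the file."""
    data_key = manager.unwrap(key_id, wrapped_key, now)
    return keystream_xor(data_key, ciphertext)
```

In this sketch the ciphertext and the wrapped data key can stay on the cloud indefinitely; once the key manager has discarded the control key, `download` raises `KeyError` and the stored bytes can no longer be decrypted through the key manager.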
Advantages: The main advantage of time based file assured deletion is that it greatly reduces the management overhead, because once the time has expired the file is deleted (becomes inaccessible) automatically. For example, in the medical field the test results of a particular patient may be useful only for a certain time, after which the data are no longer needed and can be deleted; time based file assured deletion suits this kind of application.
Drawback: The files become inaccessible after the expiration time, but they still remain in the cloud storage in encrypted form. If an intruder manages to obtain the key, he can try to recover the data and may succeed.
(ii) Vanish: Vanish guarantees that all copies of a given file become inaccessible after a user-specified time has expired, even if an attacker later obtains a copy of the file and the user's keys. Vanish secures the data file through a novel integration of cryptographic techniques with global-scale, peer-to-peer (P2P) distributed hash tables (DHTs). Vanish encrypts the data locally with a random key K that is not known to the data owner, removes the local copy of the key, and distributes pieces of the key across random indices in the DHT. The effectiveness of Vanish depends on the properties of the DHT. Vanish takes a data object X and encapsulates
it into a vanishing data object (VDO) V. To do so, Vanish encrypts X locally with the random key K and then uses threshold secret sharing to split K into N pieces (shares), which are stored at random indices in the DHT. The threshold specifies how many of the N shares are needed to reconstruct the original key K. The Vuze-based system supports 8-hour timeouts in the basic Vanish usage model, and the OpenDHT-based system supports timeouts of up to one week. Vanish is available as FireVanish, a Firefox plug-in for the Gmail service that allows users to compose and read self-destructing emails. Vanish provides security against the store sniffing attack [4], but it does not protect against the Sybil attack [5].
Sybil attack [5]: The Sybil attack defeats the purpose of Vanish by exploiting weaknesses of large DHTs. It works by continuously crawling the DHT and saving every stored value before it expires; in this way more than 90% of the keys that Vanish uses for its messages can be recovered. All these operations can be performed by sending RPC (remote procedure call) commands directly to remote peers, each as a single UDP packet. The Sybil attack takes advantage of the decentralized key management that comes with P2P DHTs. The Vuze DHT replicates each entry about twenty times. The attackers create many false participants in the network, which behave like genuine clients and place a heavy burden on the client.
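The threshold secret sharing step that Vanish applies to the key K can be illustrated with a minimal Python sketch of Shamir's scheme [6]. This is an illustration, not the Vanish implementation: the function names `split_secret` and `recover_secret` and the choice of the prime field are assumptions made here, and the key is treated as an integer smaller than the prime.

```python
import secrets

# Field modulus: the Mersenne prime 2**127 - 1 (an assumption of this sketch;
# any prime larger than the secret would do).
PRIME = 2**127 - 1


def split_secret(secret: int, n: int, k: int):
    """Split `secret` into n shares so that any k of them reconstruct it."""
    if not 1 <= k <= n:
        raise ValueError("threshold k must satisfy 1 <= k <= n")
    if not 0 <= secret < PRIME:
        raise ValueError("secret must be in the range [0, PRIME)")
    # Random polynomial of degree k-1 whose constant term is the secret.
    coeffs = [secret] + [secrets.randbelow(PRIME) for _ in range(k - 1)]
    shares = []
    for x in range(1, n + 1):
        y = 0
        for c in reversed(coeffs):  # Horner evaluation of the polynomial at x
            y = (y * x + c) % PRIME
        shares.append((x, y))
    return shares


def recover_secret(shares):
    """Lagrange interpolation at x = 0 over the prime field."""
    secret = 0
    for i, (xi, yi) in enumerate(shares):
        num, den = 1, 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = (num * -xj) % PRIME
                den = (den * (xi - xj)) % PRIME
        # Modular inverse of den via Fermat's little theorem.
        secret = (secret + yi * num * pow(den, PRIME - 2, PRIME)) % PRIME
    return secret
```

With `split_secret(key, n=5, k=3)`, any three of the five shares passed to `recover_secret` return the key, while fewer than three shares reveal nothing about it. In Vanish the shares would be the values stored in the DHT, so the key is lost once too few of them survive.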
These false participants also generate a large amount of traffic, which is very difficult to analyze without actually participating in the real network.
Advantages: Vanish provides privacy for sensitive data such as emails, text and files by making them self-destructing, which greatly reduces the burden on the data owner. It does not create any new threats to the privacy of the data, and it provides security against dangerous attacks such as the store sniffing attack.
Drawback: Although Vanish provides security against the store sniffing attack, it cannot protect against the Sybil attack, which exploits the weaknesses of large DHTs. In addition, Vanish makes the data inaccessible only after the expiration time, which does not give the data owner fine-grained control.
(iii) FADE: Both Vanish and time based file assured deletion provide assured deletion only upon expiration of a time limit. To give the data owner more fine-grained control, a policy based file assured deletion mechanism (FADE) was proposed. FADE is a practically implementable and readily deployable cloud storage system that focuses on assured file deletion upon request. It is built upon conventional cryptographic techniques. It encrypts the
outsourced data files to guarantee their confidentiality and integrity [14], and it assuredly deletes files, making them unrecoverable, when deletion is requested by revoking the associated file access policies. FADE separates the management of encrypted data from the management of encryption keys. The encrypted data remain on the untrusted cloud storage, while the encryption keys are maintained by a separate key manager that follows a quorum scheme [6].
Quorum scheme: A quorum scheme protects the key by distributing it among several groups in such a way that no single group can reconstruct it on its own; the groups have to cooperate. The goal is to share the secret key S among N groups G1, G2, G3, ..., GN such that at least K groups are needed to reconstruct it. Each group receives a share of S. Not all N groups need to participate: cooperation among any K groups is sufficient to reconstruct S, but fewer than K groups cannot reconstruct it.
In FADE, each file is associated with a single atomic file access policy or a Boolean combination of atomic policies. Every policy is assigned a control key, and all control keys are held by the key manager, which is responsible for managing them. The data file is encrypted with a data key, and the data key is further encrypted with the control keys corresponding to the policy combination. When a policy is revoked, the corresponding control key is removed from the key manager, so the data key, and hence the encrypted data file, can no longer be recovered. The main idea of assured file deletion is thus to make the file inaccessible upon a deletion request, even though it is not physically removed from the cloud storage. FADE involves three categories of participants: the data owner, the key manager and the cloud storage.
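The policy-based key hierarchy described above can be sketched in Python as follows. The source does not say how a Boolean combination of policies maps onto control keys, so the mapping used here is an assumption of this sketch: a conjunction layers the data key under every policy's control key, and a disjunction wraps it separately under each one. The class `PolicyKeyManager`, the helper functions and the toy cipher `keystream_xor` are hypothetical and not secure; only the data key is wrapped here, while the file itself would be encrypted under that data key.

```python
import hashlib
import os


def keystream_xor(key: bytes, data: bytes) -> bytes:
    """Toy symmetric cipher (NOT secure); encrypting twice decrypts."""
    out = bytearray()
    for offset in range(0, len(data), 32):
        pad = hashlib.sha256(key + offset.to_bytes(8, "big")).digest()
        out.extend(b ^ p for b, p in zip(data[offset:offset + 32], pad))
    return bytes(out)


class PolicyKeyManager:
    """Hypothetical key manager holding one control key per atomic policy."""

    def __init__(self):
        self._control_keys = {}  # policy name -> control key

    def add_policy(self, policy: str) -> None:
        self._control_keys[policy] = os.urandom(32)

    def revoke(self, policy: str) -> None:
        # Revoking a policy discards its control key for good.
        self._control_keys.pop(policy, None)

    def is_active(self, policy: str) -> bool:
        return policy in self._control_keys

    def apply(self, policy: str, blob: bytes) -> bytes:
        if policy not in self._control_keys:
            raise KeyError(f"policy {policy!r} revoked or unknown")
        return keystream_xor(self._control_keys[policy], blob)


def wrap_all_of(manager, policies, data_key):
    """Conjunction: every listed policy must still hold to recover the key."""
    wrapped = data_key
    for policy in policies:
        wrapped = manager.apply(policy, wrapped)
    return wrapped


def unwrap_all_of(manager, policies, wrapped):
    for policy in reversed(policies):
        wrapped = manager.apply(policy, wrapped)
    return wrapped


def wrap_any_of(manager, policies, data_key):
    """Disjunction: one wrapped copy of the data key per policy."""
    return {policy: manager.apply(policy, data_key) for policy in policies}


def unwrap_any_of(manager, wrapped_by_policy):
    for policy, wrapped in wrapped_by_policy.items():
        if manager.is_active(policy):
            return manager.apply(policy, wrapped)
    raise KeyError("every policy protecting this data key has been revoked")
```

Under these assumptions, revoking a single policy is enough to make a conjunction-protected data key unrecoverable, whereas a disjunction-protected key is lost only after all of its policies have been revoked.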
Data owner: The data owner is the entity whose data are to be stored in the cloud storage. It can be a mobile device, a user-level program or a file system on a personal computer. The data owner encrypts the data to be outsourced to the cloud and interacts with the key manager to perform the required cryptographic operations.
Key manager: The key manager maintains the keys according to the policies used to protect the data. It performs operations such as encryption, decryption, revocation and renewal of control keys.
Cloud storage: The cloud storage is maintained by a third-party cloud service provider, such as Amazon S3, and stores data on behalf of the data owner. Because storage clouds are owned and managed by third-party cloud service providers, FADE requires no protocol or implementation changes on the cloud storage to support assured file deletion. There should not be any structural changes in the way it is
implemented. However, FADE does not provide version control, which becomes increasingly important as the use of cloud storage grows, in order to eliminate redundant copies of the data. The basic FADE also does not provide access control for genuine data owners. Access control for the outsourced data is provided in [7], where ciphertext-policy attribute-based encryption (CP-ABE) [8] is used. With CP-ABE, a user asks the key manager to decrypt a file by supplying authentication credentials showing that he satisfies the policies associated with the file. Each user who satisfies the corresponding policies receives a CP-ABE private access key that depends on the set of attributes the user satisfies. The key manager maintains the CP-ABE public access key and uses it to encrypt the responses returned to the users. Having these two sets of keys ensures both access control and assured deletion for the data owners.
Advantages: FADE gives the user more control over the data than Vanish and time based file assured deletion, which depend only on the expiration time. It works as an additional layer of security on top of the existing cloud architecture.
Drawback: The basic FADE does not provide access control or version control, which are very much required in a cloud environment.
(iv) FadeVersion: As every organization accumulates a large pool of data, data storage management has become a major overhead. Organizations want to minimize this overhead by moving to cloud storage.
SmugMug, a photo sharing website, hosted terabytes of photos on Amazon S3, which saved it a great deal of money and reduced its data management overhead. Besides organizations like SmugMug, individuals are also storing their sensitive data in the cloud, driven by the increasing popularity of mobile devices, which come with limited storage capacity. With the increased use of cloud storage, the need to secure the outsourced data is also increasing day by day. Two important features in this respect are assured file deletion and version control. Assured deletion can be achieved in several ways, such as secure overwriting and cryptographic protection. FADE provides assured file deletion with access control, but it does not provide version control, which is an important feature of any backup system. Existing studies also note that FADE does not provide version control. Cloud backup systems such as Jungle Disk [9], Nasuni [10] and Dropbox [11], as well as the open-source Cumulus system [12], provide version control and maintain different versions of backups. Cumulus in particular assumes a thin cloud interface that offers only put, get, list and delete operations on the outsourced data. Cumulus divides the data
file into data chunks, and only the updated or modified chunks are sent to the cloud storage. Existing version-controlled cloud backup systems such as Cumulus and Nasuni and assured file deletion techniques such as FADE and Vanish are incompatible with each other. FadeVersion, a secure cloud backup system, was developed to provide assured deletion together with version control. In FadeVersion the data owner can specify which particular file or version of a file is to be deleted, while the other files and versions remain unaffected. FadeVersion is similar to Cumulus in that it creates the different data objects to be stored on the cloud, and it adds a layered approach of cryptographic protection on top of the version control system.
In FadeVersion each backup version divides the data files into file objects with a configurable maximum-size threshold. Dividing a data file into file objects makes it easy to upload only the modified objects, which saves storage and upload costs. FadeVersion uses deduplication [13] to reduce the storage costs further: if two file objects have the same content, only one object is stored on the cloud, and pointers are created that refer to the stored object. To decide whether two objects have the same content, a cryptographic hash function such as the Secure Hash Algorithm (SHA-1) or an HMAC is applied to each object and the resulting hash values are compared. For each version of the data, a metadata object that describes its file objects is created. All file objects and metadata objects are uploaded to the cloud storage.
FadeVersion uses two-layer encryption to achieve assured deletion. For each data object a data key is generated, and the data object is encrypted with the data key using symmetric key encryption.
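The chunking and deduplication step described above can be sketched in Python as follows. This is a simplified illustration and not FadeVersion's code: `ObjectStore`, `backup` and `restore` are hypothetical names, fixed-size splitting stands in for the configurable maximum-size threshold, and the ordered list of hashes plays the role of the metadata object. SHA-1 is used because the text names it; a current design would more likely choose SHA-256.

```python
import hashlib


def split_into_objects(data: bytes, max_size: int):
    """Cut a file into objects of at most max_size bytes."""
    return [data[i:i + max_size] for i in range(0, len(data), max_size)]


class ObjectStore:
    """Hypothetical stand-in for cloud storage, keyed by content hash."""

    def __init__(self):
        self.objects = {}  # hash -> object content
        self.uploads = 0   # number of objects actually sent to the cloud

    def put(self, chunk: bytes) -> str:
        digest = hashlib.sha1(chunk).hexdigest()
        if digest not in self.objects:  # identical content is stored only once
            self.objects[digest] = chunk
            self.uploads += 1
        return digest


def backup(store: ObjectStore, data: bytes, max_size: int):
    """Upload one version; the returned hash list is its metadata (pointers)."""
    return [store.put(chunk) for chunk in split_into_objects(data, max_size)]


def restore(store: ObjectStore, metadata):
    """Rebuild a version by following its pointers."""
    return b"".join(store.objects[digest] for digest in metadata)
```

For example, backing up `b"AAAABBBBCCCC"` and then `b"AAAAXXXXCCCC"` with `max_size=4` uploads four objects in total rather than six, because the two unchanged objects are shared between the versions.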
For each version of the data a control key is then generated, and the data keys of all objects associated with that version are encrypted with the control key, again using symmetric key encryption; the encrypted data keys are then uploaded to the cloud storage. All control keys are maintained by a key escrow system. To retrieve a file object of a particular version, the corresponding control key is obtained from the key escrow system and used to decrypt the data key, which in turn decrypts the encrypted file object. To delete a particular version v, FadeVersion removes the control key corresponding to v from the key escrow system. Once the control key is removed, the encrypted data keys associated with version v can no longer be decrypted. Other versions are not affected, even if there are data dependencies among the versions. FadeVersion also supports assured deletion under multiple policies, and it can use threshold secret sharing and data deduplication to reduce the data management overhead considerably.
Advantages: FadeVersion, an extension of the basic FADE, provides version control and access control and can make use of deduplication and threshold secret sharing, which are essential in a cloud environment.
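The per-version control keys described above can be sketched in Python as follows. The names `KeyEscrow` and `Backup`, the object identifiers and the toy cipher `keystream_xor` (not secure) are assumptions of this sketch rather than details from FadeVersion. The sketch assumes that an object shared by several versions keeps a single ciphertext, while its data key is wrapped once under each version's control key.

```python
import hashlib
import os


def keystream_xor(key: bytes, data: bytes) -> bytes:
    """Toy symmetric cipher (NOT secure); encrypting twice decrypts."""
    out = bytearray()
    for offset in range(0, len(data), 32):
        pad = hashlib.sha256(key + offset.to_bytes(8, "big")).digest()
        out.extend(b ^ p for b, p in zip(data[offset:offset + 32], pad))
    return bytes(out)


class KeyEscrow:
    """Hypothetical key escrow system: one control key per backup version."""

    def __init__(self):
        self._control_keys = {}

    def new_version(self, version):
        self._control_keys[version] = os.urandom(32)

    def delete_version(self, version):
        self._control_keys.pop(version, None)

    def has(self, version):
        return version in self._control_keys

    def control_key(self, version):
        return self._control_keys[version]  # KeyError once the version is deleted


class Backup:
    def __init__(self, escrow):
        self.escrow = escrow
        self.ciphertexts = {}  # object id -> encrypted object (on the cloud)
        self.wrapped = {}      # (version, object id) -> wrapped data key

    def _live_data_key(self, oid):
        # Recover an existing data key through any version not yet deleted.
        for (version, other), wrapped in self.wrapped.items():
            if other == oid and self.escrow.has(version):
                return keystream_xor(self.escrow.control_key(version), wrapped)
        return None

    def add_version(self, version, objects):
        self.escrow.new_version(version)
        control_key = self.escrow.control_key(version)
        for oid, content in objects.items():
            data_key = self._live_data_key(oid)
            if data_key is None:  # new object, or all its versions were deleted
                data_key = os.urandom(32)
                self.ciphertexts[oid] = keystream_xor(data_key, content)
            self.wrapped[(version, oid)] = keystream_xor(control_key, data_key)

    def restore(self, version, oid):
        control_key = self.escrow.control_key(version)
        data_key = keystream_xor(control_key, self.wrapped[(version, oid)])
        return keystream_xor(data_key, self.ciphertexts[oid])
```

Deleting version v1 with `escrow.delete_version("v1")` makes every data key wrapped under v1 undecryptable, while an object shared with v2 remains readable through v2's own wrapped copy of its data key.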
Table-1
Mechanism / feature | Time based deletion | Self destruction | Fine grained control | Access control | Version control
Time based file assured deletion | √ | --- | --- | --- | ---
VANISH | √ | √ | --- | --- | ---
FADE | --- | --- | √ | √ | ---
FadeVersion | --- | --- | √ | √ | √
√ indicates that the mechanism supports the feature; --- indicates that the mechanism does not support the feature.
Table-1 compares the various file assured deletion mechanisms with respect to the assurance they give the data owner that a file is deleted upon request. Nowadays, with the widespread use of mobile devices, mobile users are also making effective use of cloud storage: because mobile devices do not come with large memories, their users can outsource their data to cloud storage and access it from there. With this growing usage, there is a high demand for secure access to the data stored in cloud storage. Mechanisms such as time based file assured deletion, Vanish, file assured deletion (FADE), file assured deletion with version control and file assured deletion with access control have been used for secure access to, and deletion of, data stored in cloud storage. None of these mechanisms requires changes to the existing cloud architecture; they all work as an overlay on top of it.
Conclusion: Cloud computing is a widely used computing paradigm for offering better business solutions. By using cloud storage, many organizations have greatly reduced their data management overhead. The methods discussed above have been adopted by cloud users according to their feasibility for the application at hand. Our research work proceeds towards ensuring file assured deletion in a multi-cloud environment and reducing metadata management, so that cloud storage becomes more attractive and more users adopt it to reduce their data storage and management costs.
REFERENCES
1. http://csrc.nist.gov/publications/nistpubs/800-145/SP800-145.pdf
2. http://en.wikipedia.org/wiki/Cloud_storage
3. R. Perlman. File System Design with Assured Delete. In ISOC NDSS, 2007.
4. www.packetwatch.net/documents/papers/layer2sniffing.pdf
5. www.few.vu.nl/~mconti/teaching/ATCNS2010/ATCS/Sybil/Sybil.pdf; https://www.cs.umd.edu/~elaine/docs/sybil.pdf
6. A. Shamir. How to Share a Secret. CACM, 22(11):612–613, Nov 1979.
7. http://www.cse.cuhk.edu.hk/~pclee/www/pubs/tdsc12_tech.pdf
8. www.cs.utexas.edu/~bwaters/publications/papers/cp-abe.pdf
9. Jungle Disk. http://www.jungledisk.com/, 2010.
10. Nasuni. Nasuni Announces New Snapshot Retention Functionality in Nasuni Filer; Enables Fail-Safe File Deletion-in the Cloud, Mar 2011. http://www.nasuni.com/news/press-releases/nasuni-announcesnewsnapshot-retention-functionality-in-nasuni-filer-enables-fail-safefile-deletion-inthe-cloud/
11. Dropbox. http://www.dropbox.com, 2010.
12. M. Vrable, S. Savage, and G. Voelker. Cumulus: File system backup to the cloud. In Proc. of USENIX FAST, 2009.
13. S. Quinlan and S. Dorward. Venti: a new approach to archival storage. In Proc. USENIX FAST, 2002.
14. http://www.gjcat.com/Issue/GJCAT_2011_0319.pdf