Survey on Data Remanence in Cloud Computing Environment

Khalid AISSAOUI, Hafsa Aitidar, Hicham BELHADAOUI, Mounir RIFI
RITM Laboratory, CED Engineering Sciences
Ecole Supérieure de Technologie, Hassan II University
Casablanca, Morocco
[email protected]

Abstract—Cloud computing is a developing IT concept that faces a number of issues which are slowing down its evolution and adoption by users across the world. The lack of security has been the main concern. Organizations and entities need to ensure, inter alia, the integrity and confidentiality of the sensitive data they outsource to a cloud provider's servers. Several solutions have been examined in order to strengthen security models (strong authentication, encryption and fragmentation before storing, access control policies...). Data remanence, in particular, is undoubtedly a major threat: how can we be sure that data are, when requested, truly and appropriately deleted from the remote servers? In this paper, we present a survey on this subject and address the problem of residual data in a cloud computing environment, which is characterized by the use of virtual machines instantiated on remote servers owned by a third party.

Keywords—Cloud computing security; cryptography; virtualization; data remanence; confidentiality; sanitization.

I. INTRODUCTION

Cloud computing is a paradigm that has evolved significantly over the last few years. According to NIST, "Cloud computing is a model for enabling ubiquitous, convenient, on-demand network access to a shared pool of configurable computing resources (e.g., networks, servers, storage, applications, and services) that can be rapidly provisioned and released with minimal management effort or service provider interaction..." [1]. It offers notable advantages: computing capacity provided as needed, a much lower cost than in-house infrastructure, better reliability depending on the Service Level Agreement (SLA), and enhanced, simplified IT management and maintenance through the central administration of resources by the Cloud Service Provider (CSP). In such an environment, the user or data owner (DO) pays only for the resources actually consumed (pay as you go), with no invested IT capital and no further expenses. Finally, the data stored on the provider's servers can be accessed through a simple Internet connection. Depending on the users' requirements, three deployment models can be used on the provider's servers: public, private or hybrid [2]. In addition, providers usually offer their customers three service models [3]:

- Software as a Service (SaaS): users access applications and software through a web browser. SaaS enables providers to control and manage the use of their products.
- Platform as a Service (PaaS): providers offer a development environment (IDE, toolkits, DBMS...) so that developers can design, develop and deploy their own solutions.
- Infrastructure as a Service (IaaS): this model provides the infrastructure needed to run the customer's applications; it enables organizations and companies to use only the hardware (servers, routers, connectors...).

In this paper, we give an overview of the security challenges that hinder a wider adoption of cloud computing. The contributions of this paper can be summarized as follows. First, we review a number of security issues in cloud computing. Second, we explain data remanence and the risk of sensitive data disclosure it may entail. Finally, we highlight some solutions to residual data, especially in a cloud environment, that deserve substantial further research and development.

II. SECURITY IN CLOUD

There is, without a doubt, an increasing use of cloud computing (private clouds in particular) due to its major advantages [4]. However, many concerns remain regarding the dependence on the cloud provider, Internet connectivity issues, the lack of existing regulations, exposure to government laws (e.g. the USA PATRIOT Act) and, above all, data security, which requires more attention and needs to be addressed in a cost-effective manner [5]. Within this new IT framework, traditional means of protection are not sufficient. Sensitive data such as technical requirements, financial information or scientific formulas are stored on shared remote servers handled by a third party, which must guarantee security as agreed with its clients (data owners or users). In a cloud environment, data are stored on shared remote VMs created and assigned by the service provider, who must guarantee confidentiality; access to these data must be denied to unauthorized users or entities, including the provider itself. Efficient techniques such as cryptography, or fragmentation [6] as an interesting alternative, are used for this purpose.

Furthermore, to guarantee data integrity, the DO has to protect their files from accidental or deliberate modification and deletion. Access to sensitive data is an issue that many papers have addressed in different ways. In proxy re-encryption [7], the DO encrypts blocks of data with symmetric keys before sending them to the cloud servers; these keys are encrypted with the DO's master public key. The DO's master private key and the users' public keys are then combined to generate proxy re-encryption keys, which are used to retrieve the plaintext intended for a specific user. In this model, any collusion between the CSP and the final user would compromise data security (e.g. disclosure of decryption keys). In [8], Yu et al. propose an encryption scheme based on attributes assigned to data. Every file is associated with an attribute, and the access structure of each user (defining the files they are authorized to read) is built as a logical expression over these attributes. Each file is encrypted with the public key corresponding to its attribute, so a user cannot decrypt a file whose attribute does not match their access structure. In this scheme, the update required after an access-right modification or revocation is a real burden, as the number of users is extremely high in a cloud environment. A solution based on a key derivation system was proposed in [9], where each file is encrypted with symmetric keys: a user can only decrypt the files they are allowed to access, by using both their private key and a set of public tokens that the owner generates and sends to the cloud server, which is responsible for their distribution. Hota et al. [10] propose a solution in which the data owner shares a list of access rights (a capability list) with the provider; the list is updated upon every modification or insertion, and the CSP uses it to handle access requests. This model was revised in [11] with the aim of preventing any involvement of the CSP. Besides access control, the authentication process allows the server to identify users and decide whether or not to grant access to the data [12]; several methods are implemented to strengthen this process (one-time passwords, strong authentication...).
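To make the capability-list idea more concrete, the following minimal Python sketch shows how a per-user list of permitted operations could be maintained by the DO and checked by the CSP on each request. It is only an illustration of the general principle behind [10] and [11]; the identifiers are hypothetical and the cryptographic protection, distribution and revocation protocols of those schemes are deliberately omitted.

# Illustrative sketch of capability-list access control (not the scheme of [10] or [11]).
from dataclasses import dataclass, field


@dataclass
class CapabilityList:
    # user_id -> { file_id -> set of permitted operations }
    grants: dict[str, dict[str, set[str]]] = field(default_factory=dict)

    def grant(self, user: str, file_id: str, ops: set[str]) -> None:
        # Called by the data owner; the updated list is then re-shared with the CSP.
        self.grants.setdefault(user, {})[file_id] = set(ops)

    def revoke(self, user: str, file_id: str) -> None:
        self.grants.get(user, {}).pop(file_id, None)

    def is_allowed(self, user: str, file_id: str, op: str) -> bool:
        # Called by the CSP on every access request.
        return op in self.grants.get(user, {}).get(file_id, set())


caps = CapabilityList()
caps.grant("alice", "report-2016.pdf", {"read"})
print(caps.is_allowed("alice", "report-2016.pdf", "read"))   # True
print(caps.is_allowed("alice", "report-2016.pdf", "write"))  # False
caps.revoke("alice", "report-2016.pdf")
print(caps.is_allowed("alice", "report-2016.pdf", "read"))   # False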

Another security concept is non-repudiation, which refers to the ability to prove the authenticity of a signature on a document or the sending of a message; it is achieved with digital signature algorithms. Finally, among all these threats, data remanence is the least addressed one, and it is even ignored by some cloud providers. Data remanence refers to the residual data remaining in storage media such as hard disks. How can a DO be sure that, after a delete operation, their files are correctly removed from the provider's servers? Could an intruder, another customer or even the CSP recover them? We give more explanations on the subject and an outline of the existing solutions, especially in a cloud computing environment.

III. DATA REMANENCE

Sensitive data need to be protected from unauthorized access as well as from data recovery. Data remanence is the persistence of residual data even after a deletion operation, a reformatting or the re-allocation of a medium to another user [13]. In fact, when one drags a file into the trash, the file is not actually deleted; even after the trash is emptied, the system is simply no longer able to reach it. Consequently, there is a risk that information believed to be erased can still be retrieved from the medium, which is a major threat given the confidential nature of presumably deleted files (passwords, encryption keys, private account information, financial or health data...). Data remanence is exploited in computer forensics to locate and recover files that may have been deleted from a device. Skorobogatov [14] studied the charges remaining on transistor gates in non-volatile programmable devices (e.g. flash memory, EPROM...), from which an attacker could restore information; he conducted a number of experiments showing how residual data can be retrieved from such memory devices and proposed countermeasures, including encryption, to make recovery more difficult. In their work, AlBelooshi et al. [15] prove the presence of residual data in both the random access memory and the hard disk of a VM, using open-source software to capture swap files, analyze memory dumps, extract information and recover deleted files. Virtualization is indeed an essential feature of cloud computing solutions: it is the underlying technology that permits rapid provisioning, on-demand resources and the reduction of expenditures (Fig. 1). It offers the ability to run multiple operating systems on a single machine by sharing all its resources, allows pooling to reduce costs from the users' perspective, and improves service performance with efficient techniques such as load balancing, file swapping, ballooning... However, these advantages raise concerns about data security, especially in a public cloud deployment (the CSP could run only mutually trusted VMs on the same physical medium to enhance protection). How can a DO be sure that data are properly deleted? Can a DO execute or request a more secure erasing operation? Fortunately, some solutions can reduce the consequences of this threat, as discussed below (cf. section IV).

Fig. 1. Virtualization in Cloud Computing
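As a rough illustration of the problem, the Python sketch below deletes a file in the usual way and then looks for its content on the raw block device, much as a file-carving tool would. It is only a sketch: it assumes a disposable Linux test setup run as root, the device and mount-point names are purely hypothetical, and depending on the file system, caching and SSD TRIM behaviour the marker may or may not be found. The point is simply that deletion by itself does not overwrite the data.

# Illustrative only: shows that unlinking a file does not erase its bytes.
# DEVICE and MOUNT are assumptions for a throwaway test partition.
import os

DEVICE = "/dev/sdb1"            # raw partition backing the test file system (assumption)
MOUNT = "/mnt/remanence-test"   # where DEVICE is mounted (assumption)
MARKER = b"REMANENCE-DEMO-SECRET-0042"

path = os.path.join(MOUNT, "secret.txt")
with open(path, "wb") as f:
    f.write(MARKER)
    f.flush()
    os.fsync(f.fileno())        # make sure the bytes reach the device
os.remove(path)                 # "delete": only the directory entry goes away

# Scan the raw device for the marker, the way a forensic carving tool would.
found = False
tail = b""
with open(DEVICE, "rb") as dev:
    while True:
        chunk = dev.read(1 << 20)
        if not chunk:
            break
        if MARKER in tail + chunk:   # keep a tail so a match split across reads is caught
            found = True
            break
        tail = chunk[-len(MARKER):]

print("marker still present on raw device:", found)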

IV. EXISTING SOLUTIONS

A number of methods have been used to eliminate or minimize residual data. Wei et al. [16] produced an interesting study of this issue, including an experimental comparison of several techniques. The cloud was not their concern, and some of these solutions, as we will see, are not relevant in that environment (Table I).

A. Sanitization

Sanitization, also called purging, refers to the removal of sensitive data from a storage device so that they cannot be recovered with any known method or technique. NIST published Special Publication 800-88 [17], a guideline that contains general procedures for performing data sanitization. Specific methods can be applied, such as wiping, which consists of overwriting the medium with new data (usually a sequence of zeros or random values). Nevertheless, some areas of the medium may remain inaccessible and, although this technique is generally sufficient against standard system functions, substantial data could still be retrieved with specialized programs. Degaussing is another common technique; it targets only magnetic media and uses a degausser, a device that generates a magnetic field strong enough to purge this kind of media deeply and efficiently [18]. However, degaussing equipment may pose risks when it is used inappropriately (the media may be removed before the degaussing is complete) or when it fails (preventive maintenance should be performed regularly). These methods require physical access to the media, which the DO does not have, so they must be carried out by the CSP. Unfortunately, most providers either do not include this feature in the SLA or merely claim that they follow the NIST recommendations [19]. A sanitization of virtual machine images could also be performed [20], which might allow the DO to carry it out (in association with the CSP, or not); this is an interesting lead for further research.

B. Encryption

Data encryption is a powerful way of protecting data: residual encrypted information is obviously useless. In Virtual Private Storage (VPS), encryption and decryption take place on a private server (where the keys are protected from disclosure) that communicates with the cloud. Encryption and sanitization may also be combined, by purging only the disk areas that contain the keys, in order to improve and accelerate the process (a sketch of this idea is given after subsection C). Nevertheless, encrypted data cannot be processed in the cloud; they have to be decrypted first. This is a major limitation and, although Gentry [21] addressed the issue with fully homomorphic encryption (FHE), current schemes remain insufficient in practice.

C. Destruction

There are many techniques for destroying a storage medium: physical breaking (shredding), chemical altering (incineration or corrosion), vaporization, liquefaction, raising the temperature (e.g. for magnetic media), electromagnetic fields... It should be noted that even a small piece of the medium might still contain information, so data recovery is sometimes possible after physical destruction. Although it is the most effective method, destruction is naturally not possible in a cloud environment, as the DO does not own the storage media.
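The following Python sketch illustrates the encryption approach of subsection B combined with the "purge only the keys" idea, sometimes called crypto-erasure: the object handed to the CSP is ciphertext produced on the client (or on a VPS), and destroying the locally held key is enough to make any remnants of that object on the provider's disks unreadable. It assumes the third-party Python cryptography package; the key-store layout and identifiers are illustrative only, not part of any scheme surveyed here.

# Sketch of client-side encryption with crypto-erasure (not a production design).
# Requires the third-party 'cryptography' package: pip install cryptography
from cryptography.fernet import Fernet

def encrypt_for_upload(plaintext: bytes) -> tuple[bytes, bytes]:
    """Return (key, ciphertext). Only the ciphertext is sent to the CSP."""
    key = Fernet.generate_key()              # kept on the DO side or on a VPS, never in the cloud
    return key, Fernet(key).encrypt(plaintext)

def crypto_erase(key_store: dict, object_id: str) -> None:
    """'Delete' a remote object by destroying its key; residual ciphertext becomes useless."""
    key_store.pop(object_id, None)

# Usage (upload/download to the CSP is not shown)
key_store: dict[str, bytes] = {}
key, blob = encrypt_for_upload(b"confidential financial record")
key_store["obj-001"] = key
# ... later, instead of asking the CSP for a deep sanitization of its disks:
crypto_erase(key_store, "obj-001")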

TABLE I. DATA REMANENCE COUNTERMEASURES COMPARISON

Method | Concept | Drawbacks
Overwriting | Writing, with a software tool, a sequence of zeros, ones or random data on all sectors of a hard disk. | Not efficient on solid-state or USB flash drives. Not tailored to a virtual and dynamic environment such as the cloud.
Degaussing | Using special equipment (a degausser) to remove or reduce the magnetic fields on a drive. | Limited to magnetic drives. May render the media inoperable.
Encryption | Encrypting data before storing them on cloud servers; keys are kept on a local or virtual private server. | Difficult key management. Encrypted data cannot be processed in the cloud.
Destruction | Using physical or chemical destruction techniques. | Simply not applicable in a cloud environment.

Sanitization programs may follow standards issued by governments and organizations in order to strengthen their overwriting process [22] (Table II). As noted above, cloud providers do not address the data remanence issue as thoroughly as needed; there is a lack of options and measures. For instance, Amazon claim that they wipe physical disks before assigning them to a new customer, in accordance with DoD 5220.22-M or NIST 800-88 [19]. Microsoft state that their cloud platform complies with NIST standards, although no further details are given. In Salesforce data centers, data are retained in an inactive status for 180 days, plus a transition period of up to 30 days, after which customer information is overwritten or deleted, but it is also kept on backup media for an additional 90 days [23] (yet no standard is reported). IBM also follow the NIST guidelines, as well as ISO/IEC 27001, when they execute a sanitization process upon customer request or service termination [24].

TABLE II. OVERWRITING STANDARDS

Overwriting standard | Rounds | Patterns
U.S. Navy Staff Office Publication NAVSO P-5239-26 | 3 | A character, its complement, random
U.S. Air Force System Security Instruction 5020 | 3 | All zeros, all ones, any character
Bruce Schneier's algorithm | 7 | All ones, all zeros, pseudo-random sequence five times
U.S. DoD Unclassified Computer Hard Drive Disposition | 3 | A character, its complement, another pattern
German Federal Office for Information Security | 2 | Non-uniform pattern, its complement
Communications Security Establishment Canada ITSG-06 | 3 | All ones or zeros, its complement, a pseudo-random pattern
Australian Government ICT Security Manual 2014 - Controls | 1 | Random pattern
British HMG Infosec Standard 5, Enhanced Standard | 3 | All ones, all zeros, random
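For illustration, the following Python sketch implements a simple three-pass file overwrite (a fixed character, its complement, then random data), in the spirit of the NAVSO P-5239-26 row of Table II. It is only a sketch under those assumptions, not a certified implementation of any listed standard, and, as Table I points out, software overwriting is unreliable on solid-state drives and on virtualized cloud storage, where the DO does not control the physical blocks.

# Three-pass overwrite sketch (character, complement, random) before deletion.
# Illustrative only: gives no guarantee on SSDs, copy-on-write file systems or cloud VMs.
import os

def three_pass_overwrite(path: str, block: int = 1 << 20) -> None:
    size = os.path.getsize(path)
    patterns = [b"\x55" * block, b"\xaa" * block, None]   # 0x55, its complement 0xAA, then random
    with open(path, "r+b") as f:
        for pattern in patterns:
            f.seek(0)
            remaining = size
            while remaining > 0:
                n = min(block, remaining)
                f.write(pattern[:n] if pattern is not None else os.urandom(n))
                remaining -= n
            f.flush()
            os.fsync(f.fileno())   # push each pass to the device before starting the next
    os.remove(path)

# three_pass_overwrite("/tmp/sensitive.tmp")   # example call on a throwaway file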

V. CONCLUSION

In this paper, we gave an overview of the state of the art on data security and residual data, and we noted the difficulty of performing sanitization in a cloud environment. Is it more practical to implement a solution for remotely purging the VMs created and allocated by the CSP, in order to ensure sufficient protection, or should one make do with simply auditing how the provider manages this issue? Neither is an easy task: the logs related to all customers are stored in the same pool, especially in a public cloud, and auditing techniques are hardly applicable in an infrastructure as dynamic as the cloud.

REFERENCES

[1] P. Mell and T. Grance, "The NIST Definition of Cloud Computing", NIST Special Publication 800-145 (Draft), retrieved September 10, 2011, from http://csrc.nist.gov/publications/drafts/800-145/Draft-SP800-145 cloud definition.pdf
[2] T. Mather, S. Kumaraswamy and S. Latif, Cloud Security and Privacy, O'Reilly, Sept. 2009.
[3] W. Jansen and T. Grance, Guidelines on Security and Privacy in Public Cloud Computing, NIST (National Institute of Standards and Technology), U.S. Department of Commerce, Dec. 2011.
[4] RightScale 2016 State of the Cloud Report, http://www.rightscale.com/blog/cloud-industry-insights/cloud-computing-trends-2016-state-cloud-survey
[5] T. Dillon, C. Wu and E. Chang, "Cloud Computing: Issues and Challenges", 24th IEEE International Conference on Advanced Information Networking and Applications, 2010.
[6] V. Ciriani, S. De Capitani di Vimercati, S. Foresti, S. Jajodia, S. Paraboschi and P. Samarati, "Combining fragmentation and encryption to protect privacy in data storage", ACM Transactions on Information and System Security (TISSEC), 2010.
[7] G. Ateniese, K. Fu, M. Green and S. Hohenberger, "Improved Proxy Re-Encryption Schemes with Application to Secure Distributed Storage", ACM Transactions on Information and System Security, vol. 9, no. 1, Feb. 2006, pp. 1-30.
[8] S. Yu, C. Wang, K. Ren and W. Lou, "Achieving Secure, Scalable and Fine-grained Data Access Control in Cloud Computing", Proc. ACM Workshop on Computer Security Architecture (CSAW'07), USA, Nov. 2007.
[9] S. De Capitani di Vimercati, S. Foresti, S. Jajodia, S. Paraboschi and P. Samarati, "Over-encryption: Management of Access Control Evolution on Outsourced Data", Proc. 33rd International Conference on Very Large Data Bases (VLDB'07), Vienna, Austria, 2007, pp. 123-134.
[10] Hota, S. Sanka, M. Rajarajan and S. K. Nair, "Capability-Based Cryptographic Data Access Control in Cloud Computing", Int. J. Advanced Networking and Applications, vol. 03, pp. 1152-1161, 2011.
[11] K. Aissaoui, H. Belhadaoui, A. Zakari and M. Rifi, "Data Security and Access Management in Cloud Computing: Capability List-based Cryptography", International Journal of Computer Science and Information Security (IJCSIS), vol. 14, no. 11, Nov. 2016.
[12] J. Lopez, R. Oppliger and G. Pernul, "Authentication and authorization infrastructures (AAIs): a comparative survey", Computers & Security, Elsevier, 2004.
[13] "A Guide to Understanding Data Remanence in Automated Information Systems", National Computer Security Center, Sept. 1991.
[14] S. Skorobogatov, "Data Remanence in Flash Memory Devices", International Workshop on Cryptographic Hardware and Embedded Systems (CHES), 2005, pp. 339-353.
[15] B. AlBelooshi, K. Salah, T. Martin and E. Damiani, "Experimental Proof: Data Remanence in Cloud VMs", IEEE 8th International Conference on Cloud Computing, 2015.
[16] M. Wei, L. M. Grupp, F. E. Spada and S. Swanson, "Reliably Erasing Data From Flash-Based Solid State Drives", 9th USENIX Conference on File and Storage Technologies (FAST), 2011.
[17] NIST, Special Publication 800-88: Guidelines for Media Sanitization, Sept. 2012.
[18] P. Gutmann, "Secure Deletion of Data from Magnetic and Solid-State Memory", Proc. 6th USENIX Security Symposium, San Jose, California, Jul. 1996.
[19] Amazon Web Services, "Overview of Security Processes", Jun. 2004.
[20] S. N. Chari and A. Kundu, "Sanitization of virtual machine images", International Business Machines Corporation, patent application US20150033223 A1, Jan. 2015.
[21] C. Gentry, "Fully Homomorphic Encryption Using Ideal Lattices", Proc. Symposium on the Theory of Computing (STOC), 2009, pp. 169-178.
[22] Wikipedia, "Data erasure", https://en.wikipedia.org/wiki/Data_erasure
[23] Salesforce, "Salesforce Security, Privacy, and Architecture", December 10, 2015.
[24] IBM, "Data Security and Privacy Principles: IBM Cloud Services".
