Reliable and Efficient Storage Solution using Multiple Cloud Services

ISSN: 2249-5789 — Shubham Singh et al., International Journal of Computer Science & Communication Networks, Vol. 7(3), 52-57

Shubham Singh, Gaurang Raval, Sharada Valiveti
Institute of Technology, Nirma University, Ahmedabad
Email: [email protected], [email protected], [email protected]

Abstract— As the popularity of cloud storage services increases, reliable data storage methods play an increasingly significant role in managing data over the long term. When data is stored with a single cloud service provider, a failure or outage of that provider can leave the organization non-functional. A cloud of clouds is a method to reduce the limitations of an individual cloud. It improves the availability and integrity of data kept in the cloud by encrypting, encoding, and replicating the information across the different cloud services that make up the cloud of clouds. We implement our system with four open source cloud storage services in a local network. We observed that our technique enhances the perceived availability in the majority of cases when compared with a single cloud provider. The service cost is higher than that of using an individual cloud, but the efficiency and reliability are far superior to a single cloud service. The experimentation was performed in a local environment.
Keywords: Network File System (NFS), Reed-Solomon (RS), Erasure Coding (EC), Secure Cloud Storage System (SCSS)

1. INTRODUCTION
Cloud computing is a type of internet-based computing that gives shared computing resources and information to personal computers and other devices on demand. It is a method for providing computing devices with on-demand access to a pool of configurable resources that can be quickly provisioned and released with minimal administration [1]. The cloud basically provides three main services:
1. Infrastructure as a Service: the infrastructure is given to the client, ordinarily as virtual machines connected to the network, and it adapts to suit the client's requirements.
2. Platform as a Service: the platform is given to the client, with the operating system and support for the required programming languages.
3. Software as a Service: this type of offering is typically positioned as "software on demand"; the software is deployed on remote servers, the client accesses it over the Internet, and all updates and licenses for the software are administered by the service provider. Payment for this service is fixed according to the number of devices needing the service.

Cloud computing provides many facilities, such as storing, managing, and processing data at a lower cost than local server deployment. When a single cloud does not fulfil the reliability requirements, multiple clouds can be applied to fulfil them and to reduce the limitations of a single cloud [2]. A cloud of clouds has many advantages, such as immunity against storage outages, surviving datacentre failures, avoiding vendor lock-in, and tolerance to data corruption. If more than one cloud is used to store critical data, the data is preserved even when a limited number of cloud services are non-functional [3]. This paper examines the coding technique applied for storage efficiency and reliability. The concept of erasure coding is applied while coding the file, and the encoded file pieces are scattered across different cloud services. The following section presents the literature survey, after which the proposed approach is described along with results and analysis. The paper ends with concluding remarks and future work.

2. LITERATURE SURVEY
DepSky [4] is a framework that enhances the availability, confidentiality, and integrity of information stored in the cloud. It achieves this objective by encrypting, encoding, and replicating the information across a set of different clouds, which forms a cloud of clouds. It is a reliable and fail-safe framework that addresses some vital restrictions of cloud computing for data storage in a cooperative way. It considers three kinds of parties: readers, writers, and four cloud storage service providers. Readers can fail in arbitrary ways, while writers only fail by crashing. DepSky overcomes loss of availability, and loss of privacy through encryption keys, and it stores only a fraction of the total amount of information in every cloud by using erasure codes. Applications of the cloud of clouds include critical data storage and content distribution; it distributes data among different cloud storage service providers and also supports accessibility from anywhere. Figure 2.1 shows the architecture of DepSky. DepSky reduces the limitations of a single cloud through a set of efficient Byzantine quorum protocols, erasure coding, secret sharing, and cryptography applied across the multiple clouds.


Figure 2.1: DepSky architecture [4]

Reed-Solomon erasure code [5] is a highly popular code applied in storage and communication systems to handle burst errors that occur while accessing data from storage or during communication. The following diagram shows how an RS code is applied to a data file so that it can be recovered easily when a few sites fail.

Figure 2.2: Enhanced erasure-code-based security system [5]

The RS code based encoding and decoding mechanism works as follows.
Step 1. Partition the original data file 'F' into 'k' parts (F1, F2, ..., Fk).
Step 2. Encode these 'k' parts into 'n' fragments (S1, S2, ..., Sk, Sk+1, ..., Sn).
Step 3. Encrypt the 'n' fragments into encrypted ones (E1, E2, ..., Ek, Ek+1, ..., En) and scatter them to 'n' distinct storage nodes.
Step 4. Retrieve any 'k' fragments from the 'n' encrypted ones and decrypt them into 'k' plaintext fragments (D1, D2, ..., Dk).
Step 5. Recompute the original data file 'F' using the RS decoding capabilities.

Given a data file, a Reed-Solomon erasure code first partitions it into 'k' pieces of similar size and then encodes them into 'n' coded pieces. Any 'k' pieces retrieved from the 'n' pieces can be used to rebuild the initial file or data. Such a code is usually denoted an (n, k) RS code, and it can survive n - k site failures; for example, a (6, 4) code, as used later in this paper, tolerates the loss of any two pieces. In [6] the authors have applied a security mechanism along with erasure coding for protecting data stored on multiple clouds.
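To make the flow above concrete, the sketch below encodes a byte buffer into four data shards plus two parity shards, matching the configuration used later in this paper. The paper only states that a Java Reed-Solomon library was used, so this sketch assumes a library with an API along the lines of Backblaze's open-source JavaReedSolomon; the class and method names are assumptions, not the authors' actual code.

import com.backblaze.erasure.ReedSolomon;   // assumed library; any RS library with a similar API works

public class RsEncodeSketch {
    static final int DATA_SHARDS = 4;      // k: equal divisions of the file
    static final int PARITY_SHARDS = 2;    // n - k: redundant shards
    static final int TOTAL_SHARDS = DATA_SHARDS + PARITY_SHARDS;

    /** Splits the file bytes into k equal shards, then computes n - k parity shards. */
    static byte[][] encode(byte[] file) {
        // Pad implicitly so the shard size covers the whole file.
        int shardSize = (file.length + DATA_SHARDS - 1) / DATA_SHARDS;
        byte[][] shards = new byte[TOTAL_SHARDS][shardSize];
        for (int i = 0; i < DATA_SHARDS; i++) {
            int from = i * shardSize;
            int len = Math.min(shardSize, Math.max(0, file.length - from));
            if (len > 0) System.arraycopy(file, from, shards[i], 0, len);
        }
        // Fill the last PARITY_SHARDS rows from the first DATA_SHARDS rows.
        ReedSolomon codec = ReedSolomon.create(DATA_SHARDS, PARITY_SHARDS);
        codec.encodeParity(shards, 0, shardSize);
        return shards;    // shards[0..3] = data, shards[4..5] = parity
    }

    public static void main(String[] args) {
        byte[][] shards = encode("example file contents".getBytes());
        System.out.println("produced " + shards.length + " shards of "
                + shards[0].length + " bytes each");
    }
}

Each of the six shards can then be encrypted and written to a different storage node, as in Step 3 above.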

A secure decentralized erasure code for distributed networked storage [7] is another variation of the application of RS codes for storage efficiency and reliability; the authors' approach is shown in figure 2.3. It addresses the security problem of distributed storage systems: data stored in the system should remain private even if all storage servers in the system are compromised. The authors developed a secure decentralized erasure code that is robust, confidential, and has a small storage cost.

Figure 2.3: Miniature of a distributed networked storage system [7]

In HopsFS [8], an Erasure Coding Manager runs on the leader NameNode, overseeing file encoding and file repair operations, and executing a policy that places file blocks on DataNodes. It ensures that, in the case of a DataNode failure, the affected files can still be repaired.

SCFS [9] is a shared cloud-backed file system that provides strong consistency even on top of eventually consistent cloud storage services. It offers a POSIX-like interface and a pluggable back-end that permits it to work with a single cloud or with a cloud of clouds. It addresses several problems: most cloud-backed file systems do not support controlled file sharing among all clients; some use a proxy to connect to the cloud, which is a single point of failure, since if the proxy is down no client can access data from the cloud; and most file systems use a single cloud as the back-end.

Figure 2.4: Architecture of SCFS [9]

Figure 2.4 represents the architecture of the shared cloud-backed file system with its three main components: the back-end cloud storage for storing the data; the coordination service for managing the metadata and supporting synchronization; and the SCFS Agent, which implements most of the SCFS functionality and corresponds to the file-system client mounted on the client machine.


The division of file data and metadata has frequently been used to permit parallel access to files in parallel file systems; SCFS exploits the same idea. In [10] the authors have applied the concept of RS codes to implement reliable file transfer on top of the IoT protocol stack, especially for CoAP (Constrained Application Protocol). The researchers in the literature surveyed above have experimented with external cloud services. As storage devices are getting cheaper day by day and computing devices are available at reasonable cost, establishing multiple private cloud services within a local network can also be beneficial with respect to storage efficiency and the latency involved in external cloud services. The next section discusses the proposed approach, followed by implementation and result analysis.

3. PROPOSED WORK
In this section, the proposed work is described in detail: how the file is broken up and distributed, and how the file is recovered after loss or corruption.

Figure 3.1: Cloud of clouds with four cloud service providers

The figure above gives the idea of the proposal. We take four open source cloud storage services, install them on separate Linux-based machines, and develop a single interface through which a file is distributed to all of the cloud storage services; if some of the cloud storages are corrupted, the original file can be regenerated from the remaining cloud storage services. All the implementations were experimented with in a local network.

For data reliability, Reed-Solomon coding is among the best coding techniques. It is an error-correcting code able to detect and correct multiple symbol errors. We required an easy, reliable, and efficient Java library to do Reed-Solomon coding. The Reed-Solomon processing is divided into two parts, (1) encoding and (2) decoding, and the code was run in Eclipse. During the encoding process, the file is divided into six parts, where four blocks are equal divisions of the original file and the last two blocks are parity blocks. On the decoding side, all six blocks are taken and the original file is generated. If any of the block files are deleted after encoding, the decoding code re-creates the deleted blocks using the inverse of the coding matrix, so Reed-Solomon coding provides reliability: if a server goes down or some data is deleted on the server side, the original file can still be recreated, as sketched below.
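As a concrete illustration of this recovery step, the sketch below rebuilds up to two missing shards and reassembles the file. As in the earlier encoding sketch, it assumes a Java Reed-Solomon library with a Backblaze-style API (the paper does not name the library it used), so the class and method names are assumptions.

import com.backblaze.erasure.ReedSolomon;   // assumed library, as in the encoding sketch

public class RsDecodeSketch {
    static final int DATA_SHARDS = 4, PARITY_SHARDS = 2;
    static final int TOTAL_SHARDS = DATA_SHARDS + PARITY_SHARDS;

    /**
     * shards[i] is null when shard i could not be fetched (at most two may be missing).
     * Returns the first originalLength bytes of the reconstructed file.
     */
    static byte[] decode(byte[][] shards, int shardSize, int originalLength) {
        boolean[] present = new boolean[TOTAL_SHARDS];
        int available = 0;
        for (int i = 0; i < TOTAL_SHARDS; i++) {
            if (shards[i] != null) { present[i] = true; available++; }
            else shards[i] = new byte[shardSize];            // space for the rebuilt shard
        }
        if (available < DATA_SHARDS)
            throw new IllegalStateException("need at least " + DATA_SHARDS + " shards");

        // Rebuild the missing shards; internally this inverts the coding matrix.
        ReedSolomon codec = ReedSolomon.create(DATA_SHARDS, PARITY_SHARDS);
        codec.decodeMissing(shards, present, 0, shardSize);

        // Concatenate the data shards and strip any padding added during encoding.
        byte[] file = new byte[originalLength];
        for (int i = 0; i < DATA_SHARDS; i++) {
            int from = i * shardSize;
            int len = Math.min(shardSize, originalLength - from);
            if (len > 0) System.arraycopy(shards[i], 0, file, from, len);
        }
        return file;
    }
}

Because any four of the six shards suffice, the file survives the simultaneous loss of any two of the four cloud storage services.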


The proposed work was evaluated by storing the encoded file pieces on a single machine, by storing the pieces on different servers using the Network File System (NFS), and by scattering the pieces across four open source cloud storage services, namely Tonido, Seafile, SynC, and Syncthing.

4. IMPLEMENTATION
In the experimentation, four clouds are set up and the files are sent to all four clouds, with and without encryption; all of these clouds are hosted on local network servers. In another scenario, all files are stored on the single machine from which the client is executed. In the third scenario, all files are placed on Linux-based servers using the NFS service; here the client runs an NFS client application while dispersing the encoded and encrypted file pieces. The time taken to store and retrieve the files was measured in all three scenarios, with and without encryption.

We briefly summarize the open source cloud storage services. Tonido is an open source cloud storage service. Its installation process is very simple; after installing Tonido an account must be registered, which provides a unique user ID through which the tool is accessed, and the encoded file pieces can then be shared. The file can afterwards be accessed from any system on the same local network. Seafile is a file-hosting cloud storage system for storing files. Files and data can be synchronized with personal computers and mobile phones, or managed through the server's web interface, and files can be encrypted with a password chosen by the user. Seafile provides free storage, is easy to use, and is available for Windows, Linux, and other platforms. SynC is an open source system that provides a storage service. It backs up all data in one place, protects it from deletion, and provides end-to-end encryption and end-to-end file synchronization; it maintains datacentre facilities to guarantee that files are always available. It is free, its installation process is easy, and it is available for Mac, Linux, Windows, and Android. All four services, Tonido, Seafile, SynC, and Syncthing, were installed on four different servers to hold the encoded file pieces for later retrieval.
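To illustrate the dispersal step of the implementation, the sketch below writes the six shards produced by the encoder into four locally mounted sync folders, optionally AES-encrypting each shard first (mirroring the "with encryption" scenario). The mount paths, key handling, and round-robin placement are illustrative assumptions, not the authors' exact setup; only standard java.nio and javax.crypto APIs are used.

import javax.crypto.Cipher;
import javax.crypto.spec.IvParameterSpec;
import javax.crypto.spec.SecretKeySpec;
import java.nio.file.*;

public class DisperseSketch {
    // Hypothetical mount points: four cloud sync folders (or NFS mounts) on the local network.
    static final Path[] STORES = {
        Paths.get("/mnt/tonido/shards"),
        Paths.get("/mnt/seafile/shards"),
        Paths.get("/mnt/sync/shards"),
        Paths.get("/mnt/syncthing/shards")
    };

    /** Optionally AES-encrypts each shard, then writes it round-robin across the four stores. */
    static void disperse(byte[][] shards, String fileId, byte[] key, byte[] iv, boolean encrypt)
            throws Exception {
        for (int i = 0; i < shards.length; i++) {
            byte[] payload = shards[i];
            if (encrypt) {
                Cipher aes = Cipher.getInstance("AES/CBC/PKCS5Padding");
                aes.init(Cipher.ENCRYPT_MODE,
                         new SecretKeySpec(key, "AES"), new IvParameterSpec(iv));
                payload = aes.doFinal(payload);
            }
            Path store = STORES[i % STORES.length];   // shard i+4 lands with shard i
            Files.createDirectories(store);
            Files.write(store.resolve(fileId + ".shard" + i), payload);
        }
    }
}

With this layout, losing any two of the four stores still leaves at least four shards available for reconstruction.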

5. RESULT AND ANALYSIS
In this section, we analyse our results and compare them with existing methods. The tables and graphs below describe the time taken to encode and decode files of different sizes, and the time taken to distribute the files to multiple locations.

Table 5.1 shows the time taken to encode different file sizes, i.e., the time for the Reed-Solomon encoding method to break the file into six parts; a 10 MB file takes 172 ms, a 100 MB file takes 1299 ms, and so on.

Table 5.1: Time taken to encode different file sizes (approach: Encoding)
File (MB)   Time (ms)
10          172
100         1299
200         2255
300         3256
400         4563
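For reference, a minimal way to obtain timings like those in Table 5.1 is to wrap the encoding call in a nanosecond timer. The snippet below reuses the hypothetical RsEncodeSketch class from the earlier sketch and a placeholder input file name; both are assumptions for illustration.

import java.nio.file.Files;
import java.nio.file.Paths;

public class EncodeTiming {
    public static void main(String[] args) throws Exception {
        // Hypothetical test file; the experiments used files of 10-400 MB.
        byte[] file = Files.readAllBytes(Paths.get("testfile_100MB.bin"));

        long start = System.nanoTime();
        byte[][] shards = RsEncodeSketch.encode(file);   // 4 data + 2 parity shards
        long elapsedMs = (System.nanoTime() - start) / 1_000_000;

        System.out.println("encoded " + file.length + " bytes into "
                + shards.length + " shards in " + elapsedMs + " ms");
    }
}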

Figure 5.1: Time taken to encode different file sizes

Table 5.2 shows the time taken to decode different file sizes, i.e., the time for the Reed-Solomon decoding method to recover the original file when one of the six parts is deleted.

Table 5.2: Time taken to decode the file when one part is deleted (approach: Decoding)
File (MB)   Time (ms)
10          135
100         1120
200         2011
300         3108
400         42626

Table 5.3 shows the corresponding decoding time when two of the six parts are deleted.

Table 5.3: Time taken to decode the file when two parts are deleted (approach: Decoding)
File (MB)   Time (ms)
10          243
100         1960
200         3521
300         5322
400         7001

Figure 5.2: Comparison of time taken to encode and decode different file sizes

As the graph shows, the time taken to decode a file when one part is deleted is less than the time taken when two parts are deleted.

5.3 Time taken to distribute the file locally
Table 5.4 shows the time taken to distribute different file sizes using the DepSky [4] approach, i.e., the time to distribute the encoded file, without encryption, to different folders stored on the local machine.

Table 5.4: Time taken to distribute the file without encryption locally (approach: Local)
File (MB)   Time (ms)
10          90
100         470
200         780
300         1240
400         1766

Table 5.5 shows the same measurement when the file pieces are distributed with encryption to different folders on the local machine.

Table 5.5: Time taken to distribute the file with encryption locally (approach: Local)
File (MB)   Time (ms)
10          210
100         858
200         1341
300         2040
400         2733

Figure 5.3: Comparison of time taken to distribute different file sizes locally, with and without encryption

The graph above compares the time taken to distribute the file to different folders on the local machine with encryption and without encryption.

Table 5.6 shows the time taken to distribute different file sizes using the DepSky approach, with encryption, on the network file system.

Table 5.6: Time taken to distribute the file with encryption on the NFS server (approach: NFS)
File (MB)   Time (ms)
10          230
100         985
200         1520
300         2230
400         2863

Table 5.7 shows the corresponding time taken to distribute the file without encryption on the network file system.

Table 5.7: Time taken to distribute the file without encryption on the NFS server (approach: NFS)
File (MB)   Time (ms)
10          170
100         715
200         1140
300         1760
400         2165

Figure 5.4: Comparison of time taken to distribute different file sizes on NFS, with and without encryption

The graph above compares the time taken to distribute the file on the network file system with encryption and without encryption.

5.5 Time taken to distribute the file on Amazon
Table 5.8 shows the time taken to distribute different file sizes using the DepSky approach, with encoding, on the Amazon cloud service. The time taken depends on the actual Internet conditions at the time of experimentation and on the bandwidth available to the end user.

Table 5.8: Time taken to distribute the file with encryption on Amazon (approach: Internet)
File (MB)   Time (ms)
10          43000
100         530000
200         1060000
300         1760000
400         2640000

Next we analyse the time taken to upload and download files with the open source cloud storage services Tonido, Seafile, SynC, and Syncthing. Table 5.9 shows the upload and download times using the Seafile service on a local-network server.

Table 5.9: Time taken to upload and download different file sizes in Seafile
File (MB)   Upload (ms)   Download (ms)
10          2320          1070
100         20200         11100
200         38300         21200
400         68000         41000
1000        147600        103000

Table 5.10 shows the upload and download times using the Tonido service on a local-network server.

Table 5.10: Time taken to upload and download different file sizes in Tonido
File (MB)   Upload (ms)   Download (ms)
10          2120          172
100         18300         9200
200         36700         17600
400         67100         33200
1000        132400        111200

Tables 5.11 and 5.12 show the upload and download times using the Syncthing and SynC services on local-network servers for various file sizes.

Table 5.11: Time taken to upload and download different file sizes in Syncthing
File (MB)   Upload (ms)   Download (ms)
10          2230          1156
100         22300         10360
200         43600         19620
400         88200         37240
1000        178400        132300

Table 5.12: Time taken to upload and download different file sizes in SynC
File (MB)   Upload (ms)   Download (ms)
10          2010          1051
100         19100         9510
200         39300         18020
400         76100         35300
1000        148200        109000

Figure 5.5 shows how much time is taken to upload different file sizes for each cloud storage service provider. We analysed four different cloud storage services: Seafile, Tonido, Syncthing, and SynC. As shown in the figure, Tonido takes less time than all the other cloud services.

Figure 5.5: Comparison of time taken to upload different file sizes

6. CONCLUSION AND FUTURE SCOPE
In this paper, we presented the design and evaluation of a cloud-of-clouds approach, a strategy that enhances the availability and confidentiality provided by multiple cloud services. The framework accomplishes these goals by building a cloud of clouds on top of a set of storage clouds, combining cryptography and erasure codes with the quality of service offered by the individual cloud services. The key conclusion is that, when installed in a local network, the approach provides fast data transfer with minimum latency and the same reliability and confidentiality as provided by DepSky, SCFS, and similar implementations. In future, the work can be extended with the application of more complex RS codes involving a larger number of nodes during the encoding and decoding of the files that are scattered across the nodes in the network.

REFERENCES
[1] H. Abu-Libdeh, L. Princehouse, and H. Weatherspoon, "RACS: a case for cloud storage diversity", in Proceedings of the 1st ACM Symposium on Cloud Computing, pp. 229-240, ACM, 2010.
[2] H. C. Chen, Y. Hu, P. P. Lee, and Y. Tang, "NCCloud: A network-coding-based storage system in a cloud-of-clouds", IEEE Transactions on Computers, vol. 63, no. 1, pp. 31-44, 2014.
[3] Y. Ma, T. Nandagopal, K. P. Puttaswamy, and S. Banerjee, "An ensemble of replication and erasure codes for cloud file systems", in INFOCOM, 2013 Proceedings IEEE, pp. 1276-1284, IEEE, 2013.
[4] A. Bessani, M. Correia, B. Quaresma, F. Andre, and P. Sousa, "DepSky: dependable and secure storage in a cloud-of-clouds", ACM Transactions on Storage (TOS), vol. 9, no. 4, 2013.
[5] J. Li and B. Li, "Erasure coding for cloud storage systems: a survey", Tsinghua Science and Technology, vol. 18, no. 3, pp. 259-272, 2013.
[6] W. Wang, P. Li, L. Han, S. Huang, K. Xu, C. Yu, and J. Lei, "An enhanced erasure code-based security mechanism for cloud storage", Mathematical Problems in Engineering, vol. 2014, 2014.
[7] H.-Y. Lin and W.-G. Tzeng, "A secure decentralized erasure code for distributed networked storage", IEEE Transactions on Parallel and Distributed Systems, vol. 21, no. 11, pp. 1586-1594, 2010.
[8] A. Bessani, J. Brandt, M. Bux, V. Cogo, L. Dimitrova, J. Dowling, A. Gholami, K. Hakimzadeh, M. Hummel, M. Ismail, et al., "BiobankCloud: a platform for the secure storage, sharing, and processing of large biomedical data sets", in VLDB Workshop on Big Graphs Online Querying, pp. 89-105, Springer, 2015.
[9] A. N. Bessani, R. Mendes, T. Oliveira, N. F. Neves, M. Correia, M. Pasin, and P. Verissimo, "SCFS: A shared cloud-backed file system", in USENIX Annual Technical Conference, pp. 169-180, 2014.
[10] G. Raval and F. Suthar, "Securing Application Layer Protocol for IoT", International Journal of Computer Science and Communications, vol. 7, no. 2, pp. 42-48, 2016.
