Extending Personal Computer Storages for Improved Access over Web Interfaces
W. A. A. S. Wickramarachchi, V. G. Mallawaarachchi Department of Computer Science & Engineering University of Moratuwa Moratuwa, Sri Lanka {anuradhawick, vijini}@cse.mrt.ac.lk
Abstract— At present, sharing of files and other media is a common use case which involves cloud storage services. However, the privacy of cloud services has become questionable as certain free cloud service providers use personal content for analytical purposes. Also, in order to share content over cloud storages, local files must be uploaded to cloud services, even for minor use cases such as directory browsing. This paper presents a novel solution which provides similar sharing capabilities to that of existing cloud services without having to store content in a cloud storage. The solution enables direct content browsing, uploading and downloading using a web interface. The solution also provides the capability to generate links for locally stored content in desktop or personal computers. The presented solution utilizes peer-to-peer networking technologies which are scalable and more secure. Furthermore, the performance measures indicate competitive results in comparison with existing cloud service providers for content sharing. Keywords-File sharing; Cloud storage; Privacy; Networking; Peer-to-peer; Security
I. INTRODUCTION
Content sharing over the Internet has become a common practice among the general public due to its convenience and the ability to reach a wide audience. The most common means of sharing content are cloud storage services such as Google Drive [1] and Dropbox [2], or synchronization protocols such as Resilio Sync [3]. Cloud storage services require data to be either uploaded to the cloud or synchronized so that the external party can download the content. Furthermore, synchronization protocols require specific software to perform the synchronization. Hence, sharing files through peer-to-peer synchronization is a complicated task with a poor user experience: even to share a small file, the user must install and configure the specified software.
Although cloud storage services address the problem of content sharing, they have their own disadvantages and security weaknesses. Most cloud services go through user data and extract information for analytical purposes. Certain cloud services deploy deduplication schemes in order to reduce the storage space actually used by the users. In fact, Google and Dropbox mention these facts in their privacy statements; “the user data will be analysed to provide a better user experience”. Therefore, sharing sensitive content using cloud services is no longer an option, as users cannot afford the risk of compromising the security of their data [4]. Furthermore, the use of synchronization software is inconvenient due to the installation and user account creation it requires. In order to address the issue of privacy in cloud storage, Network-Attached Storage (NAS) devices [5] are widely used. However, NAS requires users to carry a separate device, and these devices do not provide features to share content with remote users.
This paper presents a novel solution which enables users to use their computers as cloud storage services, enabling similar access over the Web as found in commonly available online cloud storage services such as Google Drive and Dropbox. Section II outlines the related work in the fields of cloud storage, content sharing and synchronization, whereas Section III demonstrates the architecture of the implemented system. Section IV explains the implementation carried out and Section V presents the obtained experimental results and their evaluation. Finally, Section VI concludes the paper with the inferences obtained from the results, outlines future work and emphasizes the importance of the conducted research.
II. RELATED WORK
Cloud storage allows users to store data with a cloud service provider rather than on a local system, and to access the stored data via an Internet link [6]. The actual storage is remotely located in a server farm, out of the physical reach of the consumers. Most cloud storage services come with a free plan and extensible paid plans. However, the privacy of storage in cloud services has become questionable, as data mining and information retrieval have become essential requirements of the cloud storage vendors. For example, Google [7] and Dropbox [8] specifically mention that they read and analyse user data and interaction data in order to provide a better service and improved user experience for their clients.
Furthermore, Dropbox uses deduplication schemes [9] in order to save storage and manage the storage utilized by users. This process requires chunk-based reading of user data, which can lead to privacy concerns. In order to address the possibility of compromising the privacy of user data, NAS devices such as Seagate NAS [10] and Western Digital My Cloud [11] have appeared in the market. However, these devices require specific software to be installed and configured before they can operate. Furthermore, these devices must be carried around and powered separately with an external power source before they can be used. Moreover, one could find it difficult to manage and keep content up-to-date between a computer and a NAS device.
The use of a separate NAS or a cloud service can be eliminated by using a peer-to-peer synchronization protocol such as Resilio Sync [3]. However, this requires specific software to be installed by all the parties who intend to share files. Box2Box [12] is another application which enables peer-to-peer file sharing. Having the same software setup requirements as Resilio Sync, this solution is not widely adopted. QuickSync is another cloud storage synchronization mechanism, presented by Cui et al. [13], but the authors’ work is limited to mobile environments. Synchronization requires all the content to be synchronized fully before it can be browsed. This can cause unnecessary transmission of content in a scenario where only a few files need to be shared. Furthermore, the cost of synchronization is greater, as both parties need to spend the same amount of Internet quota regardless of the actual content they intend to view.
The measurement study of peer-to-peer file sharing systems by Saroiu et al. [14] demonstrates the concerns with file sharing in a peer-to-peer manner. The authors’ work indicates that heterogeneity and lack of negotiation between peers can cause problems such as dropped connections and other overheads. Bittorrent [15] is one of the most famous file sharing schemes which supports peer-to-peer file sharing. The work conducted by Pouwelse et al. [16] identifies design issues and performance metrics of this scheme. However, despite the advantages of the Bittorrent scheme, it is not feasible to adopt this scheme to share files in an ad-hoc manner. This scheme requires specific software to download, upload and keep track of the availability of the file.
Therefore, the Bittorrent scheme is inefficient for personal file sharing, although it became very successful in worldwide sharing of media.
Work performed by Chen et al. [17] presents cloud computing security issues such as data protection and proposes unified security measures. Depot, presented by Mahajan et al. [18], is a cloud storage mechanism which minimizes the assumptions made in cloud storage security. Sang-Ho Na et al. [19] present a framework to make personal cloud computing more secure. The paper presented by Bocchi et al. [20] provides a comprehensive benchmark analysis of personal cloud storages. It covers capabilities such as deduplication, which causes privacy issues in several cloud services. Venus by Shraer et al. [21] is another scheme proposed to ensure verification of content in an untrusted cloud environment. It is important to incorporate the knowledge gained by such research work into currently available cloud solutions, to make them more secure and private while preserving performance.
As per the conducted study on related work, there have been tremendous efforts in evaluating, improving and deploying file sharing schemes for global file sharing and personal file sharing. However, the prevalent schemes have the disadvantages of questionable privacy and tedious configuration processes. Furthermore, the majority of the existing solutions that address the issues of privacy require users to purchase external devices such as NAS. Limited research has been conducted on obtaining the features of cloud storage from one’s own computer, that is, on converting the computer itself to provide those features. Moreover, research on the usage of common peer-to-peer technologies to provide cloud storage functionalities, such as link sharing, for personal computers is limited.
III. SYSTEM ARCHITECTURE
The implemented system consists of three major components: the web file explorer, the central server and the clients. Several modes of communication, such as WebSocket message passing, WebRTC [22] JSON streams and WebRTC data channels, are used to communicate content and directory data. Fig. 1 demonstrates the arrangement of the three main components of the system. The clients section consists of the computers that run the client software. The web file explorer component is a web view that provides access to the system from browsers.
The central server mediates the communication between the client machines and the web file explorer of the system. The peer-to-peer architecture for heavy downloads makes sure that the architecture is highly scalable and can handle a large number of online client instances.
Figure 1. Overall architecture of the system.
The architecture of the web file explorer consists of a static web application that includes a connection manager in order to parse the messages sent from the central server. Fig. 1 further elaborates the architecture of the central server. The server maintains two databases in order to serve the users in an efficient manner. A Mongo database [23] is used to store the authentication information and a Redis [24] database is used to manage the tokens issued to the users. Redis is used because its in-memory operation makes token validation, and the relaying of communication between the client program and the web application, efficient. The central server also renders the static HTML content.
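The token handling described above can be illustrated with a minimal sketch. It uses a plain Python dictionary as a stand-in for the Redis store, and the class name `TokenStore`, its methods and the one-hour lifetime are all assumptions made for illustration; the paper does not state how tokens are generated or when they expire.

```python
import secrets
import time


class TokenStore:
    """In-memory stand-in for the Redis token store (hypothetical interface)."""

    def __init__(self, ttl_seconds=3600.0, clock=time.monotonic):
        self._ttl = ttl_seconds      # assumed lifetime; the paper gives none
        self._clock = clock          # injectable clock, which simplifies testing
        self._tokens = {}            # token -> (user ID, expiry time)

    def issue(self, user_id):
        """Create a random token for a user who has already authenticated."""
        token = secrets.token_urlsafe(32)
        self._tokens[token] = (user_id, self._clock() + self._ttl)
        return token

    def validate(self, token):
        """Return the owning user ID, or None if the token is unknown or expired."""
        entry = self._tokens.get(token)
        if entry is None:
            return None
        user_id, expires_at = entry
        if self._clock() >= expires_at:
            del self._tokens[token]  # drop the stale token
            return None
        return user_id
```

Keeping validation to a single key lookup is what makes an in-memory store attractive here, since every relayed command has to pass this check.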
Figure 2. Architecture of the client software.
Fig. 2 demonstrates the architecture of the client software. The client software has a layered architecture with three layers. The topmost layer handles communication, the second layer handles the authentication process and the bottommost layer handles the operations which access the file system API. The metadata database stores the file information required to provide search capabilities efficiently.
IV. IMPLEMENTATION
The implementation of the solution consists of three main components: a web application, the central server and the client application. The communication between the web application and the client application for directory listing and other command submissions occurs through the central server via WebSocket communication. The connection initiation process required by WebRTC is performed by exchanging ICE candidates [22] through the central server. The implementation has several key areas that make the solution more secure and user-friendly.
A. Authentication
The system uses tokens to authenticate file system commands. Fig. 3 demonstrates the communication process between the user and the client application via the central server. The diagram corresponds to the lightweight communication path; heavy transfers, for which WebRTC data channels are used, are not shown. The messages are sent in JSON format. As a user may have multiple desktop computers which use the solution, the target client application ID is sent in each message along with the other parameters. The central server relays these messages between the live WebSocket connections.
Figure 3. Authentication and communication via central server.
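The relaying of JSON messages by target client ID can be sketched as follows. This is only an illustration under stated assumptions: the field name `target`, the class `Relay` and the use of a plain callable in place of a live WebSocket connection are all hypothetical, and token validation is omitted to keep the routing step visible.

```python
import json


class Relay:
    """Routes JSON messages between live connections, keyed by client ID."""

    def __init__(self):
        self._connections = {}       # client ID -> callable that sends text

    def register(self, client_id, send):
        """Record a live connection; `send` stands in for a WebSocket send."""
        self._connections[client_id] = send

    def unregister(self, client_id):
        """Forget a connection once it closes."""
        self._connections.pop(client_id, None)

    def relay(self, raw_message):
        """Forward a message to its target; return False if undeliverable."""
        try:
            message = json.loads(raw_message)
        except json.JSONDecodeError:
            return False
        if not isinstance(message, dict):
            return False
        send = self._connections.get(message.get("target"))
        if send is None:
            return False             # target client is offline or unknown
        send(raw_message)            # the server forwards the body unchanged
        return True
```

Because the server only inspects the routing field and forwards the body unchanged, it never needs to interpret file system commands itself.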
B. Web File Explorer
This is the interface which is exposed over the web, allowing users to browse content and perform file modifications. Web file explorer requests are of several types. The user can send requests for the following operations: directory listing; deleting, creating, moving, renaming, compressing, extracting and copying files and folders; and generating links to share with remote users. The file operations are handled through file system API calls. The client application is implemented using Node JS version 8 LTS [25]. The interfaces for the client application were developed using the Electron framework [26], in order to deploy the program across multiple platforms. Fig. 4 demonstrates the user interface of the web file explorer. The tree view on the left side of the interface identifies the connected computers of the user who is logged in. All the functions, except for link sharing, perform file system API calls, whereas the generation of a link has several steps. The web file explorer is developed by modifying the Angular File Manager [27], which is an open source project.
Figure 4. Web File Explorer Interface.
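How a client might map explorer requests onto file system calls can be sketched as below. The actual client is written in Node JS; this Python version is only an illustration, and the request fields (`action`, `path`, `to`), the action names and the restriction of every path to a shared root folder are assumptions that the paper does not describe.

```python
import shutil
from pathlib import Path


def resolve(root, relative):
    """Map a request path onto the shared root, rejecting escapes such as '..'."""
    root = Path(root).resolve()
    target = (root / relative.lstrip("/")).resolve()
    if target != root and root not in target.parents:
        raise PermissionError(f"path escapes shared root: {relative}")
    return target


def handle(root, request):
    """Run one file operation described by a dict and return a JSON-friendly result."""
    action = request["action"]
    path = resolve(root, request["path"])
    if action == "list":
        return [{"name": p.name, "dir": p.is_dir()} for p in sorted(path.iterdir())]
    if action == "create":
        path.mkdir(parents=True)
        return {"ok": True}
    if action in ("rename", "move"):
        shutil.move(str(path), str(resolve(root, request["to"])))
        return {"ok": True}
    if action == "delete":
        if path.is_dir():
            shutil.rmtree(path)
        else:
            path.unlink()
        return {"ok": True}
    if action == "compress":
        # Writes <name>.zip next to the compressed file or folder.
        archive = shutil.make_archive(str(path), "zip", root_dir=path.parent, base_dir=path.name)
        return {"archive": Path(archive).name}
    raise ValueError(f"unsupported action: {action}")
```

Resolving every path against a fixed root before touching the disk is one way to keep a web-exposed explorer from reaching files the user did not intend to share.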
C. Link Sharing
A unique ID is generated for each shared link, and an entry referencing both the unique ID and the path of the file is added to the access control database. These entries are saved in a NeDB database [28]. Fig. 5 demonstrates the JSON body of the entry saved in the database for the use case of link sharing. The shared link takes the format given below.
http://www.localcloud.com//
Figure 5. Link share database entry in JSON.
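The creation and lookup of a link-share entry can be sketched as follows. A dictionary stands in for the NeDB database, and the entry fields (`fileId`, `path`), the use of a UUID as the unique ID and the URL layout of device ID followed by file ID are assumptions made for illustration; the paper only states that the link carries a device ID and a file ID.

```python
import uuid

BASE_URL = "http://www.localcloud.com"   # host taken from the example link


class LinkRegistry:
    """Access-control table mapping generated file IDs to local paths."""

    def __init__(self, device_id):
        self.device_id = device_id
        self._entries = {}               # file ID -> entry (NeDB stand-in)

    def share(self, path):
        """Register a local path and return a shareable URL for it."""
        file_id = uuid.uuid4().hex       # assumed ID scheme
        self._entries[file_id] = {"fileId": file_id, "path": path}
        # Assumed layout: <base>/<device ID>/<file ID>
        return f"{BASE_URL}/{self.device_id}/{file_id}"

    def lookup(self, file_id):
        """Return the local path behind a file ID, or None for an unknown link."""
        entry = self._entries.get(file_id)
        return entry["path"] if entry else None
```

Since only the generated ID appears in the link, the local path is never exposed to the remote party, and deleting the entry would be enough to revoke the link.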
The URL contains the device ID of the user’s client application and the file ID generated uniquely for the particular file. Once a user visits this URL, a separate view is rendered to make a WebRTC powered direct download. This involves several steps. Firstly, the web view initiates a WebSocket connection with the central server. Then the web view communicates the file ID information to the device using the device ID. If the file is found on the client machine, the client application sends ICE candidates along with the confirmation to the web view. Next, the web view generates its own ICE candidates upon receipt of the ICE candidates of the client application. The generated ICE candidates are sent to the client application. This initiates the peer-to-peer WebRTC data channel, which is end-to-end encrypted. Then the web view sends an acknowledgement along with the file ID. The client application responds to the web view with the stream of data related to the file along with the metadata of the file. The metadata of the file is used to provide information for the user, such as the size of the file and the progress of the download. Once the download completes, the data channel connection is disconnected.
If the connection fails, the user is notified and prompted to retry. Upon confirmation, the connection initiation process starts over. In this scenario, the acknowledgement is sent along with a flag so that the receipt of data can continue from the point of failure.
The generation of links to share a file is provided through a right click context menu. This context menu also enables users to perform basic file operations such as downloading, renaming, moving and compressing.
V. RESULTS AND DISCUSSION
The implemented solution was evaluated based on the aspects of performance and availability of functionality.
Table I demonstrates the performance comparison of the implemented solution with two existing solutions: Dropbox Web View and the Google Drive online file browser.
TABLE I. PERFORMANCE COMPARISON BETWEEN IMPLEMENTED SOLUTION AND AVAILABLE SOLUTIONS
Performance criteria | Dropbox Web View | Google Drive Web | Implemented Solution
Response time | 320 ms | 310 ms | 220 ms
Download speed | 25,456 kbps | 26,335 kbps | 28,223 kbps
Upload speed | 24,720 kbps | 24,540 kbps | 28,223 kbps
Compress time (100MB File) | N/A | N/A | 4,128 ms
According to the values in Table I, the implemented solution transmits files in real time. Therefore, the observed speeds are similar for uploads and downloads. Dropbox and Google Drive also provided similar upload and download speeds. However, these services require files to be uploaded to the cloud storage at the available upload speed before a user can start to download them. Furthermore, neither vendor provides the capability to compress files online and download them. However, multiple file downloads are delivered as a compressed archive, which can sometimes confuse users.
Table II demonstrates the times taken for files of different sizes to be transmitted to a target user using Dropbox and the implemented solution. According to Fig. 6, the time taken by Dropbox is nearly twice the time taken by the implemented solution. This is because the synchronization to the target destination only happens after the entire file has been uploaded to the cloud storage.
TABLE II. COMPARISON OF TRANSMISSION TIMES FOR DROPBOX AND THE IMPLEMENTED SOLUTION
File size (MB) | Transmission time in Dropbox (s) | Transmission time in the implemented solution (s)
100 | 28 | 18
200 | 58 | 32
300 | 88 | 42
400 | 124 | 61
500 | 152 | 77
600 | 174 | 88
700 | 196 | 101
800 | 218 | 117
900 | 240 | 132
1000 | 262 | 146
Figure 6. Graph of transmission time vs. file size for Dropbox and the implemented solution.
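The relationship between the two series in Table II can be checked with a few lines of arithmetic. The sketch below simply divides the Dropbox time by the implemented solution's time for each file size; the variable names are illustrative, and the figures are copied from Table II.

```python
# Transmission times copied from Table II (seconds), indexed by file size (MB).
sizes_mb = [100, 200, 300, 400, 500, 600, 700, 800, 900, 1000]
dropbox_s = [28, 58, 88, 124, 152, 174, 196, 218, 240, 262]
implemented_s = [18, 32, 42, 61, 77, 88, 101, 117, 132, 146]

# Ratio above 1 means Dropbox took longer than the implemented solution.
ratios = [d / i for d, i in zip(dropbox_s, implemented_s)]
mean_ratio = sum(ratios) / len(ratios)

for size, ratio in zip(sizes_mb, ratios):
    print(f"{size:>5} MB  Dropbox / implemented = {ratio:.2f}")
print(f"mean ratio = {mean_ratio:.2f}")
```

For these values the ratio ranges from about 1.56 (100 MB) to about 2.10 (300 MB), with a mean of about 1.89, which is consistent with the statement that Dropbox takes nearly twice as long.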
Table III demonstrates a feature comparison of the implemented solution with currently available cloud storage solutions. The majority of the existing solutions do not provide features such as local availability, multiple file/folder downloads, right click context menus and file compression. Furthermore, some existing services only conditionally support features such as management of multiple storages and link sharing. Hence, the implemented solution stands out in the comparison by providing a comprehensive set of features with improved usability, better speeds and faster availability of content over existing cloud-based solutions.
TABLE III. FEATURE COMPARISON OF AVAILABLE CLOUD STORAGE SOLUTIONS WITH THE IMPLEMENTED SOLUTION
Feature | Dropbox and Google Drive | NAS (Other products) | Implemented Solution
Availability of data in the local machine | No | No | Yes
Internet bandwidth utilization | Only for synchronization | If accessed remotely | Only when a file is being downloaded
Management of multiple storage computers | Should purchase multiple accounts | Should purchase multiple NAS devices which is expensive | Just install the program on the user’s computers
Support link sharing | Yes | Yes (Not all products) | Yes
Multiple file/folder downloads | Only in Google Drive | Yes | Yes
Right click context menu | Only in Google Drive | Yes | Yes
Support file compression | No | Yes (Not all products) | Yes
VI. CONCLUSION AND FUTURE WORK
The increasing concerns towards the privacy of content have encouraged people to use more secure means of sharing storage such as NAS devices. However, these devices are tedious to handle in terms of management and powering up. The proposed solution, which is an extension of the previous work [29], successfully addresses these issues by providing similar access to the storage of a user’s own computer, exposing a web interface similar to that of common cloud storage services. Furthermore, the solution provides the capability to share links to content that resides in one’s own computer, without having to use a cloud storage.
Future work intends to expand and deploy features in order to allow users to share computing capabilities beyond sharing the storage. The system intends to provide a web console that enables executing functions on the host machine remotely and providing inputs to be processed. This will enable users to expose their machines for external parties to process data.
REFERENCES
[1] “Google Drive,” 2018. [Online]. Available: https://drive.google.com. [Accessed 05 September 2017].
[2] “Dropbox,” 2018. [Online]. Available: https://www.dropbox.com. [Accessed 05 September 2017].
[3] “Forums - Sync Forums,” 2017. [Online]. Available: https://forum.resilio.com/. [Accessed 04 April 2017].
[4] J. W. Rittinghouse and J. F. Ransome, Cloud Computing: Implementation, Management, and Security, CRC Press, 2016.
[5] G. A. Gibson and R. V. Meter, “Network Attached Storage Architecture,” Communications of the ACM, vol. 43, no. 11, pp. 37-45, 2000.
[6] A. T. Velte, T. J. Velte and R. Elsenpeter, Cloud Computing: A Practical Approach, McGraw-Hill, 2010.
[7] “Google Drive Terms of Service,” Google, 16 February 2017. [Online]. Available: https://www.google.com/drive/terms-of-service. [Accessed 10 April 2017].
[8] “Dropbox - Terms,” Dropbox, 08 December 2016. [Online]. Available: https://www.dropbox.com/terms. [Accessed 10 April 2017].
[9] Dropbox, “Changes to our policies,” Dropbox, [Online]. Available: https://blogs.dropbox.com/dropbox/?p=846. [Accessed 03 March 2018].
[10] Seagate, “Personal Cloud Home Media Storage,” [Online]. Available: https://www.seagate.com/in/en/consumer/backup/personal-cloud/. [Accessed 24 December 2017].
[11] Western Digital, “MY CLOUD,” [Online]. Available: https://www.wdc.com/products/personal-cloud-storage/mycloud.html. [Accessed 24 December 2017].
[12] A. Lareida, T. Bocek, S. Golaszewski, C. Luthold and M. Weber, “Box2Box - A P2P-based file-sharing and synchronization application,” in 2013 IEEE Thirteenth International Conference on Peer-to-Peer Computing (P2P), Trento, Italy, 2013, pp. 1-2.
[13] Y. Cui, Z. Lai, X. Wang and N. Dai, “QuickSync: Improving Synchronization Efficiency for Mobile Cloud Storage Services,” IEEE Transactions on Mobile Computing, vol. 16, no. 12, pp. 3513-3526, 2017.
[14] S. Saroiu, P. K. Gummadi and S. D. Gribble, “Measurement study of peer-to-peer file sharing systems,” Multimedia Computing and Networking, vol. 4673, 2002. DOI: 10.1117/12.449977.
[15] B. Cohen, “The BitTorrent Protocol Specification,” [Online]. Available: http://bittorrent.org/beps/bep_0003.html. [Accessed 02 March 2018].
[16] J. Pouwelse, The Bittorrent P2P File-Sharing System: Measurements and Analysis, Berlin: Springer, 2011.
[17] D. Chen and H. Zhao, “Data Security and Privacy Protection Issues in Cloud Computing,” in 2012 International Conference on Computer Science and Electronics Engineering, Hangzhou, China, 2012, pp. 647-651.
[18] P. Mahajan, S. Setty, S. Lee, A. Clement, L. Alvisi, M. Dahlin and M. Walfish, “Depot: Cloud storage with minimal trust,” ACM Transactions on Computer Systems, vol. 29, no. 4, article no. 12, 2011.
[19] S. H. Na and E. N. Huh, “Personal Cloud Computing Security Framework,” in IEEE Asia-Pacific Services Computing Conference, Hangzhou, China, 2010, pp. 671-675.
[20] E. Bocchi, I. Drago and M. Mellia, “Personal Cloud Storage Benchmarks and Comparison,” IEEE Transactions on Cloud Computing, vol. 5, no. 4, pp. 751-764, 2017.
[21] A. Shraer, C. Cachin, A. Cidon, I. Keidar, Y. Michalevsky and D. Shaket, “Venus: verification for untrusted cloud storage,” in Proceedings of the 2010 ACM workshop on Cloud computing security workshop, Chicago, Illinois, 2010, pp. 19-30.
[22] WebRTC, “WebRTC,” 2018. [Online]. Available: https://webrtc.org. [Accessed 24 December 2017].
[23] MongoDB, “MongoDB,” [Online]. Available: https://www.mongodb.com. [Accessed 03 March 2018].
[24] Redis, “Redis,” [Online]. Available: https://redis.io. [Accessed 03 March 2018].
[25] “Node JS,” Node JS, [Online]. Available: http://nodejs.org. [Accessed 03 March 2018].
[26] Electron, “Electron Framework,” [Online]. Available: http://electron.atom.io. [Accessed 03 March 2018].
[27] J. S. Street, “angular-filemanager,” [Online]. Available: https://github.com/joni2back/angular-filemanager. [Accessed 24 December 2017].
[28] S. Robinson, “NeDB: A Lightweight JavaScript Database,” 29 April 2016. [Online]. Available: http://stackabuse.com/nedb-a-lightweightjavascript-database. [Accessed 10 May 2017].
[29] A. Wickramarachchi, D. Atapattu, P. Wimalasir, R. M. Arachchi and G. Dias, “Use of nomadic computing devices for storage synchronization,” in 2018 International Conference on Information Networking (ICOIN), Chiang Mai, Thailand, 2018, pp. 673-678.