SCIENCE CHINA Information Sciences, February 2010, Vol. 53, No. 1: 1–18

A User-space File System for On-demand Legacy Desktop Software

Zhang Youhui1,2∗, Su Gelin1,2 & Zheng WeiMin1,2

1Department of Computer Science and Technology, Tsinghua University, Beijing 100084, China;
2Tsinghua National Laboratory for Information Science and Technology, Tsinghua University, Beijing 100084, China

Received August 22, 2010

Abstract  Some user-level virtualization technologies have been used to convert legacy software (like existing Windows desktop applications) into on-demand software without any modification. To give the client a friendly and compatible way to access on-demand legacy software across the Internet, this paper presents a client-end file system for this usage case. It is a Windows user-space file system based on cloud storage (where the on-demand software is stored), which converts local file system accesses into remote visits. Quite a few optimizations are adopted and adjusted to suit the file-access patterns of on-demand software, including a local cache, metadata/data/software pre-fetch and CAS (Content-Addressable Storage), to decrease the number of remote visits and/or to overlap I/O operations with software execution. Detailed access-pattern analyses are also presented. This file system has been implemented, and tests show that it is practical for most daily-used software: a local cache of limited size can provide up to an 80% hit ratio, and the corresponding run-time overhead is about 37%. Owing to this method, a user can conveniently use his/her personalized software on any compatible, networked computer, even though the software is not installed on the local host.

Keywords  on-demand software, software as a service, user-space file system, cloud storage

1 Introduction

On-demand software [1] is a software distribution model in which applications are hosted by a service provider and made available to clients over the network. Traditional on-demand software mostly takes the form of Web applications, including not only enterprise-level software but also applications for common users (like Google Docs and Spreadsheets [2]). Recently, some virtualization technologies have been adopted to transform legacy desktop software into the on-demand mode without any modification of source code, such as SoftGrid [3], Virtual Desktop [4] and PDS [5]. From the technical viewpoint, such software runs in a virtualization environment. During run time, the virtualization environment intercepts some resource-accessing APIs issued by the software, including those that access the system registry, files/directories, environment variables, etc., and redirects them to resources stored in the network as needed. The client can then directly use existing desktop software as a network service: the software is stored in the network and runs on the local computer as needed (different from thin-client computing, which executes the software on the server while the local computer is only used as a GUI).

∗Corresponding author (email: [email protected])

However, no existing solution is designed to deliver legacy software across the Internet. Here, a Windows client-end file system for on-demand software is designed and implemented, because Windows OSes still dominate the desktop market. This design makes the following contributions:

1. A user-space file system. The user-space file system works as a proxy for file system accesses: file operation requests from on-demand software to the Windows I/O subsystem are forwarded to the corresponding user-space callback functions, which visit the real data in the cloud and send the results back. Although a user-space file system usually suffers some performance loss, it reduces development complexity and, more importantly, it is a flexible solution that depends on the OS to the minimum extent.

2. Content-addressable storage (CAS). CAS [6] uses cryptographic hashing to reduce storage requirements by exploiting commonality across multiple data objects. A CAS-enabled storage can thus identify sets of identical blocks and store only one copy even though higher-level applications may maintain multiple instances. Our file system employs this technology to reduce the number of remote visits. Moreover, because CAS identifies each data block by its content, it is also used to verify that the data (and hence the software) is intact.

3. Local cache. Based on detailed analyses of the access patterns of on-demand software, we found that, for much commonly-used desktop software, the most frequently-used files are those accessed during the start-up process, which occupy only a limited ratio of the whole storage capacity. Thus, a local cache of limited size can achieve a fairly high hit rate and improve access performance remarkably.

4. Extensive use of pre-fetch. Three levels of pre-fetch mechanisms are used:

Metadata: all metadata of the remote on-demand software is pre-fetched as the client-end starts up and is updated as necessary during the follow-up running time.
Data: detailed analyses show that the access patterns of software data are identical whether the software is installed locally or accessed on-demand. Thus, a data pre-fetch mechanism is designed based on historical access information.
File: similar to the file-level pre-fetch of Windows XP [7] (and later OSes), our file system can pre-fetch correlated files when any on-demand software is being launched, also based on historical usage information.

The file system has been implemented to access the back-end storage via the HTTP protocol, which is supported by some cloud storage service providers, like Amazon's S3 [8]. Extensive tests show that this solution is practical for most daily-used software.

We first present the related work; the design philosophies are then presented in detail in Section 3, including the concrete file-access-pattern analyses and the related optimizations. Section 4 introduces the implementation and presents performance tests. Finally, the summary and future work are given.
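The CAS idea described in contribution 2 above can be illustrated with a small sketch. This is a toy in-memory store, not the paper's implementation; the 4 KB block size is illustrative, and only the md5-based block IDs are taken from the paper.

```python
import hashlib

class CasStore:
    """Toy content-addressable store: identical blocks are stored only once."""
    def __init__(self, block_size=4096):
        self.block_size = block_size
        self.blocks = {}  # hash -> block bytes

    def put_file(self, data):
        """Split data into fixed-size blocks; return the file's recipe (list of hashes)."""
        recipe = []
        for i in range(0, len(data), self.block_size):
            block = data[i:i + self.block_size]
            h = hashlib.md5(block).hexdigest()  # md5-based block ID, as in the paper
            self.blocks.setdefault(h, block)    # a duplicate block is stored once
            recipe.append(h)
        return recipe

    def get_file(self, recipe):
        """Reassemble the file from its recipe; re-hashing would also verify integrity."""
        return b"".join(self.blocks[h] for h in recipe)
```

Two files sharing a common block then consume only three stored blocks instead of four, which is the commonality-exploiting effect the contribution describes.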

2 Related work

2.1 On-demand software

On-demand software (also frequently called Software as a Service or software streaming) is regarded as the future usage mode for software. Now, most on-demand software consists of web-based applications, and a web browser is usually employed as the running platform with the collaboration of a remote server. Therefore, existing desktop software cannot be used in this mode directly.

Some implementations employ application-level virtualization. SoftGrid [3] can convert applications into virtual services that are managed and hosted centrally but run on demand locally, which reduces the complexity and labor involved in deploying, updating, and managing applications. Similarly, Citrix's Virtual Desktop [4] can deliver applications to remote clients on demand. [5] presents the Progressive Deployment System (PDS), a virtual execution environment and infrastructure designed for deploying software on demand. PDS intercepts a selected subset of system calls on the target machine to provide partial virtualization at the operating system level. Our previous work [9][10] provides a solution for on-demand software based on lightweight virtualization and P2P transportation technologies. We also developed a fast deployment mechanism [11] for desktop software in a VM-based cloud environment (like EC2).

Another interesting piece of research is Xax [12]. It is a browser plug-in that enables developers to leverage existing tools, libraries, and entire programs to deliver feature-rich applications on the web. Based on a narrow syscall interface, Xax implements an abstraction layer that provides a consistent binary interface across operating systems, as well as memory-isolated native code execution. For Xax, however, the source code of legacy applications and the existing tool chain have to be modified.

Compared with our previous research, this work designs a client-end user-space file system to deliver on-demand software to users across the Internet. The achievements in [9][10] are used to convert legacy Windows applications into the on-demand version.

3 Design Philosophies

On-demand software can run without installation, as its virtualization runtime system provides all necessary resources transparently. An important issue, then, is how to design a friendly and compatible delivery solution. We design a user-space file system for cloud storage to reach this target. In other words, sets of legacy Windows desktop software are stored in the cloud and presented as files/folders on a virtual local drive on the client machine.

3.1 The file system framework

We use Dokan [13] as our file system's base. Its kernel proxy driver intercepts any file access to the virtual user-space file system (which is mounted as a local virtual drive) and forwards it to the user-level file system program, where registered callback functions complete the corresponding operations. When a file is read, the access is redirected to the remote position to fetch data (the local cache and pre-fetch are both employed, as described later). If there is any write, a copy-on-write (COW) operation is triggered, which means the whole remote file is fetched to the client first; any subsequent operation then happens locally.

Moreover, the file system program maintains three lists:

1. List of remote files/folders (abbreviated as remote list): any file/folder located only in the Internet belongs to this list.
2. List of new or modified files/folders (abbreviated as new list): it contains all files/folders created or modified during the running time. Owing to the COW mechanism, all of them are stored locally.
3. List of deleted files/folders (deleted list)
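The three-list bookkeeping and the COW trigger above can be sketched as follows. This is a minimal illustration, not the Dokan callback code; the class and method names are hypothetical, and `fetch_remote` stands in for the real network fetch.

```python
class FileTracker:
    """Tracks which list (remote / new / deleted) each path belongs to."""
    def __init__(self, remote_paths):
        self.remote = set(remote_paths)  # files that exist only in the cloud
        self.new = set()                 # files created/modified locally (COW copies)
        self.deleted = set()             # files erased during this session

    def on_write(self, path, fetch_remote):
        """Copy-on-write: a write to a remote file first fetches the whole file."""
        if path in self.remote:
            fetch_remote(path)           # bring the full file to the client
            self.remote.discard(path)
            self.new.add(path)           # subsequent operations happen locally
        else:
            self.new.add(path)           # brand-new or already-local file

    def on_delete(self, path):
        self.remote.discard(path)
        self.new.discard(path)
        self.deleted.add(path)
```

At unmount time, the contents of `new` would be uploaded and the lists merged, as the next paragraph describes.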


When the file system is launched for the first time, the new list and deleted list are both empty, while all files/folders of the on-demand software belong to the remote list. During run time, when any file/folder is erased, it is moved into the deleted list; modified or created files are moved or added into the new list. The detailed process is presented in the "Callback Functions" section. As the file system is being unmounted, the contents of the files/folders in the new list are transferred to a remote position (reserved for every client) hosted in the cloud; then all lists are merged correspondingly (because all files/folders are now located in the cloud) and stored in the reserved site, as well as all metadata. The next time, the client can thus reach his/her latest data. Moreover, all files/folders/lists and the local cache (mentioned in Section 3.4) can be stored at the client end if the client always uses one fixed machine; or they can be stored on a portable storage device, so that the client can access them on any machine.

3.2 Callback Functions

For any callback function, the kernel proxy driver provides the full path of the target file/folder as an argument, so we can use this information to classify the target and adopt the corresponding strategy as described in subsection 3.1. The process flows of the main functions are as follows:

1. CreateFile. There are four sub-routines in this case:
Routine 1: as a new file is created, its full path is inserted into the new list and removed from the deleted list (if present).
Routine 2: if a file in the remote list is opened and emptied, it is also inserted into the new list and removed from the remote list.
Routine 3: if a file in the new list is opened, all subsequent operations complete at the client end.
Routine 4: if a file in the remote list is opened, how subsequent operations are handled depends on the concrete operation type.

2. WriteFile. Two sub-routines:
Routine 1: if the target is located in the new list, the written content is saved at the client end.
Routine 2: for a remote one, the file is fetched to the client and regarded as a new file, and then the operation is handled as in Routine 1.

3. ReadFile. Two sub-routines:
Routine 1: if the target is in the new list, the operation is completed locally.
Routine 2: for a remote one, data is fetched from the backend, with the optimizations mentioned in Section 3.4.

4. FindFiles. Because the metadata of remote files is pre-fetched during start-up (as described in Section 3.4) and all new files are located on the client, this operation is completed on-site.

5. SetFileAttributes / GetFileAttributes. Because the metadata of remote files has been pre-fetched and all new files are located on the client, both are completed on-site.

6. MoveFile. Four sub-routines:

Zhang Y, et al.

Sci China Inf Sci

February 2010 Vol. 53 No. 1

5

Routine 1: both files belong to the remote list. The source file's name is moved into the deleted list while the destination is moved into the new list. Then the COW strategy is used, which means the source content is fetched from the backend and saved locally under the destination name.
Routine 2: the source file belongs to the remote list and the destination is in the new list. The source file name is moved into the deleted list, and the COW strategy is used as in Routine 1.
Routine 3: the source belongs to the new list and the destination is remote. The source is moved into the deleted list while the destination is moved into the new list.
Routine 4: both belong to the new list. The source is moved into the deleted list.
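All four MoveFile sub-routines share one pattern: the source name always ends up in the deleted list, the destination always ends up in the new list, and a COW fetch is needed only when the source is remote. A minimal sketch of that common flow, with a hypothetical `lists` dict standing in for the three lists:

```python
def move_file(lists, src, dst, fetch_remote):
    """Generic MoveFile flow covering the four sub-routines.
    lists: dict with 'remote', 'new', 'deleted' path sets (hypothetical shape)."""
    src_remote = src in lists["remote"]    # decide before mutating the lists
    for s in lists.values():
        s.discard(src)
        s.discard(dst)
    lists["deleted"].add(src)              # source name -> deleted list (all routines)
    lists["new"].add(dst)                  # destination -> new list (all routines)
    if src_remote:
        fetch_remote(src)                  # COW: remote source content saved as dst
```

When the source is already in the new list (Routines 3 and 4), its content is local, so the rename completes without any network fetch.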

3.3 Remote Access

From the preceding description, it is apparent that only three kinds of file-level remote access interfaces are needed by our file system:

1. ReadWholeFile(string full-path-of-the-file)
2. WriteWholeFile(string full-path-of-the-file)
3. ReadPartOfFile(string full-path-of-the-file, int offset, int len)

We implemented two types of remote-access protocols: file-level and block-level.

1. File-level. This is straightforward. It communicates with the backend through the HTTP protocol supported by S3 [8], Amazon's notable cloud storage service. In S3, data is organized as objects in buckets identified by unique IDs. Therefore, any file of on-demand software is regarded as an object in S3 and is identified by its full path name. Its URL looks like http://server-address/portablesoftware/full...path/filename. For any new or modified file of a user, the naming style is different. For example, if a user, Su, creates a new file (/program files/app/filename), the URL looks like http://server-address/portablesoftware/su/program-files/app/filename. In addition, for each user, the metadata of his/her software, combined with the above-mentioned lists, is stored in a special position, http://server-address/portablesoftware/username/metadatalist. In summary, all users share the software stored at http://server-address/portablesoftware; each has a private space for any new or modified files, as well as for metadata, to avoid write conflicts. Then, ReadWholeFile can be implemented with the HTTP GET method, ReadPartOfFile with the same method using the Range header field, and WriteWholeFile with the HTTP PUT method.

2. Block-level. At the file level, no CAS is used; CAS is enabled at the block level. Here the concept of file recipes from the CASPER [6] file system is employed. A file recipe is a first-class file system object containing content hashes that describe the data blocks composing the target file.
So, applications can reconstruct file content from available CAS data blocks based on recipes. All of the recipes, data blocks and other metadata form the bottom level of the whole storage hierarchy. Therefore, before any on-demand software is delivered, all of its files are divided into fixed-length blocks and the hash value (an md5-based ID) of each block is computed. Then the hash values, the relationship between blocks and files, and other metadata are packaged into recipes and placed in the cloud together with the data blocks (rather than the original files). In our case, the CAS storage consumption accounts for about 84% of the original. Based on recipes, file-level accesses can be converted into block-level ones.
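The file-to-block conversion for ReadPartOfFile can be sketched as follows: the recipe maps a byte range to the hashes of the blocks that cover it, and only those blocks need to be fetched (by CAS ID, or via an HTTP Range GET at the file level). A minimal sketch under assumed fixed-length blocks; `fetch_block` is a hypothetical callback standing in for the cache/network lookup.

```python
def read_part_of_file(recipe, block_size, offset, length, fetch_block):
    """Block-level ReadPartOfFile: fetch only the blocks covering
    [offset, offset + length), identified by their content hashes."""
    first = offset // block_size                   # index of first covering block
    last = (offset + length - 1) // block_size     # index of last covering block
    data = b"".join(fetch_block(recipe[i]) for i in range(first, last + 1))
    skip = offset - first * block_size             # discard bytes before `offset`
    return data[skip:skip + length]
```

A read of 6 bytes at offset 2 from a 12-byte file with 4-byte blocks touches only blocks 0 and 1; block 2 is never transferred.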

3.4 Optimizations

3.4.1 CAS

First, the system transforms a file-level access into CAS-enabled block-level accesses, as mentioned in subsection 3.3, and then the local cache is visited. On a hit, the cached data is sent back; otherwise, the remote storage is visited. If the target file is in the new list, the local cache and CAS are bypassed, because any file in the new list is stored locally, as mentioned in subsections 3.1 and 3.2.

3.4.2 Local cache

We analyzed the file-access patterns of running instances of some frequently-used desktop software. We found that for common operations (it is difficult to classify which operations are common; here, we define common operations as the open/edit/search/replace/close/playback/network-transfer operations shared by most software), only a limited ratio of files is accessed, compared with the software's large storage consumption. For example, the following frequently-used software has been converted into on-demand versions: OpenOffice, PhotoShop, Lotus Notes, VLC (a powerful media player), 7-Zip, UltraEdit, ClamWin (an anti-virus program), FileZilla, GIMP (an open-source picture editor) and Acrobat Reader. We find that for most of this software, the ratio of the amount of data accessed during the running process to the whole capacity is less than 40%. The detailed measurement method is presented in Section 4.

Based on this observation, we designed a local block-level cache for the frequently-accessed data, whose replacement strategy is also based on the usage frequency of files; in other words, data of frequently-used files has a higher priority. In detail, its kernel data structure is a map, and the CAS ID of the wanted data is used as the access key. On a hit, the cached data on the local disk is visited; on a miss, the data is fetched from the network and then cached according to the frequency-based replacement strategy. In addition, the usage count of the corresponding file is increased.

The first time, the cache is empty and the run speed is fairly slow, especially the start-up performance, which is dominated by read overheads. During run time, the cache is filled according to the usage frequencies of data. For the following runs, the performance improves as reads partially hit the local cache. Because the amount of frequently-used data is limited, a small cache (less than 160MB) can achieve an 80% hit rate for all the software mentioned above.
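The map-based cache with frequency-based replacement can be sketched as below. The eviction rule (evict a block of the least frequently used file) is one plausible reading of the paper's "frequency-based replace strategy"; the class shape and capacity unit (block count rather than bytes) are illustrative assumptions.

```python
class FrequencyCache:
    """Block cache keyed by CAS ID; eviction prefers blocks of less-used files."""
    def __init__(self, capacity):
        self.capacity = capacity
        self.store = {}       # cas_id -> (owner_file, block bytes)
        self.file_usage = {}  # owner_file -> usage count

    def get(self, cas_id, owner_file, fetch):
        # every access bumps the owning file's usage count
        self.file_usage[owner_file] = self.file_usage.get(owner_file, 0) + 1
        if cas_id in self.store:
            return self.store[cas_id][1]          # hit: serve local data
        block = fetch(cas_id)                     # miss: go to the network
        if len(self.store) >= self.capacity:
            # evict a block belonging to the least frequently used file
            victim = min(self.store,
                         key=lambda k: self.file_usage.get(self.store[k][0], 0))
            del self.store[victim]
        self.store[cas_id] = (owner_file, block)
        return block
```

A repeated read of the same CAS ID then costs one network fetch rather than two, which is exactly the saving the cache is designed for.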
The concrete results are presented in Section 4.

3.4.3 Metadata pre-fetch

When a user launches the file system for the first time, the file system program contacts the remote server for login. Then, based on the user ID, it gets all metadata of the on-demand software, which contains the following information: the full paths of all files and folders (relative to the mounted virtual drive's root folder), and the attributes, size, creation time, last-access time and last-write time of all files/folders. For block-level access, the recipe of each file is also included, so the block-level metadata occupies more storage space than the file-level metadata: for all on-demand software in our test, the size of the former is about 2.8MB and that of the latter is about 160KB, while all the software occupies more than 661MB. The received metadata is stored on the client. Therefore, any metadata access can be completed locally, which speeds up the corresponding operations (like browsing a directory) remarkably.
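The local answering of FindFiles and GetFileAttributes from pre-fetched metadata can be sketched as follows. The class, the path-keyed dict layout, and `fetch_all_metadata` are hypothetical illustrations, not the prototype's actual structures.

```python
class MetadataStore:
    """All remote metadata is fetched once at start-up, so FindFiles /
    GetFileAttributes can be answered without further network round trips."""
    def __init__(self, fetch_all_metadata):
        # one remote visit at login; entries map full path -> attribute dict
        self.entries = fetch_all_metadata()

    def get_attributes(self, path):
        return self.entries[path]          # completed on-site

    def find_files(self, folder):
        """List the direct children of a folder, on-site."""
        prefix = folder.rstrip("/") + "/"
        return [p for p in self.entries
                if p.startswith(prefix) and "/" not in p[len(prefix):]]
```

For block-level operation, each entry would additionally carry the file's recipe, which is why that metadata is larger.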

3.4.4 Data pre-fetch

Here, data pre-fetch means that when some data is being fetched from the network, its successive (and uncached) data is read remotely in the background even though it has not yet been requested.


It is another potential method to reduce the number of accesses across the Internet, and its efficiency depends on the concrete access mode: for sequential accesses, it is highly efficient. We have studied the access behavior on single files. For a given file, two parameters are defined: a is the ratio of the number of sequential reads to the total number of reads, and b is the ratio of the amount of sequentially-read data to the total read amount. The greater the value of a or b, the better the effect of pre-fetch. We consider a file sequential if and only if both its a and b values exceed a given threshold; then, to improve the accuracy of pre-fetch, we only adopt it for these sequential files. Therefore, how to set the threshold and the pre-fetch size is the key issue. We find that these ratios are usually not less than 50%, and that it is optimal to set the threshold at 66% and the pre-fetch size at 32KB. Detailed tests are presented in the next section.
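The sequentiality test on a file's read history can be sketched as below. This is a minimal interpretation of the a and b ratios under the assumption that a read is "sequential" when it starts exactly where the previous read ended; the function name and the `(offset, length)` history format are illustrative.

```python
def is_sequential(reads, threshold=0.66):
    """Decide whether a file's access history looks sequential.
    reads: list of (offset, length) pairs in the order they occurred.
    a = sequential reads / total reads; b = sequentially-read bytes / total bytes."""
    if not reads:
        return False
    seq_count, seq_bytes, expected = 0, 0, None
    total_bytes = sum(n for _, n in reads)
    for off, n in reads:
        if expected is not None and off == expected:  # continues the previous read
            seq_count += 1
            seq_bytes += n
        expected = off + n
    a = seq_count / len(reads)
    b = seq_bytes / total_bytes
    return a >= threshold and b >= threshold
```

Only files passing this test would then receive background pre-fetch of their next 32KB.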

4 Prototype and Tests

4.1 Prototype

We have implemented the prototype using VC 2005. The workflow of this prototype is:

1. The user launches the client-end shell program, which logs in to the storage server to get the metadata of his/her customized software, inserts the kernel driver, and registers the callback functions for the user-level file system to mount a virtual disk drive.
2. After initialization, its GUI is presented and the user can click the icons to launch software on-demand.
3. The user selects a program on the GUI to start.
4. During the start-up process, a wrapper DLL is injected into the target process's virtual address space to interpose on the related APIs and establish the virtualization environment.
5. During the running process, all registry/file system accesses of the target process are handled as described in Section 1. For more details, please refer to our previous work [10][11].
6. As the program exits, all modifications of the system registry and the file system are stored on the virtual drive.
7. When the shell program exits, all modifications on the virtual drive are uploaded to the server, so the user can access his/her customized data next time.

4.2 Performance Tests

4.2.1 Test Methods

The run time of applications is measured. We implemented a special program that records the user's keyboard and mouse inputs and replays them after start-up, so the execution of on-demand software can be automated. Based on this method, we designed scripts that drive the software through a series of operations, as if triggered by a real user. For example, for media player software, a local video file is opened and played for several seconds before exiting; for word processing software, a document is created, edited for several seconds and saved before termination. Moreover, between any two consecutive operations, some random waiting time (less than 0.3 second) is inserted to simulate human behavior. The elapsed time from the launch of the software until it exits is logged as the run time.
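The replay-and-time loop described above can be sketched as follows. This is an illustrative skeleton, not the recording tool itself; `operations` stands in for the recorded keyboard/mouse events, and the 0.3-second bound is the one stated in the text.

```python
import random
import time

def replay(operations, max_wait=0.3):
    """Replay recorded operations with random human-like pauses (< max_wait s)
    between them, and return the total elapsed 'run time'."""
    start = time.monotonic()
    for op in operations:
        op()                                    # e.g. a recorded input event
        time.sleep(random.uniform(0, max_wait)) # simulated human hesitation
    return time.monotonic() - start
```

Because the pauses are random, each measured run time varies slightly, which is why the reported overheads are averages over several runs.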


4.2.2 Test Environments

Client PC. The client platform is a Windows Vista PC equipped with 2 GB of DDR2 SDRAM and one Intel Core Duo CPU (1.86GHz). The hard disk is a 160 GB SATA drive, and a 100 Mbps Ethernet adapter is used to access the Internet.

Web Servers. The client machine must fetch data across the Internet, so where the server is placed is decisive for performance. Two cases are considered. In Case 1, it is assumed that some edge server can provide the download service; therefore, the web server is located in the CERNET (China Education and Research Network, the second largest network backbone in China, to which about 1,500 universities and institutions are connected), like the client PC. This is a common case now, as Content Delivery Networks have been widely used for software downloading: as claimed by Akamai [14], the world's leading CDN provider, most requests can be fulfilled by some edge server just a single hop away. The network throughput between the client and this server is about 1.89MBps and the average response time is about 6ms, as measured by Qcheck [15], a free and professional network benchmark program. In Case 2, the server is located outside the CERNET; the throughput is 998KBps and the average response time is 32ms. Two Windows 2003 web servers, each equipped with one Intel Core 2 Duo E4500 CPU (2200MHz), 2 GB of DDR2 SDRAM and one 240GB SATA II disk, are used for the two cases respectively.

4.2.3 Test cases

The following cases are tested:

1. The original run time (software is stored on a local disk).
2. The run time based on the user-space file system (software is stored on a local disk, which is mirrored as a virtual drive; the software is then launched from this drive).
3. The run time based on the user-space file system visiting the remote server located inside the CERNET; no cache, no pre-fetch.
4. The server is located inside the CERNET; the cache hit ratio is 20%; no pre-fetch.
5. The server is located inside the CERNET; the cache hit ratio is 40%; no pre-fetch.
6. The server is located inside the CERNET; the cache hit ratio is 60%; no pre-fetch.
7. The server is located inside the CERNET; the cache hit ratio is 80%; no pre-fetch.
8. The pre-fetch size is 32KB with a threshold of 66%; inside the CERNET; no cache.
9. The pre-fetch size is 32KB with a threshold of 66%; inside the CERNET; the cache hit ratio is 20%.
10. The pre-fetch size is 32KB with a threshold of 66%; inside the CERNET; the cache hit ratio is 40%.
11. The pre-fetch size is 32KB with a threshold of 66%; inside the CERNET; the cache hit ratio is 60%.
12. The pre-fetch size is 32KB with a threshold of 66%; inside the CERNET; the cache hit ratio is 80%.

Cases 13∼22: the server is located outside the CERNET and the other conditions are the same as those of Cases 3∼12. All cases mentioned above work at the block level.
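Cases 3∼22 form a full grid over server location, pre-fetch, and cache hit ratio, which can be enumerated mechanically. A small sketch (the function name and tuple layout are illustrative):

```python
def build_cases():
    """Enumerate cases 3..22 as (server location, cache hit ratio %, pre-fetch?)."""
    cases = {}
    case_no = 3
    for location in ("inside CERNET", "outside CERNET"):
        for prefetch in (False, True):           # pre-fetch = 32KB / 66% threshold
            for hit_ratio in (0, 20, 40, 60, 80):
                cases[case_no] = (location, hit_ratio, prefetch)
                case_no += 1
    return cases
```

This reproduces the numbering used in the text, e.g. Case 12 is "inside the CERNET, pre-fetch on, 80% hit ratio" and Case 22 its outside-CERNET counterpart.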


Figure 1: Case 3∼Case 7 (the time unit is ms)

Figure 2: Case 8∼Case 12 (the time unit is ms)

Figure 3: Case 13∼Case 17 (the time unit is ms)

Figure 4: Case 18∼Case 22 (the time unit is ms)

4.2.4 Test results (Figures 1∼4)

We can see that the user-space file system itself introduces about 34% extra overhead on average (comparing Case 2 with Case 1), as the system causes more context switches. All of the following comparisons are made against Case 1. The system introduces about 212% extra run time on average in Case 3 (inside the CERNET, no cache) and 809% in Case 13 (outside the CERNET, no cache). When the cache hit ratio is 80%, the corresponding results are 41% (Case 7, inside the CERNET) and 70% (Case 17, outside the CERNET) respectively. Thus, network performance largely determines program behavior, and the local cache is a highly efficient way to improve performance. When pre-fetch is enabled, inside the CERNET (Case 8, no cache) the extra run time is about 195% on average; outside the CERNET (Case 18, no cache) the corresponding value is about 690%. If the hit ratio is 80%, the corresponding results are 37% (Case 12, inside the CERNET) and 67% (Case 22, outside the CERNET).

As mentioned in subsection 3.4, the most frequently-used files occupy only a limited ratio of the whole capacity; in our tests, a local cache of 160MB can reach a hit ratio of 80%. Then, for much frequently-used software, after several runs, the corresponding extra run time is 37% if some edge server is used as the storage server; for a server outside the CERNET, the extra run time is 67%. Moreover, we believe that in reality the situation will be better: the run time here includes both the start-up time and the normal execution time, and the former introduces relatively more overhead, as it is a procedure with intensive data accesses and without any manual operation. Therefore, when only the normal execution process is considered, the overhead will be much less, and this is a key measure of what it feels like to use the system for everyday work.

5 Conclusion and Future Work

This paper proposes a solution to stream legacy desktop software on-demand from cloud storage to the client's PC through the file-system interface, which eliminates the need to install applications on the client's own computer and simplifies maintenance. From the user's viewpoint, he/she can use personalized software anytime, even though it is not installed on the host. A user-space file system is designed and optimized for this usage case. In this file system, extensive pre-fetch methods for data/metadata/files are adopted, as well as a local cache designed according to the file-access patterns of the software. Moreover, a CAS mechanism is employed to improve storage and transfer efficiency and to check the integrity of contents. Tests show that the system is practical for most daily-used software.

The current implementation focuses on the client-end design. As the next step, we intend to replicate the user's files on the server side and adopt an optimized replication method [16] to improve performance. Moreover, we currently use artificially generated operations to simulate user behavior; we plan to collect running traces from volunteers for more accurate simulation to guide the optimizations better.

Acknowledgements  This work was supported by the High Technology Research and Development Program of China (Grant No. 2008AA01A201) and the National Grand Fundamental Research 973 Program of China (Grant No. 2007CB310900).

References

1. Beaty K, Kochut A, Shaikh H. Desktop to cloud transformation planning. In: Proceedings of the 2009 IEEE International Symposium on Parallel & Distributed Processing, Rome, Italy, May 23-29, 2009.
2. Google Apps. http://www.google.com/apps/intl/en/business/index.html.
3. Microsoft SoftGrid. http://www.microsoft.com/systemcenter/softgrid/default.mspx.
4. Citrix Virtual Desktop. http://www.citrix.com/English/ps2/products/product.asp?contentID=186&ntref=prod_top.
5. Alpern B, Auerbach J, et al. PDS: a virtual execution environment for software deployment. In: Proceedings of the First ACM International Conference on Virtual Execution Environments, March 2005.
6. Tolia N, Kozuch M, Satyanarayanan M, Karp B. Opportunistic use of content addressable storage for distributed file systems. In: Proceedings of the USENIX 2003 Annual Technical Conference, June 9-14, 2003, Texas, USA. pp. 127-140.
7. Kernel enhancements for Windows XP. http://www.microsoft.com/whdc/archive/XP_kernel.mspx.
8. Amazon Simple Storage Service (Amazon S3). http://aws.amazon.com/s3/.
9. Zhang Y, Wang X, Hong L. Portable desktop applications based on P2P transportation and virtualization. In: Proceedings of the 22nd Large Installation System Administration Conference (LISA '08), San Diego, CA. USENIX Association, November 9-14, 2008. pp. 133-144.
10. Zhang Y, Su G, Zheng W. Converting legacy desktop applications into on-demand personalized software. IEEE Transactions on Services Computing, 14 Jun. 2010. IEEE Computer Society Digital Library.
11. Zhang Y, Su G, Zheng W. On-demand mode of legacy desktop software and its automatic deployment for a cloud-computing environment. In: Proceedings of the Sixth Workshop on Grid Technologies and Applications (WOGTA 2009), 18-19 Dec 2009, Taitung, Taiwan. pp. 25-31.
12. Douceur J R, Elson J, Howell J, Lorch J R. Leveraging legacy code to deploy desktop applications on the web. In: Proceedings of the 8th USENIX Symposium on Operating Systems Design and Implementation, December 8-10, 2008, San Diego, California, USA. pp. 339-354.
13. Dokan, user mode file system for Windows. http://code.google.com/p/dokan/.
14. Akamai solutions for the high tech industry. http://www-8cc.akamai.com/dl/whitepapers/Akamai-HighTechIndustry-Whitepaper.pdf.
15. Qcheck - free network benchmark utility. http://www.ixchariot.com/products/datasheets/qcheck.html.
16. Xu P Z, Wu Y W, Huang X M, et al. Optimizing write operation on replica in data grid. Sci China Inf Sci, 2011, 54: 1-11, doi: 10.1007/s11432-010-4153-z.