Short Paper: ‘Virtual P2P Client: Accessing P2P Applications using Virtual Terminals’

Syed Arefinul Haque, Salekul Islam
United International University, Dhaka, Bangladesh. Email: {arefin, salekul}@cse.uiu.ac.bd

Jean-Charles Grégoire
INRS-EMT, Montréal, Canada. Email: [email protected]
Abstract: We introduce a virtual Peer-to-Peer (P2P) client with the property of separating the data and control planes of a traditional P2P client. In this model, an end user accesses and controls the virtual P2P client application using a web browser. All P2P application-related control messages originate from and terminate at the virtual P2P client deployed inside the remote server. The web browser running on the user device only manages the download and upload of the P2P data packets. Since BitTorrent is the most widely deployed P2P client, we study a BitTorrent-specific virtual P2P client. We also discuss the implementation challenges of our proposed architecture.

Index Terms: P2P, BitTorrent, virtual client, cloud-based server.

I. INTRODUCTION

Peer-to-Peer (P2P) networks are popular tools for content sharing because they provide better scalability and fault tolerance than the traditional client-server model of computing. To run such P2P applications, a user has to install a client application on her device. The single most important task of these applications is to exchange data between peers, but they may also perform routing/forwarding and content validation (e.g. hash checking), and implement different mechanisms for efficient bandwidth usage. As a result, these client applications consume various resources such as processing power, memory and bandwidth. A local application also requires prior installation and has to be regularly updated for maintenance, which can be a burden for the user.

A new trend in service deployment in the Internet, based on cloud computing and virtualization, shifts the location of applications and infrastructures from the user device to the network to reduce the costs associated with the management of hardware and software resources [1]. Rather than virtualizing a whole operating system or network architecture, such virtualization can be applied on a smaller scale to client-side applications. For example, a virtual IP Multimedia Subsystem (IMS) client has been presented in [2], where the client is implemented inside a remote server instead of the end user’s device.

In this paper we introduce a virtual P2P client which follows the BitTorrent protocol but is deployed in a remote cloud server. An end user accesses and controls the virtual P2P client application using a standard web browser, which reduces the requirements and the load on user devices by offloading the signalling and session management tasks to the remote server.

BitTorrent is the most popular P2P application for distributing large files. It is implemented as a hybrid P2P system where most of the interactions are done directly between peers, but initial and occasional further interactions with a server are required for locating peers [3]. A user gets the information about the peers using a metainfo file (or metafile). The BitTorrent architecture is shown in Figure 1. It can be summarized in the following points:

1) A peer willing to download shared content has to download the corresponding metainfo or tracker file.
2) The peer contacts the tracker and requests a list of peers that are already participating in the torrent (i.e., sharing that content).
3) The tracker server replies with a list of peers with their respective IP addresses and connection ports.
4) The peer selects a number of peers from the list provided by the tracker and establishes connections with them.
5) When connections are established, the peer exchanges pieces of that file with these neighbours.

Fig. 1. BitTorrent’s file sharing process. [Figure labels: end user (browser, BitTorrent client), torrent site web server, tracker server, peers A to D; (1) search and download .torrent file, (2) request for a list of peers, (3) peer list with IP and port, (4) handshake, (5) pieces of file.]
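As a rough illustration of the tracker reply in step 3, the sketch below decodes a peer list into (IP address, port) pairs. It assumes the compact encoding that many trackers use (6 bytes per peer: a 4-byte IPv4 address followed by a 2-byte big-endian port); the paper itself does not specify a wire format, and the function name `decode_compact_peers` is hypothetical.

```python
import socket
import struct
from typing import List, Tuple


def decode_compact_peers(blob: bytes) -> List[Tuple[str, int]]:
    """Decode a compact peer list (assumed format: 4-byte IPv4 + 2-byte port)."""
    if len(blob) % 6 != 0:
        raise ValueError("compact peer list length must be a multiple of 6")
    peers = []
    for offset in range(0, len(blob), 6):
        # First four bytes: IPv4 address in network byte order.
        ip = socket.inet_ntoa(blob[offset:offset + 4])
        # Last two bytes: TCP port, big-endian unsigned short.
        (port,) = struct.unpack("!H", blob[offset + 4:offset + 6])
        peers.append((ip, port))
    return peers
```

A peer would then pick a subset of the returned pairs and open connections to them, as in step 4.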
A set of peers using the same metainfo file to share a particular file are part of the same swarm. A tracker can introduce the newly joined peer to multiple swarms at the same time. A file is divided into fixed-size pieces which peers exchange with each other. When a piece is downloaded, its SHA-1 hash is computed and compared with the value in the metafile. If the values match, the piece is declared downloaded and made available to other peers for downloading.
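The piece check described above can be sketched as follows. This is a minimal example that assumes the metafile stores the expected hashes as one concatenated string of 20-byte SHA-1 digests, one per piece; the helper names `split_piece_hashes` and `verify_piece` are hypothetical.

```python
import hashlib
from typing import List

SHA1_DIGEST_SIZE = 20  # bytes per SHA-1 digest


def split_piece_hashes(pieces_field: bytes) -> List[bytes]:
    """Split the concatenated digests (assumed metafile layout) into a list."""
    if len(pieces_field) % SHA1_DIGEST_SIZE != 0:
        raise ValueError("hash field length must be a multiple of 20")
    return [
        pieces_field[i:i + SHA1_DIGEST_SIZE]
        for i in range(0, len(pieces_field), SHA1_DIGEST_SIZE)
    ]


def verify_piece(index: int, data: bytes, piece_hashes: List[bytes]) -> bool:
    """Return True if the SHA-1 of the downloaded data matches the metafile."""
    return hashlib.sha1(data).digest() == piece_hashes[index]
```

Only a piece for which `verify_piece` returns `True` would be marked as downloaded and offered to other peers.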
In Section II we present the structure of a generic virtual P2P architecture and then a BitTorrent-specific implementation. In Section III we discuss the challenges in implementing such a system.

II. A VIRTUAL P2P CLIENT
A. Proposed architecture

We present a virtual P2P client architecture, which is illustrated in Figure 2. In this architecture, most of the P2P-related signalling load is transferred to a server, which we shall call a surrogate. The surrogate deploys a virtual P2P client application and thus acts as a remote server for the user to access, organize, provision and monitor the control plane related services of her P2P application.

The existence of a surrogate is transparent to other P2P clients in the network. At the same time, the surrogate acts as the server side for the user's virtual client. By transferring control plane complexity to the surrogate, a simple user device with functionalities restricted to operating the GUI and downloading/uploading data could be used. Web-based GUIs are now commonly used for virtual clients, for example for email or desktop applications, and most user-device platforms are equipped with one or more web browsers. Cumulative industry growth projections estimate that there will be 20 billion Internet-connected devices by 2020 [4]. Web browsers gain richer features every day, and so do browser-based applications.

A surrogate implements a server, which receives the user's input through the GUI running on the web client inside the user device. An interaction layer is needed between the server and the virtual P2P client to establish communications between them. This layer transfers the GUI's input to the P2P client application, and the P2P application's status (e.g. connection established with tracker server, seeders found, etc.) to the server.

The proposed architecture has been designed to be usable with any P2P protocol. Since BitTorrent is currently the dominant P2P protocol, in the following we describe how our model can implement a virtual BitTorrent client. Note that although some parts of this description are BitTorrent-specific, this model could, with minor modifications, be adapted to any future P2P protocol.

Fig. 2. Proposed architecture. [Figure labels: End User; User Device; Data Plane Handler; Storage Handler; File Handling Module; Surrogate; Web Server; Interaction Layer; Virtual P2P Client (Control Plane Handler); P2P Comm. Module; Tracker Comm. Module; Metainfo File Processor; IP Forwarder; Tracker Server; Peers A to D; Peer List; Handshake; Downloads Data Pieces.]

B. Description of the Model

The functionalities of a P2P client can be broadly divided into a control plane and a data plane, which are split between the user device and the surrogate. We can offload all the control plane activities from the user device to the surrogate, with the great benefit of reducing resource usage and implementation complexity on the user device. Data plane processing could also be offloaded to the surrogate: the pieces of a file would be downloaded to the surrogate and then merged to rebuild the file. Depending on the device, the whole file could be transferred incrementally to the user as the pieces are received, or conversely be held by the surrogate for a later transfer of the completed file. These scenarios involve clear trade-offs between extra complexity in the user device and extra resource consumption in the server, but they do not pose any specific challenge. For reasons of efficiency, we only consider below the alternative where the client directly receives the pieces.

Fig. 3. Message sequence of virtual P2P client. [Sequence between User Device, Surrogate (Virtual P2P Client), Tracker Server and Peers: 1. user login and upload of metainfo file; 2. metainfo file; 3. peer list; 4.1 handshake with the active peer; 4.2 handshake complete; 5. active peer IP address and port; 6.1 request data piece from active peer; 6.2 download requested data piece; 6.3 storage handler keeps track of the downloaded pieces; 7. state-oriented message; 8. keep-alive message; 9. announce current state; 10.1 download the last data piece; 10.2 storage handler merges all pieces and copies the file to local storage of the user device.]
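In this model, pieces are downloaded separately and then merged to rebuild the file, with a storage handler keeping track of what has arrived. A minimal in-memory sketch of that bookkeeping is given below; the class `StorageHandler` and its method names are hypothetical, and a real handler would more likely write pieces to persistent storage than hold them in a dictionary.

```python
from typing import Dict, List


class StorageHandler:
    """Tracks downloaded pieces and merges them once all are present."""

    def __init__(self, num_pieces: int) -> None:
        if num_pieces <= 0:
            raise ValueError("num_pieces must be positive")
        self.num_pieces = num_pieces
        self._pieces: Dict[int, bytes] = {}

    def store(self, index: int, data: bytes) -> None:
        """Record one downloaded piece; pieces may arrive in any order."""
        if not 0 <= index < self.num_pieces:
            raise IndexError("piece index out of range")
        self._pieces[index] = data

    def missing(self) -> List[int]:
        """Indices of the pieces that still have to be requested."""
        return [i for i in range(self.num_pieces) if i not in self._pieces]

    def is_complete(self) -> bool:
        return len(self._pieces) == self.num_pieces

    def merge(self) -> bytes:
        """Concatenate all pieces in index order to rebuild the file."""
        if not self.is_complete():
            raise RuntimeError("cannot merge: pieces still missing")
        return b"".join(self._pieces[i] for i in range(self.num_pieces))
```

The same bookkeeping applies whether the handler runs in the user device or in the surrogate; only the location of the merged file differs.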
Control plane and data plane activities related to file transfer are shown in the following steps:

1) A user connects to the surrogate through a web browser. On successful user authentication, the end user uploads the metainfo file or provides the access (magnet) URI.
2) The surrogate transfers the metainfo file to the virtual P2P client that handles the control plane activities on behalf of the user and creates a session with the user. It communicates with the tracker server, requesting the peer list from the active swarm.
3) The tracker responds to the surrogate with the peer list.
4) The surrogate completes the initial handshakes with the peers and finds the active ones. The surrogate thus keeps tabs on the IP address and port number of each active peer available for BitTorrent data transfer.
5) The surrogate then returns the active peer list, with the respective IP addresses and port numbers, to the data plane handler in the user device.
6) The user device requests the file pieces from the active peers. The storage handler in the user device keeps track of the pieces downloaded.
7) The user device periodically exchanges state-oriented messages with the peers to keep them aware of the availability of different pieces of the file.
8) The user device sends keep-alive messages to the active peers to check whether the opened peer connections are alive.
9) The user sends state-oriented messages to the tracker through the surrogate to update the state of the user participating in the swarm.
10) Finally, after the last piece has been downloaded, the storage handler merges all pieces to produce the file, which is written to the local disk. The user’s machine uploads the file as long as the user’s browser window is open.
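Steps 6 to 8 rely on three kinds of peer messages: piece requests, state-oriented messages and keep-alives. The sketch below encodes them assuming the standard BitTorrent peer wire framing (a 4-byte big-endian length prefix followed by a 1-byte message ID and a payload). Mapping the state-oriented message of step 7 to the `have` message is our assumption, and the function names are hypothetical.

```python
import struct
from typing import Optional, Tuple

MSG_HAVE = 4      # announces that the sender holds a given piece
MSG_REQUEST = 6   # asks for a block inside a piece


def keep_alive() -> bytes:
    """A keep-alive is a frame with length zero and no message ID."""
    return struct.pack("!I", 0)


def have(piece_index: int) -> bytes:
    """Length 5: one ID byte plus a 4-byte piece index."""
    return struct.pack("!IBI", 5, MSG_HAVE, piece_index)


def request(piece_index: int, begin: int, length: int) -> bytes:
    """Length 13: one ID byte plus index, offset and block length."""
    return struct.pack("!IBIII", 13, MSG_REQUEST, piece_index, begin, length)


def decode(frame: bytes) -> Tuple[Optional[int], bytes]:
    """Return (message_id, payload); message_id is None for a keep-alive."""
    if len(frame) < 4:
        raise ValueError("frame shorter than length prefix")
    (length,) = struct.unpack("!I", frame[:4])
    if len(frame) != 4 + length:
        raise ValueError("frame length does not match prefix")
    if length == 0:
        return None, b""
    return frame[4], frame[5:]
```

Because each frame is a self-delimiting byte string, it can be carried unchanged over whatever transport the browser-based client uses.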
III. IMPLEMENTATION CHALLENGES
The proposed model requires a few modifications to the BitTorrent protocol and to the BitTorrent client running in the browser. In addition, traditional browsers have some limitations in accommodating peer-to-peer applications. We discuss these challenges in this section.

1) Bidirectional connections in a browser: Existing BitTorrent applications use the BitTorrent protocol at the application layer and open a bidirectional TCP connection with each peer. However, a browser does not natively “speak” the BitTorrent protocol. The WebSocket protocol [5] allows two-way communication with a server, initiated from the browser [4]. It can be used to tunnel BitTorrent messages as well as the keep-alive messages. Data exchanges between peers can further be facilitated with WebRTC, which is now available in mainstream browsers [6].

2) BitTorrent messages over HTTP: A browser-based client can be designed using the HTTP protocol, with BitTorrent messages as the payloads. Such clients can only communicate with peers that use similar browser-based clients. To communicate with traditional BitTorrent clients, hybrid clients could be designed with a module able to parse HTTP requests from browser-based clients. A hybrid client would then be able to seed to and leech from both types of implementation.

3) Downloading pieces from forwarded IP addresses: Due to the same-origin policy, the web browser constrains which requests the application can initiate and to which content originators. When the IP forwarder returns the IP addresses of the peers to connect to, the browser cannot initiate a connection to a peer whose origin lies outside the domain of the initial remote server. This can be solved using Cross-Origin Resource Sharing (CORS) [4], which provides a secure opt-in mechanism for client-side cross-origin requests.

4) Symmetric NAT traversal: A hole-punching method may not be able to connect two peers behind a symmetric NAT. In that case the user may be given two options: indirect download and remote download. In an indirect download, the virtual client relays file pieces and blocks from the peers to the user device. In a remote download, the virtual client completes the whole download and then notifies the user device to download the file from the cloud.
5) Keeping the browser open: Peers must keep their browser tab open while exchanging files. If the window is closed, both downloading and uploading (seeding) are terminated. Moreover, to ensure proper seeding and thus the availability of the content, peers should be encouraged to keep the window open until a satisfactory amount of data has been seeded.

6) Performance of the remote server: In our design, storage on the server side is restricted to metafiles, which are small and do not require much space. Communications with the user device require extra keep-alive messages for the service, but this is not heavy traffic. As server-side functions are modular, scalability can be handled easily with a classical Elastic Load Balancing (ELB) model. An example of such a deployment of a virtual SIP client in a public cloud can be found in [7].

IV. CONCLUSION
We have presented a scheme to virtualize P2P clients, using BitTorrent as a case study. The main point of this scheme is to reduce the load on user devices, but it also supports mobility and facilitates inter-networking. We propose the use of the ubiquitous web browser to implement the user interface of the virtual client.

In future work, we would like to explore the full potential of WebRTC-based messaging to implement this model. We also plan to study the trade-offs in the distribution of functions between the browser and the server, and to investigate better ways to facilitate interoperation with traditional P2P clients.

REFERENCES

[1] L. M. Vaquero, L. Rodero-Merino, J. Caceres, and M. Lindner, “A break in the clouds: towards a cloud definition,” ACM SIGCOMM Computer Communication Review, vol. 39, no. 1, pp. 50–55, 2008.
[2] J.-Ch. Grégoire and S. Islam, “Virtual Terminals for IMS,” Future Internet Services and Service Architectures, A. Prasad, J. Buford and V. Gurbani (eds.), River Publishers, Mar. 2011.
[3] R. L. Xia and J. K. Muppala, “A survey of BitTorrent performance,” Communications Surveys & Tutorials, IEEE, vol. 12, no. 2, pp. 140–158, 2010.
[4] I. Grigorik, High Performance Browser Networking: What Every Web Developer Should Know about Networking and Browser Performance. O’Reilly Media, Incorporated, 2013.
[5] I. Fette and A. Melnikov, “The websocket protocol,” RFC 6455, Dec. 2011.
[6] WebRTC. [Online]. Available: http://www.webrtc.org/
[7] S. Islam and J.-C. Grégoire, “Giving users an edge: A flexible cloud model and its application for multimedia,” Future Gener. Comput. Syst., vol. 28, no. 6, pp. 823–832, Jun. 2012.