Simulation: Transactions of the Society for Modeling and Simulation International 89(8) 1009–1019 © 2013 The Society for Modeling and Simulation International DOI: 10.1177/0037549713492437 sim.sagepub.com
An OPNET simulation model for peer-to-peer networks Mohammed Hawa
Abstract Peer-to-peer (P2P) traffic has increased rapidly over the past few years, with file sharing providing the main drive behind such traffic. Therefore, modeling P2P systems and studying their impact on the performance of the underlying network is a vital research topic. In this work we describe a flexible, efficient and easily expandable P2P simulation framework developed for the OPNET simulation package. The framework can be used to study a wide variety of issues regarding the packet-level and/or the flow-level performance of P2P systems, including the impact of cooperation incentives on file dissemination; mobility and quality of service (QoS) support in P2P distribution; and the effects of free-riding, resource pollution, and malicious attacks on P2P networks.
Keywords peer-to-peer, simulation, ED2K, OPNET, free-riding, reputation
1. Introduction Nowadays, peer-to-peer (P2P) networks contribute a sizable chunk to Internet traffic volume.1,2 The P2P paradigm was fostered by the low cost and high availability of personal computers, coupled with the ubiquity of the Internet. At the time of writing, the most popular P2P networks are the two file sharing networks: BitTorrent and eDonkey2000 (ED2K). Those two P2P networks combined account for between 51% and 99% of all P2P file-sharing traffic, depending on which part of the world you live in.2,3 Understanding the impact of P2P file sharing on the performance of the underlying network is an important and critical issue for service providers having to deal with the large demand file sharing imposes on their networks. This motivates the creation of an accurate and a scalable software simulation framework for P2P systems, which represents a valuable tool to researchers. In addition, such a simulation framework will certainly bring insights into the inner workings of P2P networks, which can inspire improvements to the design of P2P protocols. The majority of P2P simulators developed to date, such as P2PSim,4 PlanetSim,5 Overlay Weaver,6 and PeerSim7–9 target mainly the simulation of P2P overlay routing protocols (such as Chord, Kademlia, Tapestry, Koorde, and Pastry) and services on top of such overlay networks (such as DHT, CAST, DOLR, etc.).4–9 These simulators achieve scalability by avoiding packet-level details. Collected statistics usually include things like
success/failure rate of file lookup queries, latency, and the number of hops to locate resources. An exception is PeerSim, which provides an extra discrete-event simulation (DES) engine that sacrifices scalability to add details of the underlying transport layer. Even though file indexing and search queries represent a critical part of the overall P2P file sharing application, they do not impact the underlying network as much as the file dissemination process itself. Hence, there is a need for simulators that can quantify the influence of P2P file dissemination on the Internet infrastructure by addressing the packet-level performance. For example, in certain scenarios, ISPs might want to herd P2P traffic away from expensive or congested links to optimize their networks.10,11 In such cases, packet-level behavior and performance are significant.
The simulator described in this work (named iP2P) is extremely flexible, as it can simulate the overlay network thus achieving scalability, or (if the user so desires) can perform a more detailed packet-level evaluation of the P2P system. The framework is quite generic and can be used to study many aspects of P2P systems, such as file dissemination patterns, the differences between various P2P cooperation incentives, the effects of free-riding on P2P networks, etc. Because our iP2P framework is integrated into the OPNET12 simulation package, it can take into account the topology of the underlying network including the intricacies of the transport layer, the location of critical links, the configuration of the routers, the behavior of the wired or wireless MAC protocols and the background traffic mix, all in a DES environment.
Electrical Engineering Department, The University of Jordan, Amman, Jordan. Corresponding author: Mohammed Hawa, Electrical Engineering Department, The University of Jordan, Amman, 11942, Jordan. Email: [email protected]
Earlier attempts to build an integrated P2P simulator include: GnutellaSim13 where the Gnutella P2P protocol was implemented on a packet-level basis within the ns-2 package; the work of Eger et al.14 where a BitTorrent-like protocol was implemented with packet-level details, again in ns-2; and the work of Katsaros et al.15 where BitTorrent was implemented within the OMNeT++ simulator. To the best of the author’s knowledge, no P2P model exists for OPNET, which is a popular network simulation package in the research community. Hence, the main contributions of this work are as follows:
• The design and implementation of a P2P simulation framework for the OPNET package. This framework can be integrated with a wide range of simulation models provided by OPNET, such as MPLS, IPv6, LTE, Wi-Fi, WiMAX, etc., without extra effort on the researcher's part.
• The framework was designed to be flexible and extensible so that various implementations and alternatives of P2P systems can be incorporated in the future. In this work, we report on implementing ED2K using this framework, but an effort to implement BitTorrent is already underway, and will be reported in future publications.
• Validation for this simulation framework is performed against a real-life ED2K testbed, and the results are reported.
• In addition, the iP2P framework is showcased using a sample scenario related to the free-riding phenomenon in P2P systems.
The rest of this paper is organized as follows: Section 2 summarizes the two most popular P2P networks: BitTorrent and ED2K. Section 3 provides details of the iP2P simulation framework, which is validated in Section 4. A sample P2P simulation scenario illustrating the effects of free-riding on P2P systems is analyzed and discussed in Section 5, and the summary is presented in Section 6.
2. P2P file-sharing networks 2.1. The BitTorrent network In the BitTorrent network,16,17 peers in the system split the content of a shared torrent into pieces; each piece is
typically 256 kB in size. When sending a piece from an uploader peer to a downloader peer, the piece is split into sub-pieces (16 kB in size) called blocks. Each block is transmitted individually over the TCP stream established between the uploader and the downloader. Peers interact with each other in three stages: source selection, piece selection, and peer selection. The first stage (source selection) is executed at the downloader, where a downloader peer contacts a third party (called the tracker), which maintains a list of peers who have a copy of the torrent. The downloader peer asks the tracker for the IP addresses of a random subset of such peers to build its initial swarm. The next step is piece selection strategy, in which the downloader decides which piece to request next from remote peers in the swarm. The ‘‘rarest-piece first’’ algorithm is the piece-selection strategy used by BitTorrent. This strategy is augmented by the ‘‘strict-priority policy’’, which states that once the first block of a piece (the rarest piece) has been requested, the remaining blocks from that particular piece must be requested next. This attempts to complete the download of a piece as quickly as possible, as only complete pieces can be shared with other peers in the swarm. Finally, the peer-selection strategy is implemented at the uploader. This strategy decides which of the several downloaders requesting pieces from the uploader should be served first. This strategy is called the choke algorithm (or tit-for-tat) in BitTorrent.16,17 In the choke algorithm the uploader refuses to upload to (chokes) all interested downloaders except for exactly four peers. In other words, the uploader provides a total of four upload slots to send content to four downloaders. The uploader peer unchokes peers that it believes will reciprocate back by sending content (with a high rate) to the uploader. 
Further discussion on the BitTorrent protocol and the choke algorithm can be found in the literature.16–18
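The piece-selection step described above can be illustrated with a short sketch. It is not taken from any BitTorrent client: the function name, the data layout (sets of piece indices per peer), and the tie-breaking by piece index are all assumptions made for illustration.

```python
from collections import Counter


def pick_next_piece(have, in_progress, swarm):
    """Return the index of the next piece to request, or None.

    have        -- set of piece indices already downloaded in full
    in_progress -- set of piece indices with at least one block requested
    swarm       -- dict mapping peer id -> set of piece indices that peer holds
    """
    available = set().union(*swarm.values())

    # Strict-priority policy: finish a piece that has already been started.
    started = sorted(p for p in in_progress if p in available and p not in have)
    if started:
        return started[0]

    # Rarest-piece first: count how many peers hold each piece we still need.
    copies = Counter(p for pieces in swarm.values() for p in pieces)
    wanted = [p for p in copies if p not in have]
    if not wanted:
        return None
    # Ties are broken by piece index here (an assumption of this sketch).
    return min(wanted, key=lambda p: (copies[p], p))
```

For a swarm in which piece 2 is held by a single peer, the function returns 2 unless another piece is already in progress, in which case the started piece is completed first.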
2.2. The ED2K network In the ED2K file-sharing network, files are divided into chunks.18–20 Each chunk is approximately 9.28 MB, which is much bigger than a piece in BitTorrent. Chunks are further subdivided into smaller parts (10 kB in size) to be transferred over the TCP stream between the uploader and the downloader. Just as in BitTorrent, file download in ED2K goes through three stages: source selection, chunk selection, and peer selection. In the first stage, a downloader obtains a list of sources for the desired file from an ED2K server (or a DHT-based distributed-search network called the Kad network). An ED2K server is designed to respond with a subset of peers chosen randomly from all peers sharing the file. This is quite similar to the tracker behavior in BitTorrent. The downloader then contacts the sources of the file (the potential uploaders), and places an upload request for that file at the uploaders’ upload queues. The next step is peer selection at the uploader, in which the uploader arranges all upload requests from various downloaders in its upload queue based on a scoring system that the uploader maintains. In such a system, each upload request is assigned a score, which depends on three factors: (a) the time the upload request was placed in the upload queue; a longer waiting time in the queue means a higher score value, which emulates the well-known first-come first-served (FCFS) policy; (b) the priority assigned to the shared file by the uploader (e.g. for a newly released file); and (c) the reputation of the peer requesting the file; peers with better reputation (i.e. those who have delivered content in the past to the uploader) will receive preference compared with other peers. This is the reputation-based incentive mechanism known to combat free-riding in P2P systems. The final step is chunk selection, which is executed at the downloader. This happens when the upload request for a particular file reaches one of the first four positions in the upload queue. The uploader then notifies the corresponding downloader of this fact, and the downloader is allowed to decide which chunk to request from the uploader. The chunk-selection strategy employed by ED2K is based on the ‘‘rarest chunk first’’ and the ‘‘strict-priority’’ rules, similar to BitTorrent. Further details about the ED2K protocol and its incentive mechanism can be found in the literature.18–20
Figure 1. A sample iP2P network consisting of 498 peers, 1 server node, and 671 full-duplex links. The server node is located at the center bottom of the network map.
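The upload-queue scoring of Section 2.2 can be sketched as follows. The text names the three factors (waiting time, file priority, reputation) but not how they are combined; the simple product used here, along with every class and function name, is an assumption of this sketch rather than the ED2K rule.

```python
from dataclasses import dataclass


@dataclass
class UploadRequest:
    peer_id: str
    enqueued_at: float    # time the request entered the queue (seconds)
    file_priority: float  # uploader-assigned weight for the shared file
    reputation: float     # uploader's view of the requesting peer


def request_score(req, now):
    # Assumed combination: longer waits, higher file priority, and better
    # reputation all raise the score multiplicatively.
    waiting_time = now - req.enqueued_at
    return waiting_time * req.file_priority * req.reputation


def split_queue(queue, now, upload_slots=4):
    """Rank requests by score; the top `upload_slots` may select chunks."""
    ranked = sorted(queue, key=lambda r: request_score(r, now), reverse=True)
    return ranked[:upload_slots], ranked[upload_slots:]
```

With equal file priority and reputation the ordering reduces to FCFS, which matches the role the text assigns to the waiting-time factor.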
3. iP2P simulation framework The iP2P framework currently implements the ED2K protocol and, hence, borrows some of its terminology. ED2K was chosen only as a starting point: the iP2P framework was designed while keeping in mind the possibility of extending it in the future to simulate a wide variety of P2P systems, including BitTorrent. The idea was to build a generic framework for P2P simulations, which is highly efficient but also easily expandable. The iP2P simulation framework contains two types of nodes: a peer node and a server (tracker) node. Any number of these nodes can be used in any desired network topology. A sample scenario constructed using a group of iP2P nodes connected in a network (with power-law connection distribution) is shown in Figure 1. The reader can see multiple peer nodes and one server node. Such nodes are connected using routers and Ethernet switches. To allow for flexibility and scalability of the simulator, both the peer and server nodes are designed in two variants: the basic and the advanced variants. The advanced model runs the ED2K protocol on top of the OPNET TCP/IP protocol stack, thus taking into account all packet-level details including routing, segmentation/reassembly, packet loss, link failures, TCP connection establishment/teardown, TCP slow start, etc. The other variant is the basic model, which executes the ED2K protocol while bypassing the TCP/IP stack as shown in Figure 2. The basic model sacrifices packet-level details to allow for a much faster, more efficient, and more scalable simulation that concentrates on the effects of P2P protocol design, cooperation incentives and other high-level aspects of the P2P system. Both variants of the peer and server nodes share the same code base implementation of the ED2K P2P protocol as shown in Figure 2.
Figure 2. (a) The basic peer node model and (b) the advanced peer node model. Note the common p2p_peer process at the top of each model.
3.1. The server node The iP2P server node represents the equivalent of an ED2K server. It represents a global file index, and for each file, stores the IP addresses of the peers currently sharing this file. The server obtains such information when peer nodes offer their list of shared files to the server. Hence, a peer node needs to communicate with a server node (i) when it has a file (or a chunk of a file) to offer, or (ii) when it queries the server for sources of a file to download. In the second case, the server can be configured to respond with a list of all sources sharing that file or only with a random subset of such sources. The peer-server message exchange is illustrated in Figure 3. The OPNET process model for the iP2P server node is shown in Figure 4. The server node goes into the Update_Database state when it receives an OFFER FILES message, and goes into the Search_Database state when it receives the GET SOURCES message, at which time it replies with a FOUND SOURCES message. A purge timeout is assigned to each database record in the server. If the peer does not keep offering a file, the server assumes that the peer has failed or that it has stopped sharing that file, in which case the server purges that peer as a source for the file. This is done in the Purge_Old_Srcs state.
Figure 3. Messages exchanged between the peer node and the server node: (a) when offering a file and (b) when inquiring about sources of a file.
3.2. The peer node Peer nodes communicate with each other to either request the download of a file, or to perform the transfer of file content. These two message exchange scenarios are illustrated in Figure 5. In Figure 5(a), when the downloader (peer A) requests a file from the uploader (peer B), peer B stores an upload request for that file in B’s upload queue (assuming there is space in the queue), and replies with a QUEUE RANK message indicating the position of the downloader’s upload request in the upload queue. If the requested file does not exist (e.g. it has been deleted from peer B), or the queue is full, peer B replies with the proper message. When peer B is ready to upload to peer A (we say peer A was assigned an upload slot at the uploader), peer B contacts peer A by sending a READY TO UPLOAD message. The downloader (peer A) starts requesting the desired chunks from the uploader. The REQUEST PARTS message helps throttle the download rate at peer A (if so desired) since peer A can request any number of parts at its own pace. Peer B (the uploader) can also throttle the upload rate using a timer implemented for sending parts. A STOP UPLOAD message is sent if the downloader loses its upload slot. This can happen in ED2K when the downloader does not have the highest peer score among other peers.18 The OPNET model for the peer node is illustrated in Figure 6. The peer node model is much more complex than the server model. It includes both downloader and uploader functions. Note that the chunk-selection strategy at the downloader can be implemented either in the Request_Chunk state or the Request_Parts state, which provides flexibility to implement various P2P protocols. The peer-selection strategy, on the other hand, is implemented in the Mng_Queue state. A timer is included to create files regularly at each peer, adding such files to a Shared Files List. Another timer
simulates random deletion of files, thus moving them from the Shared Files List into a Deleted Files List. These two operations are performed in the two states Create_File and Delete_File, respectively. Another timer is used to (randomly) pick files for download. Once a peer decides to download a file, it places it in its download queue, which initiates a cascade of events including asking the server about possible sources sharing the file, and then contacting those sources one by one to request the file. The states that the peer goes through when downloading a file are: Pick_File, Get_Sources, Read_Sources, Pick_Source, Request_Chunk, Read_Response, Request_Parts, and finally Read_Part. Those states are shown in Figure 6 under ‘‘Downloader Functions’’. The ‘‘Uploader Functions’’, on the other hand, are initiated by the reception of a REQUEST message, which activates the Add_to_Queue state. A timer activates the Mng_Queue state which in turn calls the Send_Ready, Process_Reply and Send_Part states, in this order.
Figure 4. The server node OPNET process model. States shown: init_wait, init, wait, offline, Update_Database, Search_Database, and Purge_Old_Srcs.
Figure 5. Messages exchanged between peer nodes: (a) when requesting a file and (b) when transferring content. Messages shown: (a) REQUEST (file or chunk/piece), answered by QUEUE RANK, QUEUE FULL, or NOT FOUND; (b) READY TO UPLOAD, REQUEST PARTS, SENDING PART, STOP UPLOAD, and CANCEL TRANSFER.
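The uploader side of the request exchange in Figure 5(a) can be sketched as below. This is a plain Python illustration, not the OPNET process model: the class, its fields, and the queue limit are hypothetical, while the reply names (QUEUE RANK, QUEUE FULL, NOT FOUND) follow the messages shown in Figure 5.

```python
class Uploader:
    """Minimal stand-in for the Add_to_Queue behavior of a peer."""

    def __init__(self, shared_files, max_queue=3):
        self.shared_files = set(shared_files)
        self.max_queue = max_queue   # hypothetical Upload Queue size limit
        self.queue = []              # pending (peer, file) upload requests

    def on_request(self, peer, file_id):
        """Handle a REQUEST message and return (reply, rank)."""
        if file_id not in self.shared_files:
            return ("NOT FOUND", None)       # e.g. the file was deleted
        if len(self.queue) >= self.max_queue:
            return ("QUEUE FULL", None)
        self.queue.append((peer, file_id))
        # The rank is the 1-based position of the new upload request.
        return ("QUEUE RANK", len(self.queue))
```

A downloader that receives QUEUE RANK simply waits for READY TO UPLOAD; the other two replies would send it on to the next source in its list.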
Figure 6. The peer node OPNET process model.
3.3. Attributes and parameters Many iP2P model parameters (called attributes in OPNET) are presented for the end user to manipulate. These include the chunk size, part size, the number of files in the simulation (number of files created at the start of simulation, the maximum files per peer and the maximum files per simulation), the shared files mix (i.e. the percentage of audio, video, CD image, and archive files) in the simulation, and the file size statistics for each category (i.e. mean, variance, distribution, etc.). In addition, the maximum number of sources saved and contacted per file, and the maximum number of sources reported by the server in a FOUND SOURCES message can be controlled. The model also allows setting the random variables representing the time interval between file creations, file deletions, file downloads, and source contacts. Other settable attributes of the model include the maximum download and upload data rate for each peer, the maximum active upload slots at the uploader and the free-riding behavior of the peer. In addition, memory limits can be imposed by setting maximums on the Download Queue size, Upload Queue size, Known Peers List size, and the maximum number of incoming and outgoing TCP sessions.
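To make the attribute list concrete, a subset of the per-peer attributes could be grouped as in the sketch below. The field names are hypothetical and are not the OPNET attribute names; the chunk size, part size, upload slots, and data rates are the values used later in the sample simulation, and the queue limits are placeholders.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PeerAttributes:
    chunk_size_kb: int = 9280                  # 9.28 MB, taking 1 MB = 1000 kB (assumption)
    part_size_kb: int = 10
    max_upload_slots: int = 4
    upload_rate_kbps: Optional[float] = 10.0   # None means unlimited
    download_rate_kbps: Optional[float] = None
    free_rider: bool = False
    max_upload_queue: int = 100                # placeholder memory limit
    max_download_queue: int = 100              # placeholder memory limit

    def parts_per_chunk(self):
        # Ceiling division: a final short part still needs its own message.
        return -(-self.chunk_size_kb // self.part_size_kb)
```

Keeping such values in one record makes it easy to vary a single attribute (for example `free_rider`) across runs while holding the rest fixed.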
3.4. Collected statistics Various statistics are collected by the implemented iP2P simulation framework. The user can select all or a subset of such statistics that he/she wants to investigate. These include the number of created files, downloaded files, shared files, and deleted files. Shared files are also subcategorized into files shared as complete files and as partial files. The file download time, file life time, file size, and chunk download time are also recorded. In addition, the downloaded volume, uploaded volume, download rate, upload rate, and upload/download ratio (i.e. fairness index) are collected for each peer and for the group of all free-riders and all altruistic peers. Other collected statistics include the number of files being downloaded concurrently, number of files being
uploaded concurrently, number of download requests and upload requests, number of OFFER FILES and GET SOURCES messages sent and received, number of sources sharing a file, number of outgoing and incoming TCP sessions, and the Download Queue and Upload Queue sizes at various peers.
4. Model validation 4.1. Simulation versus empirical results To validate our implementation of the ED2K protocol in iP2P, we compare our simulation-based results against a small-scale, but real-life, ED2K testbed. The testbed consists of ten 2.0 GHz Intel Xeon computers, each running a 32-bit version of Microsoft Windows and an eMule v0.50a client.19 An exception is the tenth node, which runs, instead, a lugdunum eserver v17.7 (an ED2K server for Windows).21 The upload data rate of each eMule client is throttled to 400 kbit/s. The testbed nodes are interconnected by a 1 Gbps Ethernet switch and a built-in router. All of the nodes were symmetrically positioned in terms of network topology. Using OPNET and the iP2P simulation framework, we also implemented a topology matching the above ED2K environment. We performed two experiments. In each experiment, five different files were exchanged between the ED2K peers. The sizes of these files were: 7.86 MB, 50.14 MB, 99.4 MB, 175.05 MB, and 349 MB. Those sizes were selected to match real-life file sizes that can be found over popular file-sharing networks.22 ED2K peers perform a zlib-compression on the files before exchanging them to save network bandwidth. The file sizes after compression were: 7.01 MB, 49.87 MB, 98.64 MB, 173.73 MB, and 346.88 MB, respectively.
In the first experiment, the first node starts as the only source of one of the above-mentioned files, after which the second node starts downloading that file. Once the second node finishes the download and becomes a second source for the file, the third node starts downloading the file from the two available sources. This is followed by the fourth node downloading from three sources, and so on. This experiment illustrates the case of multiple uploading peers (sources) sending content to one downloading peer. The file download time for each downloading peer is recorded using the real-life testbed and the iP2P simulation model. The results are shown in Figure 7. We note how close the experimental and simulation results are in this case. It is interesting to see that as the number of sources for a file increases, the file download time drops.
In the second experiment, we set up the first node to share one file at a time, and all of the other nodes to request that file from that same source at exactly the same time. This represents the case of one uploader peer and multiple downloaders, and is different from the first experiment. It also illustrates the case of cooperation between multiple peers concurrently downloading one file by sharing the chunks they finish downloading with the other peers in the swarm. We record the file download time averaged over all downloading peers, and we compare the empirical results with the simulation results (see Figure 8). Note that the time to download a file increases with the file size. However, the relationship is not linear because larger files (i.e. those that contain more chunks) allow the different peers to obtain different chunks at the beginning of the download process, which allows sharing such chunks with other peers in the P2P network, thus effectively increasing the available download rate for that file.
Figure 7. Experimental and simulation-based results for the case of consecutive downloads of a file. [Plot: file download time (s) versus total number of sources, with empirical and simulation curves for file sizes 346.88 MB, 173.73 MB, 98.64 MB, 49.87 MB, and 7.01 MB.]
Figure 8. Experimental and simulation-based results for the case of concurrent downloads of a file. [Plot: average download time (s) versus file size (MB), with empirical and simulation curves.]
4.2. Model performance We also investigate the difference in performance between the two available iP2P model variants (basic and advanced) under similar conditions with regard to memory requirements and simulation runtime. We vary the number of peer nodes while maintaining the simulation parameters discussed later in Section 5.3. The results are shown in Table 1, illustrating the requirements of running the iP2P framework on a 2.0 GHz Intel Xeon processor with a 32-bit version of Microsoft Windows. The results illustrate the power of the iP2P model as it can be used in either the efficient and speedy basic mode, or the more detailed advanced mode with some increase in processing demands.
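The size of that increase can be read directly from Table 1. The short sketch below computes the advanced-to-basic ratios for the optimized-kernel runtimes and the allocated memory; the numbers are the Table 1 entries, while the tuple layout and function name are merely illustrative.

```python
# (peer nodes, basic runtime h, advanced runtime h, basic memory MB, advanced memory MB)
# Runtimes are the optimized-kernel figures from Table 1.
TABLE_1 = [
    (102, 2.0, 6.0, 22, 135),
    (306, 8.7, 22.0, 36, 227),
    (606, 26.3, 60.0, 62, 367),
    (1068, 60.0, 142.0, 110, 679),
]


def advanced_to_basic_ratios(rows):
    """Return (peers, runtime ratio, memory ratio) for each table row."""
    return [
        (peers, adv_h / basic_h, adv_mb / basic_mb)
        for peers, basic_h, adv_h, basic_mb, adv_mb in rows
    ]


for peers, runtime_ratio, memory_ratio in advanced_to_basic_ratios(TABLE_1):
    print(f"{peers:5d} peers: runtime x{runtime_ratio:.2f}, memory x{memory_ratio:.2f}")
```

For these four rows the advanced model takes roughly 2.3 to 3 times as long to run and needs about 6 times the memory of the basic model.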
Table 1. Simulation runtime and memory requirements for the iP2P basic and advanced simulation models. Measurements are for the OPNET 32-bit address space with both the optimized and development kernels.
Total peer nodes | Total switches or routers | Basic model: optimized kernel runtime | Basic model: development kernel runtime | Basic model: allocated memory | Advanced model: optimized kernel runtime | Advanced model: development kernel runtime | Advanced model: allocated memory
102 | 3 | 2 h | 4.7 h | 22 MB | 6 h | 13.3 h | 135 MB
306 | 6 | 8.7 h | 22 h | 36 MB | 22 h | 55 h | 227 MB
606 | 12 | 26.3 h | 61 h | 62 MB | 60 h | 139 h | 367 MB
1068 | 21 | 60 h | 126 h | 110 MB | 142 h | 294 h | 679 MB
5. Sample simulation (alleviating free-riding) To illustrate part of the capabilities of the iP2P simulation framework, we run a sample simulation to study the issue of free-riding in P2P systems. This study, however, is kept to a minimum for space considerations.
5.1. Definition of free-riding Free-riding in P2P systems describes the refusal of peers to give up some of their upload capacity to the good of the community.23 This was first observed in the Gnutella file sharing network, where it was reported that nearly 70% of Gnutella users do not share any files, and peers that volunteer to share files are not necessarily those who have desirable files.24 This trend continued with all generations of P2P networks, including BitTorrent and ED2K.20,25 Identifying and controlling free-riding behavior in P2P file sharing networks is a critical issue for the survival of such systems. To this end, cooperation incentive mechanisms17,18 were devised to ensure cooperation of selfish peers. In such mechanisms, downloaders are forced to reward uploaders in order to compensate for their resource consumption and encourage further altruistic behavior. The two most popular cooperation incentive mechanisms in use nowadays are the reputation-based incentive mechanism used in the ED2K network (also known as credit-based) and the barter–trade mechanism used by BitTorrent (also known as tit-for-tat strategy). Using the iP2P framework, we study and attempt to quantify the effectiveness of the ED2K reputation-based cooperation incentives in combating free-riding behavior.
5.2. ED2K credit system As explained earlier, peers with good reputation in ED2K (i.e. those who upload more) get better treatment from other peers in the network. In ED2K terminology, a peer accumulates credit (i.e. reputation) at remote peers by uploading content to them. Depending on the amount of bytes the uploader sent to a remote peer, the remote peer calculates a credit score (reputation score) $S_r$ for that uploader, which increases along with received bytes. The formula used in ED2K to calculate the reputation score for an uploader is
$$S_r = \min\left\{ \frac{2U_t}{D_t},\; 0.2854 \times (U_t - 1),\; \sqrt{U_t + 2} \right\} \tag{1}$$
where $U_t$ is the total amount of megabytes that the uploader sent in the past and $D_t$ is the total amount of megabytes the uploader got in return (in the past as well). When the uploader becomes a downloader and requests content from remote peers, remote peers use the previously calculated reputation score $S_r$ in combination with a FCFS waiting-time score (called $S_w$) to allow for preferential treatment for peers with better reputation (i.e. such peers advance faster in the upload queue).
5.3. Simulation setup In this sample simulation we will investigate the cooperation incentive method that the ED2K network uses to alleviate free-riding. To that end, we will compare three different setups in our simulation:
• NC scenario: This will be the worst-case scenario, in which free-riders will refuse to share any content they download and thus will provide no contribution whatsoever to the P2P system.
• W scenario: In this scenario, free-riders will only share files they are currently downloading. Upon completing a file download, free-riders will immediately delete the file from their shared folder. Hence, free-riders will provide limited contribution to the system (something the ED2K network requires from its peers). However, in this scenario no cooperation incentive mechanisms are employed by the P2P system to further control free-riding. In other words, peers (when uploading content) treat all other peers equally in a FCFS fashion, irrespective of the fact that the downloader might be a free-rider.
• WR scenario: This is similar to the W scenario, except now the P2P system uses the ED2K reputation-based cooperation incentives (see equation (1)) in order to identify and control free-riders by giving preferential treatment to altruistic peers.
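The reputation score of equation (1), which is what separates the WR scenario from the W scenario, can be evaluated with the sketch below. It takes the score as the minimum of a ratio term, a linear term 0.2854 × (U_t − 1), and a square-root term. The function name is illustrative, and treating the ratio term as unbounded when D_t = 0 is an assumption of this sketch; no clamping of the result is applied because the text specifies none.

```python
import math


def reputation_score(uploaded_mb, downloaded_mb):
    """S_r of equation (1).

    uploaded_mb   -- U_t, megabytes the uploader sent in the past
    downloaded_mb -- D_t, megabytes the uploader received in return
    """
    terms = [
        0.2854 * (uploaded_mb - 1),   # linear term
        math.sqrt(uploaded_mb + 2),   # square-root term
    ]
    if downloaded_mb > 0:
        # Ratio term; skipped (treated as unbounded) when D_t = 0.
        terms.append(2 * uploaded_mb / downloaded_mb)
    return min(terms)
```

A peer that has uploaded 50 MB and received 100 MB in return is limited by the ratio term, whereas a peer that has only uploaded is limited by the linear or square-root term, so its score keeps growing with the volume it contributes.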
We aim to quantify the enhancement introduced by the two improvements described in the W and WR scenarios. We emulate a small-sized P2P file sharing network consisting of 306 peer nodes and a single server node. The main parameters chosen for the simulation are shown in Table 2.
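Each run in this section is characterized by a content demand rate D, an upload capacity C contributed by the altruistic peers, and a load L = D/C. The sketch below computes these quantities from the simulation parameters (41 MB mean file size, 48 h mean time between requests, 306 peers, 10 kbit/s upload capacity per peer). The unit conventions (1 MB = 2^20 bytes, 1 kbit = 1000 bits) and all function names are assumptions of this sketch.

```python
def demand_rate_bps(avg_file_mb, peers, avg_request_interval_h, bytes_per_mb=2**20):
    """Content demand rate D, converted to bit/s."""
    bits_requested = avg_file_mb * bytes_per_mb * 8 * peers
    return bits_requested / (avg_request_interval_h * 3600)


def upload_capacity_bps(peer_upload_kbps, peers, free_rider_fraction):
    """Upload capacity C offered by the altruistic peers, in bit/s."""
    altruistic_peers = peers * (1 - free_rider_fraction)
    return peer_upload_kbps * 1000 * altruistic_peers


def system_load(free_rider_fraction, peers=306, avg_file_mb=41,
                avg_request_interval_h=48, peer_upload_kbps=10):
    """System load L = D / C for a given fraction of free-riders (< 1)."""
    d = demand_rate_bps(avg_file_mb, peers, avg_request_interval_h)
    c = upload_capacity_bps(peer_upload_kbps, peers, free_rider_fraction)
    return d / c


for fraction in (0.5, 0.6, 0.7, 0.8, 0.9):
    print(f"{fraction:.0%} free-riders: load = {system_load(fraction):.1%}")
```

Under these unit assumptions the load reaches 100% just above 80% free-riders and is close to 200% at 90% free-riders.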
Hawa
1017
Table 2. Main simulation parameters. Parameter
Description
File size (MB) Files created at start (each peer) Time between file requests (h) Chunk size Part size Max upload slots (at each peer) Download capacity (each peer) Upload capacity (each peer)
Uniform (3, 79) 10 Uniform (12, 84) 9.28 MB 10 kB 4 Unlimited 10 kbit/s
Based on the above parameters, the rate of content demand (D) on the system is
$$D = \frac{\text{Average file size} \times \text{Number of peers}}{\text{Average time between file requests}} = \frac{41\ \text{MB} \times 306}{48\ \text{hours}} \qquad (2)$$
Different runs are carried out as we increase the percentage of peers acting as free-riders. This reduces the total upload capacity, C, of the network, which is provided by the altruistic peers (non-free-riders). This capacity is calculated as
$$C = \text{Peer upload capacity} \times \text{Altruistic peers} \qquad (3)$$
We define the system load, L, as the ratio of the content demand rate to the upload capacity of the network, i.e. L = D/C.
Figure 9. Number of downloaded files by all peers at the end of the 16-week period. (Plot of the number of downloaded files against the percentage of free-riders.)
Figure 10. Average file download time for files downloaded in the 16-week period. (Plot of the average file download time in days against the percentage of free-riders.)
Figure 11. Average upload queue size at altruistic peers. (Plot of the average upload queue size at altruistic peers against the percentage of free-riders.)
5.4. Results and discussion
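The demand rate D, the upload capacity C and the load L = D/C can be evaluated numerically for any percentage of free-riders. The Python sketch below does so under two assumptions that the paper does not state: 1 MB is taken as 2^20 bytes and 1 kbit as 1000 bits. The number of altruistic peers is also treated as a continuous quantity. All function names are illustrative.

```python
BITS_PER_MB = 8 * 2**20  # assumption: 1 MB = 2**20 bytes
SECONDS_PER_HOUR = 3600


def demand_rate_kbps(avg_file_mb: float = 41.0, peers: int = 306,
                     avg_request_gap_h: float = 48.0) -> float:
    """Content demand rate D in kbit/s (1 kbit = 1000 bits assumed)."""
    bits_per_second = (avg_file_mb * BITS_PER_MB * peers
                       / (avg_request_gap_h * SECONDS_PER_HOUR))
    return bits_per_second / 1000


def capacity_kbps(free_rider_fraction: float, peers: int = 306,
                  upload_kbps: float = 10.0) -> float:
    """Upload capacity C in kbit/s, provided by altruistic peers only."""
    altruistic_peers = peers * (1 - free_rider_fraction)
    return upload_kbps * altruistic_peers


def load(free_rider_fraction: float) -> float:
    """System load L = D / C for a given fraction of free-riders."""
    return demand_rate_kbps() / capacity_kbps(free_rider_fraction)


if __name__ == "__main__":
    for fraction in (0.5, 0.6, 0.7, 0.8, 0.9):
        print(f"{fraction:.0%} free-riders: L = {load(fraction):.1%}")
```

Under these assumptions D is about 609 kbit/s, which gives a load of roughly 99.5% at 80% free-riders and roughly 199% at 90% free-riders. With 1 MB taken as 10^6 bytes instead, the same percentages give about 95% and 190%.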
We run each of the three scenarios (NC, W and WR) described earlier for a total of 16 weeks (4 months) of simulated time. The results are shown in Figures 9–12. Figure 9 shows the total number of downloaded files at the end of the 16-week period. The figure shows that an increase in free-riders in a P2P system can easily cause the system to collapse. If the percentage of free-riders in the system increases beyond 80%, the system demand outstrips the available upload capacity, and since free-riders delete their files, the system degrades due to lack of resources (see the NC case). Forcing free-riders to provide some contribution (even if small) can delay the point of collapse, as seen in the W scenario. This is because while free-riders are downloading content, their upload capacity contributes to the total capacity of the network. An interesting observation is that the amount of upload capacity gained from free-riders can be estimated by comparing the L = 99% break point in the NC scenario with the L = 200% break point in the W scenario.
Figure 12. Average download queue size at all peers. (Plot of the average download queue size at all peers against the percentage of free-riders for the NC, W and WR scenarios.)
Figure 10 shows the average file download time over all downloaded files in the 16-week period. Note that even though the collapse of the system happens at a later point in the W scenario, altruistic peers and free-riders both suffer excessively large file download times after the L = 99% mark. This is unfair to altruistic peers and is a result of the FCFS behavior of the system. Larger file download times result because free-riders immediately delete completed files and thus deprive the system of such resources, which is quite detrimental at high loads.
However, note that the credit system used by ED2K (the WR scenario) has the ability to differentiate between altruistic peers and free-riders, giving preferential treatment to altruistic peers and thus reducing their file download time compared with free-riders. This is quite different from both the NC and W scenarios. An important side effect of this differentiation is that the average file download time calculated over all peers in the WR scenario is smaller than that in the W and NC scenarios. The reason is that preferential treatment for altruistic peers results in files migrating to altruistic peers faster (on average) than to free-riders, thus providing more sources for any particular file in the system at any given time and hence reducing the overall file download time. Had files migrated to free-riders first, such files would have been deleted and the file download time would not have improved. Hence, preferential treatment of altruistic peers is not only advantageous to altruistic peers, but is also advantageous to the whole system.
Figure 11 shows the average upload queue size at altruistic peers. This is a rough measure of the memory and processing power requirements at altruistic peers. If
such requirements exceed a certain point, altruistic peers will be tempted to leave the system or to free-ride themselves. Hence, it is to be expected that the collapse of a P2P system will accelerate once the memory and processing requirements on altruistic peers exceed a certain point, resulting in a catastrophic collapse of the P2P network rather than a gradual one. We note that the WR scenario places the least demand on resources, but unfortunately is not that far from the W scenario, which might explain why users of eMule (the main client for the ED2K network) always complained about the heavy demand on their systems.26
It is worth mentioning that the drop in the average upload queue size in the NC scenario at very high loads (i.e. a large percentage of free-riders) is due to a complete system collapse rather than a reduction in processing requirements at altruistic peers. In other words, as the high percentage of free-riders deleted their files, the small percentage of altruistic peers were no longer able to obtain many files to share, and thus were not able to respond to upload requests from other peers in the system because they did not have the files to begin with. Hence, all other peers in the system are starved of viable sources of shared files, and their download queue sizes increase dramatically. This is clearly seen in Figure 12, which shows the download queue size averaged over all peers in the system. Note the sharp increase in the average download queue size in the NC scenario as the percentage of free-riders increases.
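The difference between FCFS service (the NC and W scenarios) and credit-based service (the WR scenario) comes down to how an uploader orders its queue. The Python sketch below contrasts the two orderings. The scoring rule, waiting time multiplied by a credit value, is a simplifying assumption made for illustration and is not the paper's equation (1). The `Request` record and the function names are likewise hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Request:
    """Hypothetical upload request waiting in an uploader's queue."""
    peer_id: str
    arrival_time: float  # time at which the request joined the queue
    credit: float = 1.0  # illustrative reputation score, higher is better


def fcfs_order(queue: list) -> list:
    """Serve requests strictly in order of arrival, ignoring reputation."""
    return sorted(queue, key=lambda r: r.arrival_time)


def credit_order(queue: list, now: float) -> list:
    """Serve the highest score first; score = waiting time * credit (assumed)."""
    return sorted(queue,
                  key=lambda r: (now - r.arrival_time) * r.credit,
                  reverse=True)


queue = [
    Request("free_rider", arrival_time=0.0, credit=1.0),
    Request("altruist", arrival_time=10.0, credit=5.0),
]
fcfs_first = fcfs_order(queue)[0].peer_id               # "free_rider"
credit_first = credit_order(queue, now=20.0)[0].peer_id  # "altruist"
```

In this toy queue the free-rider arrived first and is served first under FCFS. Under the credit rule the altruistic peer scores 10 × 5 = 50 against 20 × 1 = 20 for the free-rider, so it moves to the front despite arriving later.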
6. Summary
In this work, we described iP2P, a P2P simulation framework to be integrated with OPNET. The implementation details were briefly described. Because iP2P provides packet-level details, it can generate a more accurate picture of real-life P2P environments, allowing researchers to explore various aspects of P2P systems. Those who can do without packet-level details can gain in terms of runtime and memory requirements if they use the highly efficient iP2P basic model.
An effort to implement the BitTorrent protocol inside the iP2P framework has already started, and we will report on the progress in due time. The implementation will not only provide an extra P2P simulation model for researchers, but will also provide a proof-of-concept for the flexibility and expandability of the iP2P simulation framework. Owing to such flexibility and the large model base provided by OPNET, we also envision that this framework can represent a cornerstone for future simulation models that implement various distributed networking protocols, such as distributed backup, distributed Web caching, content delivery networks (CDNs), and Grid computing,27,28 to name a few.
The iP2P framework was used in this paper to obtain some insights into the effects of free-riding on P2P systems and to investigate available methods to alleviate it. Observed results suggest that forcing free-riders to share content while downloading represents an important step in overcoming free-riding and can help avoid the premature collapse of the system.
Funding
This work was supported in part by the Deanship of Academic Research at The University of Jordan.
References
1. Garetto M, Figueiredo DR, Gaeta R and Sereno M. A modeling framework to understand the tussle between ISPs and Peer-to-Peer file-sharing users. Perform Eval 2007; 64(9–12): 819–837.
2. Schulze H and Mochalski K. Internet study 2008/2009. Online Report, Ipoque, 2009, http://www.ipoque.com/resources/internetstudies/internet-study-2008_2009/. Retrieved January 2013.
3. Schulze H and Mochalski K. Internet study 2007. Online Report, Ipoque, 2007, http://www.ipoque.com/resources/internetstudies/internet-study-2007/. Retrieved January 2013.
4. P2PSim official website, http://pdos.csail.mit.edu/p2psim/. Retrieved January 2013.
5. PlanetSim official website, http://ants.etse.urv.es/planet/planetsim/. Retrieved January 2013.
6. Overlay Weaver official website, http://overlayweaver.sourceforge.net/. Retrieved January 2013.
7. PeerSim official website, http://peersim.sourceforge.net/. Retrieved January 2013.
8. Jelasity M, Montresor A and Babaoglu O. Gossip-based aggregation in large dynamic networks. ACM Trans Comput Syst 2005; 23: 219–252.
9. Montresor A and Ghodsi A. Towards robust peer counting. In: Proceedings of the 9th IEEE International Conference on Peer-to-Peer Computing, 2009, pp. 143–146.
10. Wang JH, Wang C, Yang J and An C. A study on key strategies in P2P file sharing systems and ISPs’ P2P traffic management. P2P Netw Appl 2011; 4(4): 410–419.
11. Ribeiro B, Figueiredo D and Towsley D. Herding BitTorrent traffic away from expensive ISP links. UMass CMPSCI Technical Report UM-CS-2008-029, University of Massachusetts, 2008.
12. OPNET official website, http://www.opnet.com/. Retrieved January 2013.
13. He Q, Ammar M, Riley G, Raj H and Fujimoto R. Mapping peer behavior to packet-level details: A framework for packet-level simulation of Peer-to-Peer systems. In: Proceedings of the 11th IEEE International Symposium on Modeling, Analysis, and Simulation of Computer and Telecommunications Systems, 2003, pp. 71–78.
14. Eger K, Hoßfeld T, Binzenhöfer A and Kunzmann G. Efficient simulation of large-scale P2P networks: packet-level vs. flow-level simulations. In: Proceedings of the 2nd workshop on use of P2P, GRID and agents for the development of content networks, 2007, pp. 9–16.
15. Katsaros K, Kemerlis VP, Stais C and Xylomenos G. A BitTorrent module for the OMNeT++ simulator. In: Proceedings of the 17th IEEE International Symposium on Modeling, Analysis and Simulation of Computer and Telecommunication Systems, 2009, pp. 361–370.
16. Cohen B. Incentives build robustness in BitTorrent. In: Proceedings of the 1st Workshop on Economics of Peer-to-Peer Systems, 2003, pp. 1–5.
17. Huang K, Wang L, Zhang D and Liu Y. Optimizing the BitTorrent performance using an adaptive peer selection strategy. Fut Gen Comput Syst 2008; 24(7): 621–630.
18. Hawa M. Cooperation incentives: issues and design strategies. In: Antonopoulos N et al. (eds.), Handbook of Research on P2P and Grid Systems for Service-Oriented Computing: Models, Methodologies and Applications. Information Science Publishing, 2009.
19. eMule official website and source code, http://www.emuleproject.net/. Retrieved January 2013.
20. Caviglione L and Davoli F. Traffic volume analysis of a nation-wide eMule community. Comput Commun 2008; 31(10): 2485–2495.
21. lugdunum eserver official website, http://lugdunum.shortypower.org/files/. Retrieved January 2013.
22. Hawa M, Rahhal JS and Abu-Al-Nadi DI. File size models for shared content over the BitTorrent Peer-to-Peer network. P2P Netw Appl 2012; 5(3): 279–291.
23. Karakaya M, Körpeoğlu İ and Ulusoy Ö. Counteracting free riding in Peer-to-Peer networks. Comput Netw 2008; 52(3): 675–694.
24. Adar E and Huberman B. Free riding on Gnutella. First Monday 2000; 5(10): 1–22.
25. Hughes D, Coulson G and Walkerdine J. Freeriding on Gnutella revisited: the bell tolls? IEEE Distrib Syst Online 2005; 6(6). DOI: 10.1109/MDSO.2005.31.
26. eMule Discussion Forum, http://forum.emule-project.net/index.php?showtopic=136707. Retrieved September 2009.
27. Cao J. ARMSim: A modeling and simulation environment for agent-based Grid computing. Simulation 2004; 80(4–5): 221–229.
28. Mocskos EE, Yabo P, Turjanski PG and Fernández Slezak D. Grid Matrix: a grid simulation tool to focus on the propagation of resource and monitoring information. Simulation 2012; 88(10): 1233–1246.
Author biography
Mohammed Hawa graduated from the University of Kansas in 2003 with a PhD in Electrical Engineering. He received his MSc degree from University College London in 1999 and his BSc degree from the University of Jordan in 1997. He is the recipient of the Fulbright Scholarship (1999) and the Shell Centenary Scholarship (1998). He is a published author and a member of the IEEE and IAENG. He is currently an Associate Professor of Electrical Engineering at the University of Jordan. His main research interests include networking, quality-of-service and peer-to-peer networks.