Telecommun Syst (2008) 39: 103–116 DOI 10.1007/s11235-008-9115-z
Validating BitTorrent models David Erman · Daniel Saavedra · José Á. Sánchez González · Adrian Popescu
Published online: 11 July 2008 © Springer Science+Business Media, LLC 2008
Abstract BitTorrent (BT), a Peer-to-Peer (P2P) distribution system, is a major bandwidth consumer in the current Internet. This paper reports on a measurement study of BT traffic intended to identify potential traffic invariants. To this end, we employ high-accuracy packet capture hardware together with specialized parsing software. Important characteristics regarding BT sessions and messages, as well as general system characteristics, are reported. The results corroborate that characteristics such as session inter-arrival times are exponentially distributed, while other characteristics are shown to differ from previously reported results. These differences are attributed to changes in the core BT algorithms. Further, it is observed that long- and heavy-tailed distributions can be used to model several characteristics. The log-normal, Pareto and mixtures of these distributions are used to model session sizes and durations, while the Weibull distribution has been observed to model message inter-arrival and inter-departure times for bandwidth-limited clients.

Keywords Peer-to-Peer · Network measurements · BitTorrent · Traffic characteristics
D. Erman (✉) · A. Popescu
Department of Telecommunication Systems, School of Engineering, Blekinge Institute of Technology, 37179 Karlskrona, Sweden
e-mail: [email protected]

D. Saavedra
C/ Muller 42, 1-11, C.P. 28029, Madrid, Spain

J.Á. Sánchez González
Avda. Portugal, 3, Olula del Río, 04860, Almería, Spain
1 Introduction

Given the importance of P2P applications for both users and network operators, the number of measurement studies of various P2P systems has been growing steadily. These measurements can be performed using many approaches, in order to obtain information at many different levels. For example, in the measurement study reported in [11], the authors used crawlers for the Napster and Gnutella networks. A crawler is a piece of software that iteratively retrieves data from a network. Based on this data, the authors reached important conclusions about the non-cooperativeness and diversity of the peers involved in the systems, which is contrary to the design assumptions these systems were built upon. In the study presented by Sen and Wang [12], the authors use flow-level information collected at border routers across a large ISP network to characterise P2P system behaviour. They provide metrics for the FastTrack, Gnutella and DirectConnect systems at different levels of spatial aggregation. They observe that the systems exhibit significant dynamics at short time scales at the IP layer. At a higher aggregation level, however, using prefixes instead of whole IP addresses, the fraction of P2P traffic contributed by each entity is observed to be more stable than WWW traffic or overall traffic. Even though the BT protocol is over five years old, measurement studies of it are not abundant. In [3] and [2], we reported BT message and session models. We provided models for session metrics, such as inter-arrival times, sizes and durations, and concluded that the BT system behaves well with respect to session sizes and durations. We also obtained accurate models for the most bandwidth-consuming BT messages. Using an instrumented BT client, Legout et al. [6] explore the properties of two of BT’s key algorithms: the choke and
the rarest-first algorithms. They show that both algorithms perform remarkably well, and point out that an older version of the choke algorithm suffered from several problems, including permitting free riding. They also explore the dynamics of the peer set a client is connected to and evaluate protocol overhead. In [9], a theoretical approach is taken. The authors present a simple fluid model for BT-like P2P networks and compare results of simulations using the model with real traces. They claim the model can capture the behaviour of the system even when the arrival rate is small. They also study the built-in incentive mechanism of BT, the choke algorithm. Their results show that the numbers of seeds and leechers follow a Gaussian distribution in steady state. Instrumented client data and tracker logs from a specific torrent spanning five months were used in [5] to study the BT system. The results demonstrate the ability of the BT protocol to sustain flash crowd events. This, together with the good performance in terms of throughput per client, allows the authors to claim that BT is highly effective at replicating content. Complementing this long-term study, an active measurement approach was taken in [8]. That study was performed on more than two thousand torrents hosted on a popular WWW site over eight months, giving important information about torrent dynamics.

1.1 Motivation

BT is one of the most popular file distribution protocols in the Internet. As such, it is important to characterize the traffic patterns and dynamics of the protocol. In the work presented in this paper, we have opted to characterize BT traffic by fitting various distributions to the session characteristics of BT nodes and estimating the parameters of these distributions. The primary goal of this work was to corroborate or refute our previous findings, and to provide parsimonious models to be used in, e.g., simulation studies and for traffic engineering.
2 The BitTorrent system

BT is a P2P system primarily used for distributing large files and reducing the load of content providers’ servers. It employs a modified tit-for-tat scheme to encourage up- and download reciprocation between peers. In contrast to other P2P protocols, BT does not include any lookup or search functionality for content. Therefore, users must rely on other means, mainly the WWW, to find the corresponding torrent file, which gives information about the content and where to download it. Furthermore, another particular feature of BT is that there is not a single BT network, as is the case for, e.g., eDonkey, Gnutella and others, but rather several separate networks, as many as the content sets being shared. These separate networks are called swarms. A peer may participate in an arbitrary number of swarms simultaneously. Two other important characteristics of the protocol are swarming and fairness. Swarming means that the data is divided into pieces, so a peer can get the pieces from different peers, thus sharing the total load between the peers. Fairness is achieved as the protocol introduces an algorithm to guarantee a reasonable level of upload and download reciprocation, thus penalizing free riders, i.e., peers that never upload. The BT protocol suite comprises two protocols, the peer wire protocol and the tracker protocol, as well as the torrent file structure. The torrent file includes all the information needed by a peer to start downloading a file, as well as a way to check the integrity of the downloaded data. The tracker protocol is used by the peer to acquire information from the tracker, mainly information concerning the other peers connected to the swarm. Finally, the peer wire protocol, also known as the peer or P2P protocol, is used by the peers to communicate and specifies both the signalling and the data transfer. More detailed descriptions of the BT system are provided in [1, 2, 10]. This paper focuses on the peer wire protocol, since it is the biggest contributor to traffic load. The term BT protocol will therefore be used for the peer wire protocol. The other protocols will be explicitly named, and the term BT protocol suite will be used to refer to the whole set of protocols and the torrent file convention. There are a large number of different BT client implementations. This means that many, albeit minor, differences between the protocol suite implementations exist. Additionally, several of these clients implement extensions to the protocol. In this paper we report on measurements performed on the official BT client, also called the reference client, version 4.0.4. The results were obtained as part of the work described in [10], where additional models are also provided.

2.1 Terminology

We adopt the following terminology in this paper:

Peer/Client. While all BT clients connected to a swarm are peers, the term client will be used for the BT client running on the local machine.

Piece/Block. A piece refers to a portion of the content data as defined in the torrent file, i.e., with a corresponding SHA-1 hash to verify its integrity. A block denotes a specific byte range within a piece, which is used when requesting data from a peer.

Leechers/Seeds. Both seeds and leechers are peers; the difference between them is that a seed only uploads, as it already has the whole content. A leecher, however, is
still downloading the content, reciprocating the data it has downloaded to other peers in order to achieve better download rates. Leecher phase and seed phase refer to when the client is a leecher or a seed, respectively.

Meta-data. The torrent file, or meta-info file, contains all the information necessary for a BT client to connect to a swarm and download the swarm content. The meta-info file is bencoded, a simple encoding scheme that structures data into strings, integers, dictionaries and lists. Torrent files are typically downloaded using HTTP, and are not distributed using BT itself.

2.2 The peer protocol

The peer protocol operates over TCP, with data and signalling traffic transferred together in the same connection. The connection between peers is symmetrical in the sense that messages sent in both directions are identical in structure and format, and data can flow in both directions. The message flow starts with a mandatory initial handshake (Fig. 1) by the initiator of the connection, which has already established a TCP session. The handshake message contains a length-prefixed string identifying the BT protocol, followed by eight reserved bytes, the 20-byte info_hash (the SHA-1 hash value of the data contained in the info key in the torrent file), which identifies the content (as some peers might be connected to more than one swarm), and the 20-byte peer_id, which identifies the initiator. The values of the peer_id and info_hash fields must correspond to the values sent to the tracker arbitrating the specific set of content. Upon reception of a handshake, a peer responds with the same info_hash but with its own peer_id. BT messages can be classified based on their specific function in the protocol, as follows:

• Related to the choking algorithm: interested, not interested, choke and unchoke.
• Related to data transfer: request, piece, have and cancel.

There are two additional peer messages, the keepalive and bitfield messages. The keepalive message has no payload
and is used to keep the connection between peers open. The bitfield message may only be sent immediately after the initial handshake. Its payload is a bitmask in which each set bit represents a piece that the sending peer has. For peers that do not have any pieces, the message is optional.

Fig. 1 BT handshake procedure

2.3 The tracker protocol

The tracker is an HTTP or HTTPS service listening for GET requests from BT peers. A peer connects to the tracker using the address provided in the announce field of the torrent file. The peer sends an HTTP GET request with parameters added using standard CGI methods, i.e., a ‘?’ after the announce URL, followed by ‘param = value’ sequences separated by ‘&’.

2.4 The choking algorithm

The choking algorithm introduces fairness to the BT protocol, in the sense that permission to download is granted in return for uploading. This is known as the tit-for-tat algorithm. A choked peer is not allowed to download, while an unchoked peer is. Choking is also used to avoid the performance degradation caused by the bad behaviour of TCP congestion control when sending over many connections at once. The protocol specification proposes one choking algorithm, but allows different algorithms to be used as long as they fulfil the following requirements [1]:

1. The new algorithm works well in a network consisting entirely of itself, and also in a network consisting mostly of the standard algorithm.
2. Avoids rapid choking and unchoking, known as fibrillation.
3. Caps the number of uploading connections for good TCP performance.
4. Reciprocates to peers that let the client download.
5. Tries out unused connections to find ones that offer better download rates.
6. Allows one peer to be unchoked regardless of its upload rate, so that peers without any data can join the swarm.

The algorithm described in the protocol specification is claimed to fulfil all these requirements. To implement the choke algorithm, the client maintains two states for each connected peer:

Choked: indicates whether the remote peer has choked the client. This means that the client cannot download from the remote peer until the client has been unchoked.

Interested: indicates whether the client is interested in downloading something from the remote peer.
It should be noted that the client also needs to keep a record of whether it is choking the remote peer and whether the remote peer is interested in downloading from the client. State changes should be notified as soon as they occur, using choke/unchoke and interested/not interested messages. To comply with requirement 3, a maximum number of unchoked peers is established, i.e., the client may only upload to this number of peers. The default value in the reference client is four. To achieve requirement 2, the list of unchoked peers is re-evaluated every round. A round lasts 10 seconds by default, although some peer state changes may trigger a re-evaluation earlier. Requirements 4–6 are implemented by a mechanism called optimistic unchoking. Every third round, a random peer that is interested and choked by the client gets unchoked. This allows the client to find peers that offer better download rates, as it will discard the slowest peer in the next round, replacing it with the optimistically unchoked peer. The algorithm is slightly different in the seed phase, as the client no longer needs to download. The first implementations of the choke algorithm chose the fastest-downloading peers as the ones to unchoke. According to [6], starting with version 4.0.0 of the reference client, this criterion changed to favour peers that have been downloading for a shorter period of time, with higher upload rates used as a tie-breaker. With this new implementation, the seed bandwidth is shared more equally among the peers. The authors show that the older version allowed a high-download-capacity client to monopolise all the resources of one or more seeds even if that client was not sharing, thus permitting free riding.
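The leecher-phase round described above can be sketched as follows. This is a simplified illustration, not the reference implementation: the peer bookkeeping, the `rate` field and the slot constants are assumptions, and the seed-phase variant is not modelled.

```python
import random

MAX_UNCHOKED = 4        # reference-client default number of upload slots
OPTIMISTIC_PERIOD = 3   # optimistic unchoke every third round

def choking_round(peers, round_no, current_optimistic=None):
    """One leecher-phase choking round (sketch).

    peers maps a peer id to {'rate': download rate obtained from that peer,
    'interested': whether the peer wants data from us}.
    Returns (ids to unchoke this round, optimistically unchoked id)."""
    interested = [p for p, s in peers.items() if s['interested']]
    # Tit-for-tat: prefer the peers we currently download fastest from.
    best = sorted(interested, key=lambda p: peers[p]['rate'], reverse=True)
    unchoked = set(best[:MAX_UNCHOKED - 1])  # keep one slot for the optimistic unchoke
    optimistic = current_optimistic
    # Re-pick the optimistic peer every third round, or if the old one left.
    if round_no % OPTIMISTIC_PERIOD == 0 or optimistic not in peers:
        candidates = [p for p in interested if p not in unchoked]
        optimistic = random.choice(candidates) if candidates else None
    if optimistic is not None:
        unchoked.add(optimistic)
    return unchoked, optimistic

# Example round: three interested peers, one uninterested fast peer.
peers = {'a': {'rate': 5.0, 'interested': True},
         'b': {'rate': 3.0, 'interested': True},
         'c': {'rate': 1.0, 'interested': True},
         'd': {'rate': 10.0, 'interested': False}}
unchoked, optimistic = choking_round(peers, round_no=1, current_optimistic='c')
```

Note that the uninterested peer `d` is never unchoked, regardless of its rate, mirroring the requirement that slots go to peers that actually want data.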
3 Traffic measurements

For the present work, a number of passive measurements were performed. A local computer was configured to join several real BT swarms as a peer, and a separate measurement computer was configured to capture packets. Additionally, an active probe was deployed to measure the total swarm size. This two-pronged approach provides both detailed message-level data and global swarm-level data. The measurements were made at the Department of Telecommunication Systems, Blekinge Institute of Technology (BTH), Karlskrona, Sweden.

3.1 Measurement infrastructure

The measurement infrastructure (shown in Fig. 2) was developed at BTH and has been used for measurements on BT and Gnutella systems [4]. The measurements reported in this paper were performed using an Endace DAG 3.5E card. This provides a hardware
Fig. 2 Measurement infrastructure
timestamp accuracy on the card of 60 ns, which is later reduced to 1 µs due to the timestamp format of the capture file. The DAG card contains a passive wiretap, which allows for the capture of all traffic traversing the measurement station. The clock on the DAG card is synchronized using GPS. The measurement system runs Debian GNU/Linux with kernel 2.4.31 (kernel 2.4, as opposed to the more modern version 2.6, is required due to compatibility issues with the DAG card). The measurement computer is equipped with a Pentium 4 2.0 GHz processor, 512 MB RAM, an 80 GB hard disk, a 10/100 Fast Ethernet network interface and the Endace DAG 3.5E card. An additional 250 GB USB 2.0 drive is used to store the packet traces. The system on which the BT client is executed runs Kubuntu GNU/Linux with kernel 2.6.12. It is likewise equipped with a Pentium 4 2.0 GHz CPU, 512 MB RAM, an 80 GB hard disk and a 10/100 Fast Ethernet network interface. The BT reference client version 4.0.4 for Unix-based systems, released on 17 August 2005, was used for all measurements.

3.2 Measurement details

Ten measurements were performed, spanning two weeks in late November 2005. To avoid copyright infringement, open-source GNU/Linux distributions as well as files under Creative Commons licenses were selected as the content for our measurements. Files were selected based on their popularity, size and kind of content, in order to cover a wide range of contents. Thus, for GNU/Linux distributions, we used three large distributions: one popular distribution (RedHat Fedora Core 4 for i386, DVD image) and two less popular distributions (RedHat Fedora Core 4 for x86_64, CD images, and RedHat Fedora Core 3 for i386, binary CD images). For non-Linux-related content we chose two medium-
Table 1 Content summary

| Content | Pieces | Size | Measurement |
|---|---|---|---|
| RedHat FC4 ‘Stentz’ i386 DVD Image | 10493 | 2.56 GB | 1, 3, 4, 9, 10 |
| RedHat FC4 ‘Stentz’ x86_64 source CD Images | 9721 | 2.37 GB | 2 |
| RedHat FC3 ‘Heidelberg’ i386 binary CD Images | 9414 | 2.29 GB | 5, 6 |
| Star Wreck: In the Pirkinning | 1083 | 541.38 MB | 7 |
| Best of Comfort Stand vol. 1 | 3072 | 767.87 MB | 8 |

Table 2 Measurement summary

| # | Records | Start | Duration | Comment |
|---|---|---|---|---|
| 1 | 3692866 | 2005-11-23 | 1 day, 23 hours | Upload bandwidth = 20 kB/s. |
| 2 | 559881 | 2005-11-25 | 2 days, 19 hours | Discarded. |
| 3 | 5142892 | 2005-11-28 | 10 hours | Interrupted due to insufficient disk space. |
| 4 | 9126772 | 2005-11-29 | 22 hours | Valid. |
| 5 | 1218861 | 2005-12-01 | 15 hours | Interrupted due to hard disk failure. |
| 6 | 2074139 | 2005-12-02 | 1 day, 3 hours | Valid. |
| 7 | 2182003 | 2005-12-03 | 1 day, 17 hours | Valid. Piece size = 512 kB. |
| 8 | 2208976 | 2005-12-05 | 2 days | Valid. |
| 9 | 1214856 | 2005-12-07 | 27 minutes | Discarded. BT client. |
| 10 | 1181475 | 2005-12-07 | 41 minutes | Discarded. Azureus client. |
size contents, one popular (Star Wreck: In the Pirkinning) and one less popular (Best of Comfort Stand vol. 1). Table 1 details the sizes and numbers of pieces of each content, as well as the associated measurements. The piece size was 256 kB for all measurements except measurement 7, which had a 512 kB piece size. Measurements 2, 9 and 10 failed self-consistency checks and were therefore discarded. The first two measurements were carried out with the default maximum upload bandwidth of 20 kB/s, to be able to assess whether this influenced the models. Measurements 3 and 5 were interrupted due to hard-disk-related problems, but were not discarded, since the messages captured before the interruption are still valid. Table 2 summarizes the measurement validity as well as the duration and number of records.
4 Modeling methodology

We use a modeling methodology similar to the one used in [3]. In short, this methodology is based on a three-step process:

Distribution selection. This step mainly entails discarding distributions that are clearly unsuitable as model candidates. This is typically performed by visual inspection of histograms, Empirical Distribution Functions (EDFs) and Complementary Cumulative Distribution Functions (CCDFs), as well as Hill plots and α-estimation plots.
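The visual selection step can be illustrated with a small helper that computes an empirical CCDF for log-log inspection; this is an illustrative sketch, not the authors' tooling. On log-log axes, an approximately linear tail suggests a Pareto-like candidate, while pronounced downward curvature points towards, e.g., a log-normal.

```python
def empirical_ccdf(samples):
    """Return (x, P(X > x)) pairs for an empirical CCDF, suitable for
    plotting on log-log axes during distribution selection."""
    xs = sorted(samples)
    n = len(xs)
    # After sorting, exactly n - (i + 1) samples exceed xs[i].
    return [(x, 1.0 - (i + 1) / n) for i, x in enumerate(xs)]

# Example: four samples step the survival probability down in quarters.
points = empirical_ccdf([3, 1, 4, 2])
```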
Parameter estimation. Once suitable candidate distributions have been selected, we employ Maximum Likelihood Estimation (MLE) to obtain parameter estimates for the distributions. For mixture distributions, we use successive right censoring of the data to locate the cutoff points between the component distributions.

Fitness assessment. The fitness of a given distribution and set of parameters is determined by both formal hypothesis tests and visual inspection using CCDF overplots and Quantile-Quantile (QQ) plots. Additionally, the Anderson-Darling (AD) statistic and an error percentage are used to provide a numerical basis for comparison. The reason for also using the AD statistic is that it is specifically designed to take into account errors in the tails of a distribution. Since two of the distributions we use for the modeling are long-tailed (the log-normal and Pareto distributions), the AD statistic is useful for assessing the fitness of an estimated set of parameters. However, the AD statistic tends to grow as the number of samples grows, which is why we also include the error percentage E% in our models. The error percentage E% is defined as

\[ E_{\%} = \frac{100}{n E_{\max}} \sum_{i=1}^{n} \left| U_i - \hat{U}_i \right| , \tag{1} \]

where U_i is the ith sample out of n samples drawn from a uniform distribution, Û_i is the ith order statistic of the investigated distribution, transformed to a uniform distribution using the estimated parameters, and E_max is defined as the maximum discrepancy that may occur from a true U(0, 1), i.e.,

\[ E_{\max} = \int_0^1 \sup\{U(x),\, 1 - U(x)\}\, dx = \frac{3}{4} , \tag{2} \]

where U(x) is the uniform distribution. For the purposes of this work, we use the informal degrees of fitness quality for E% in Table 3.

Table 3 Fitness quality boundaries

| E% ≈ | 0 | 1 | 2 | 3 | 4 |
|---|---|---|---|---|---|
| Degree | Excellent | Very good | Good | Fair | Poor |

The major difference to the work reported in [3] is that, for the models presented in this paper, the candidate models were already selected based on the previous results. While the distribution selection step was not omitted, it was made substantially less challenging, as the candidate distributions were already known.

4.1 Model distributions

The characteristics reported in Sect. 5 are modeled using three distinct distributions and mixtures thereof, namely the binary hyper-exponential, log-normal and generalised Pareto distributions. The binary hyper-exponential probability density function is defined as

\[ H_2(x) = p \lambda_1 e^{-\lambda_1 x} + (1 - p) \lambda_2 e^{-\lambda_2 x} , \tag{3} \]

where λ1 and λ2 are the rates of the two exponentials and p is the mixing weight. The log-normal probability density function is given in (4), where μ is the mean and σ_LN the standard deviation:

\[ f(x) = \begin{cases} \dfrac{1}{x\sqrt{2\pi\sigma_{LN}^2}}\, e^{-(\ln x - \mu)^2 / 2\sigma_{LN}^2} & x > 0,\ \sigma_{LN} > 0, \\ 0 & \text{otherwise.} \end{cases} \tag{4} \]

The generalised Pareto probability density function is defined as

\[ f(x) = \begin{cases} \dfrac{1}{\beta}\left\{1 + \alpha \dfrac{x - \mu}{\beta}\right\}^{-\frac{1}{\alpha} - 1} & x \ge \mu \text{ if } \alpha \ge 0, \quad \mu \le x \le \mu - \dfrac{\beta}{\alpha} \text{ if } \alpha < 0, \\ 0 & \text{otherwise,} \end{cases} \tag{5} \]

where β is the scale, α the shape and μ the location parameter.

5 Session characteristics
In this section, we present models and analysis of key BT session characteristics, i.e., session inter-arrival time, size and duration.

5.1 Previous results

One of the main motivations behind the work presented in this paper is to ascertain whether any invariants can be found in BT traffic patterns. To this end, we performed a measurement study in 2005, in which several interesting traffic characteristics were found [2, 3]. Important results from this study are as follows. For convenience, we include the model parameters as well as the error percentage E% (Sect. 4) in Tables 4–6. The estimated parameters are denoted with a hat (ˆ), and σ̂X denotes the standard deviation of the associated estimated parameter X. For instance, σ̂σLN in Table 5 is the standard deviation of the estimate of the σLN parameter.

• BT session inter-arrival times are well modeled by a second-order hyper-exponential distribution.
• BT session sizes and durations are reasonably well modeled by the log-normal distribution.

The reason for measurements 4–10 missing from Tables 5 and 6 is that, after the censoring applied to the data, there was not enough data left to model them. The same censoring was also employed for the models presented in this paper. The models for session sizes and durations are reported in Tables 5 and 6, respectively. Only the sessions that actually receive data have been modeled. Log-normal distributions with parameters μ and σLN have been used for the modeling. The second to fifth columns show the estimated parameters, together with the associated estimated standard deviations, for which the best value of E% was obtained. The value of E% is given in column 8. The sixth column indicates the tail probability mass for which the fitting passed the 5% fitness limit of E%, while the seventh column shows the tail probability mass for which the best value was obtained. Column 9 shows the significance levels obtained in the Anderson-Darling test.
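The error percentage used throughout these tables can be computed as in the following sketch. It assumes the U_i are taken as uniform plotting positions, an assumption on our part, since the paper does not spell out this detail; the helper names are likewise illustrative.

```python
import math

E_MAX = 0.75  # maximum discrepancy from a true U(0, 1), Eq. (2)

def error_percentage(samples, cdf):
    """E% of Eq. (1): normalised mean absolute discrepancy between the
    probability-integral-transformed order statistics and uniform positions."""
    n = len(samples)
    u_hat = sorted(cdf(x) for x in samples)   # transformed order statistics
    u = [(i + 0.5) / n for i in range(n)]     # uniform plotting positions (assumed)
    return 100.0 / (n * E_MAX) * sum(abs(a - b) for a, b in zip(u, u_hat))

# Sanity check: exponential quantiles evaluated against the matching
# exponential CDF should give an error percentage of essentially zero.
rate = 0.5
cdf = lambda x: 1.0 - math.exp(-rate * x)
data = [-math.log(1.0 - (i + 0.5) / 1000) / rate for i in range(1000)]
```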
While several other characteristics were presented in the mentioned studies, the primary goal of this paper is to analyze the session characteristics and compare them with those of the new study described in Sect. 3. Further results from the new study can be found in [10].

Table 4 Fitted hyper-exponential parameters

| Measurement | λ̂1 | σ̂λ1 | λ̂2 | σ̂λ2 | p̂ | σ̂p | E% |
|---|---|---|---|---|---|---|---|
| 1 | 0.0593 | 0.0046 | 0.1696 | 0.0085 | 0.2215 | 0.0467 | 2.07367 |
| 2 | 0.1158 | 0.0009 | 0.7556 | 0.0279 | 0.7936 | 0.0066 | 0.41535 |
| 3 | 0.0566 | 0.0006 | 0.3653 | 0.0099 | 0.6575 | 0.0077 | 0.49009 |
| 4 | 0.5372 | 0.0178 | 0.0168 | 0.0002 | 0.2533 | 0.0052 | 2.79455 |
| 5 | 0.5538 | 0.0212 | 0.0162 | 0.0002 | 0.2156 | 0.0052 | 2.79722 |
| 6 | 0.4798 | 0.0174 | 0.0127 | 0.0002 | 0.2879 | 0.0060 | 3.93588 |
| 7 | 0.4188 | 0.0143 | 0.0052 | 0.0001 | 0.3014 | 0.0076 | 2.05430 |
| 8 | 0.5142 | 0.0113 | 0.0168 | 0.0002 | 0.4252 | 0.0050 | 2.79291 |
| 10 | 0.5581 | 0.0205 | 0.0128 | 0.0002 | 0.3276 | 0.0064 | 3.7641 |
| 11 | 0.0140 | 0.0009 | 0.0802 | 0.0005 | 0.0219 | 0.0024 | 2.20763 |
| 12 | 0.0935 | 0.0004 | 5.8224 | 0.1380 | 0.8252 | 0.0021 | 3.84606 |
| 13 | 0.0563 | 0.0004 | 0.4175 | 0.0065 | 0.5897 | 0.0048 | 1.87389 |

Table 5 Upstream size parameters

| Measurement | μ̂ | σ̂μ | σ̂LN | σ̂σLN | Pass | Tail mass | E% | AD sign. |
|---|---|---|---|---|---|---|---|---|
| 1 | 18.7 | 0.04 | 0.62 | 0.02 | 0.45 | 0.21 | 2.1 | > 0.25 |
| 2 | 17.8 | 0.04 | 0.99 | 0.03 | 1 | 0.4 | 2.9 | > 0.025 |
| 3 | 18.4 | 0.04 | 0.60 | 0.02 | 1 | 0.24 | 3.3 | > 0.05 |
| 11 | 14.1 | 0.06 | 2.44 | 0.04 | 1 | 0.99 | 2.4 | ≈ 0.001 |
| 12 | 13.6 | 0.05 | 2.36 | 0.04 | 0.86 | 0.74 | 3.4 | < 0.001 |
| 13 | 19.0 | 0.03 | 0.69 | 0.02 | 1 | 0.17 | 3.0 | > 0.025 |

Table 6 Duration parameters

| Measurement | μ̂ | σ̂μ | σ̂LN | σ̂σLN | Pass | Tail mass | E% | AD sign. |
|---|---|---|---|---|---|---|---|---|
| 1 | 8.55 | 0.03 | 1.08 | 0.02 | 1 | 0.74 | 2.2 | ≈ 0.01 |
| 2 | 8.16 | 0.04 | 1.33 | 0.03 | 1 | 0.99 | 1.5 | > 0.15 |
| 3 | 8.17 | 0.04 | 1.38 | 0.02 | 1 | 0.98 | 1.6 | > 0.05 |
| 11 | 8.09 | 0.04 | 1.56 | 0.03 | 1 | 1 | 2.4 | > 0.001 |
| 12 | 7.2 | 0.03 | 1.57 | 0.02 | 1 | 1 | 3.9 | 0.001 |
| 13 | 7.94 | 0.03 | 1.52 | 0.02 | 1 | 1 | 2.3 | < 0.001 |

5.2 Session statistics summary

In Table 7 we present summary session statistics. No aggregation of sessions based on peer_id is done, i.e., if a peer closes the connection with our client and starts a new session afterwards, this is considered to be two different sessions. The minimum session length and size are 0 seconds and 68 bytes, respectively, for all measurements. This corresponds to a session containing only an interrupted handshake (HSK), i.e., an upstream or downstream HSK message (the size of which is 68 bytes) with no reply. Measurement 1 shows very low values, since the maximum upload bandwidth for this measurement was 20 kB/s.

5.3 Session inter-arrival times

Binary hyper-exponential distributions were found to be good candidates to model session inter-arrival times. The
measurements have been analysed considering up to 99% of the probability mass, except for measurements 3 and 4, for which the whole dataset was used, in order to find distributions that better model the considered data. This censoring is considered safe, as very large inter-arrival times do not stress the network, while modelling the lowest values well is critical. Furthermore, for measurement 8, seven hours of data had to be censored in order to obtain a better fit. This is justified because during the censored period the tracker was off-line, which means that inter-arrival times increased during that specific period. It was not possible to model the period when the tracker was down, because fewer than 200 samples were available.
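One of the stated goals of such parsimonious models is to drive simulation studies. As a sketch, session inter-arrival times can be generated from a fitted binary hyper-exponential of Eq. (3); the parameter values below are the measurement 7 fit from Table 8, while the function and variable names are our own.

```python
import random

def h2_sample(p, lam1, lam2, rng):
    """One draw from the binary hyper-exponential of Eq. (3): with
    probability p the draw is Exp(lam1), otherwise Exp(lam2)."""
    return rng.expovariate(lam1 if rng.random() < p else lam2)

# Session inter-arrival times for measurement 7 (Table 8 parameters).
rng = random.Random(1)
iat = [h2_sample(0.9666, 0.1792, 2.0132, rng) for _ in range(100_000)]
mean = sum(iat) / len(iat)
# Theoretical mean: p/lam1 + (1 - p)/lam2, roughly 5.41 seconds here.
```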
Table 7 Session and peer summary (session lengths in seconds, sizes in MB)

| # | Sessions | Length mean | Length max | Length std | Down. mean | Down. max | Down. std | Up. mean | Up. max | Up. std | Size mean | Size max | Size std |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 47090 | 136 | 84947 | 1482 | 0.0586 | 579.79 | 4.18 | 0.0725 | 100.63 | 1.06 | 0.1311 | 580.5 | 4.38 |
| 3 | 12273 | 141 | 37759 | 1233 | 0.2228 | 2247.97 | 20.30 | 2.279 | 2276.83 | 48.7 | 2.502 | 2279.3 | 52.81 |
| 4 | 23474 | 185 | 59220 | 1634 | 0.1184 | 816.84 | 6.93 | 2.325 | 2373.71 | 51.68 | 2.443 | 2376.3 | 52.20 |
| 5 | 997 | 333 | 43510 | 2265 | 2.427 | 426.53 | 22.99 | 5.918 | 1527.77 | 68.63 | 8.345 | 1529.4 | 79.30 |
| 6 | 1635 | 267 | 40519 | 2245 | 1.484 | 477.47 | 19.65 | 7.869 | 2053.13 | 98.24 | 9.355 | 2284.5 | 104.8 |
| 7 | 26491 | 57 | 63041 | 705 | 0.0219 | 186.23 | 1.41 | 0.4295 | 441.52 | 7.18 | 0.4515 | 442.0 | 7.33 |
| 8 | 7113 | 104 | 30839 | 894 | 0.1134 | 746.7 | 8.85 | 2.121 | 708.26 | 23.55 | 2.235 | 747.5 | 25.18 |

Table 8 MLE fitted hyper-exponential parameters

| # | λ̂1 ± σ̂λ1 | λ̂2 ± σ̂λ2 | p̂ ± σ̂p | E% | AD |
|---|---|---|---|---|---|
| 1 | 0.2625 ± 0.0020 | 1.1953 ± 0.0735 | 0.8947 ± 0.0079 | 0.5759 | 12.37 |
| 3 | 0.0667 ± 0.0185 | 0.3104 ± 0.0031 | 0.0034 ± 0.0017 | 2.4940 | 32.46 |
| 4 | 0.2982 ± 0.0026 | 1.7022 ± 0.3389 | 0.9740 ± 0.0072 | 0.8918 | 20.15 |
| 5 | 0.0102 ± 0.0006 | 7.7835 ± 0.4598 | 0.5150 ± 0.0198 | 3.1503 | 5.98 |
| 6 | 0.0082 ± 0.0003 | 6.5876 ± 0.3176 | 0.4855 ± 0.0151 | 4.8206 | 18.77 |
| 7 | 0.1792 ± 0.0015 | 2.0132 ± 0.5672 | 0.9666 ± 0.0071 | 0.2994 | 2.45 |
| 8 | 0.0233 ± 0.0004 | 10.9804 ± 0.9083 | 0.8872 ± 0.0058 | 4.7376 | 37.04 |
The conclusion is that the hyper-exponential distribution fits the empirical data well, as reported in Table 8. However, for measurements 5, 6 and 8, the error percentage is rather large. This is likely because there is not enough data to obtain a well-fitting model. For measurements 5 and 8, better fits are obtained using the minimum distance method, reducing the error percentages to 2% in both cases. There are several measurements with low E%, but only two measurements yield good results for the AD test. The obtained errors are likely caused by a bad fit in the tail, which is not critical for network traffic. Another possible explanation is the existence of spikes in the histograms.

5.4 Session duration and size

In this section, the modelling results for the upstream size and duration of remotely initiated sessions during the seed phase are reported. We observe that their correlations vary greatly between measurements, as shown in Table 9.

Table 9 Correlation coefficients for session duration and upstream size

| Measurement | 1 | 3 | 4 | 5 | 6 | 7 | 8 |
|---|---|---|---|---|---|---|---|
| Seed phase ρxy | 0.82 | 0.28 | 0.19 | 0.66 | 0.87 | 0.18 | 0.29 |

This result can be explained by the different natures of the swarms our client is connected to. For measurements that show low correlation, we observe the mice-and-elephants effect, i.e., some long sessions receive little data and some short sessions receive large amounts of data [7]. This effect is shown in Fig. 3b. Only sessions that last more than 1 second and send more than a handshake are considered in Fig. 3. The black dots represent sessions that receive at least one piece and are remotely initiated during the seed phase. The green dashed lines show
the average session duration and size for the sessions represented by black dots. The red dots represent the remaining sessions initiated during the seed phase, whereas the blue dots represent sessions initiated during the leech phase. The light blue dashed line represents the duration of the leech phase. Although upper and lower clusters can be discerned in the plot, representing mice and elephants as reported in [2], we observe that another, clearer cluster appears to the left of the duration mean line. We hypothesise that this cluster is a result of the new version of the choke algorithm for the seed phase, as discussed in Sect. 2.4. The cluster represents many peers receiving similar amounts of unchoked time and data from our client. Our previous results with the old version of the client (Fig. 4b) and our results with the new version confirm the claim made in [6] that the old algorithm tended to favour peers with high download bandwidth (a more pronounced mice-and-elephants effect), while the new one shares seed resources more evenly. For measurements 5 and 6, the high correlations are due to a low number of simultaneous sessions in an unpopular swarm: the seed can upload to peers as much as they need, so duration and size are closely related through the upload rate. For
Fig. 3 Session size-duration scatter plot
Fig. 4 Session size-duration scatter plot
measurement 1, the content is quite popular and many peers connect to our client during the seed phase. However, our client has a limited upload bandwidth, which is clearly reflected in the straight upper bound visible in the scatter plot in Fig. 3a, as well as in the high correlation observed. It is interesting to compare both figures, as our client connected to the same swarm in both measurements. It can be deduced that an upload bandwidth limit together with the new choke algorithm makes the sharing of seed resources among the leechers fairer. The reason for modelling only remotely initiated sessions during the seed phase is the same as for session inter-arrival times: the leech phase is too short to be modelled, and including both phases would give a mixed distribution, as the client behaviour differs between phases. By not taking locally initiated sessions into account, many sessions that contributed greatly to traffic were left out; in Fig. 3 these sessions are represented by the red dots. However, considering only remotely initiated sessions during the seed phase makes it possible to use the models obtained for session size, duration and inter-arrival times together in simulation. Only sessions that receive at least one piece are modelled. The reason for this is twofold:

• To compare these results with previous results reported in Erman et al. [3] and Erman [2].
• Size and duration distributions of sessions receiving pieces are bound to be related to the choke algorithm, while this is not the case for the remaining sessions. Modelling all remotely initiated sessions together would therefore obscure that relationship. Additionally, these sessions do not add significantly to the total amount of data transmitted, making modelling them less relevant.

There is, however, a drawback: the number of sessions available after this censoring is significantly smaller for each measurement (Table 10). Measurements 5 and 6 are therefore left out of the analysis owing to their very small sample sizes, and we provide results for the other measurements, always considering the reduced number of sessions. A single distribution for both session duration and size is desirable. However, no single model, neither the Pareto nor the log-normal, was found to be a good fit for all the measurements. Instead, some of the measurements showed a clear mixture of models, i.e., they appear to be drawn from one distribution for values below a certain threshold and from another distribution above that threshold. Consequently, a censored mixture model consisting of a log-normal distribution for the body and a Pareto distribution for the tail was used. An optimization of a weighted sum of the error percentages for the body and the tail fit was
Table 10 Number of remotely initiated sessions during seed phase uploading at least 1 piece

Measurement                                  1      3      4      5      6      7      8
Sessions                                     1918   531    1175   5      12     524    303
% of total sessions                          4.07   4.33   5.00   0.50   0.73   1.98   4.26
% of remotely initiated during seed phase    4.37   4.80   5.33   0.76   1.05   2.01   8.16
Table 11 Fitted censored mixture log-normal Pareto parameters for session duration

     Log-normal                     Cutoff point   Generalised Pareto                  E%
#    μ̂ ± σ̂μ       σ̂ ± σ̂σ        P̂c ± σ̂Pc      α̂ ± σ̂α       β̂ ± σ̂β      x̂m     Body   Tail   EW
1    6.26 ± 0.01   0.35 ± 0.02    0.80 ± 0.02    0.71 ± 0.01   2837 ± 298   1294   1.65   3.33   1.99
3    6.16 ± 0.01   0.33 ± 0.01    0.74 ± 0.01    0.62 ± 0.15   1485 ± 240   945    1.45   1.84   1.56
4    6.24 ± 0.01   0.36 ± 0.02    0.85 ± 0.01    0.55 ± 0.14   3308 ± 501   1619   2.17   4.18   2.48
7    6.82 ± 0.01   1.37 ± 0.02    Single         0.47 ± 0.07   1149 ± 88    11     1.36   1.07   –
8    3.82 ± 0.01   1.09 ± 0.01    0.73 ± 0.02    0.37 ± 0.06   2166 ± 393   359    5.34   2.94   4.69
Fig. 5 Modelling results for session duration during seed phase
performed to obtain the best cutoff point c for each measurement. The cutoff point value is given as the probability Pc, so the weighted error is defined as:

EW = Ebody · Pc + (1 − Pc) · Etail,
Ebody = E% | Ûi = F̂(Xi ≤ c; Θ̂),        (6)
Etail = E% | Ûi = F̂(Xi > c; Θ̂),

where F̂(·) represents the probability distribution function with estimated parameter vector Θ̂ and measured samples Xi. The body and tail probability masses, the errors and the estimated parameters were obtained using left and right censoring before optimization.
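The cutoff selection implied by Eq. (6) can be sketched as a grid search over candidate cutoff probabilities. The sketch below is an illustration under stated assumptions, not the authors' exact procedure: it uses the mean absolute deviation between empirical and model CDFs as a stand-in for the paper's E% metric, and assumes the caller supplies model CDFs already censored (conditioned) to the body and tail regions.

```python
# Sketch of the cutoff selection behind Eq. (6): scan candidate
# cutoff probabilities P_c and keep the one minimising
#   E_W = E_body * P_c + (1 - P_c) * E_tail.
# The error metric (mean |F_emp - F_model|) is a stand-in for the
# paper's E% measure; body_cdf/tail_cdf are assumed to be model CDFs
# already censored to the body and tail regions.

def weighted_error(samples, p_c, body_cdf, tail_cdf):
    xs = sorted(samples)
    n = len(xs)
    cut = xs[min(int(p_c * n), n - 1)]      # cutoff value c at probability P_c
    body = [x for x in xs if x <= cut]
    tail = [x for x in xs if x > cut]

    def mad(data, cdf):                      # mean |empirical CDF - model CDF|
        m = len(data)
        return sum(abs((i + 1) / m - cdf(x))
                   for i, x in enumerate(data)) / m if data else 0.0

    return mad(body, body_cdf) * p_c + (1 - p_c) * mad(tail, tail_cdf)

def best_cutoff(samples, body_cdf, tail_cdf):
    grid = [i / 100 for i in range(50, 96, 5)]   # P_c candidates in [0.50, 0.95]
    return min(grid, key=lambda p: weighted_error(samples, p, body_cdf, tail_cdf))
```

In the full procedure, re-fitting the body and tail parameters at each candidate cutoff would precede the error evaluation.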
5.4.1 Session duration

For measurements 1, 3 and 4, only the censored mixture distribution gives a good fit (Fig. 5a). The single log-normal fits 50% of the probability mass well, but greatly underestimates the tail. The single Pareto does not fit the body and the tail at the same time. On the other hand, we obtained good fits for measurement 7 with both the single log-normal and the single Pareto, the latter giving a slightly better error estimate and fit for the tail (Fig. 5b). Measurement 8 showed a good mixture fit, but also an acceptable Pareto fit that overestimates the tail. Table 11 shows the parameters for the censored mixture distribution, except for measurement 7, where both single distributions are given instead. The different fit results for different swarm characteristics lead us to hypothesise that there is a relationship between the session duration distribution and the number of leechers in a swarm. Specifically, we believe that the clear mixture
Fig. 6 Session duration mixture model for old client version
Fig. 7 Session duration single distribution models for old client version
of log-normal and Pareto distributions for popular swarms is caused by the new choke algorithm. This is corroborated by the fact that the mixture of distributions was not observed in the 2005 measurement study [2], as shown in Figs. 6 and 7, which were obtained in our previous study with a client implementing the old version of the algorithm. Our claim is not refuted by the fact that measurement 7 shows a good fit with a single Pareto or log-normal, as observed in Fig. 5b, because in this case the number of leechers connected to our client was small and a large number of seeds was available. We believe that the log-normal behaviour of the body is caused by the new choke algorithm when many leechers compete for seed resources: in this case, the algorithm tends to unchoke all leechers for similar amounts of time, which is reflected in the cluster observed in Fig. 3 and in the log-normal shape of the body of the distribution. On the other hand, when fewer leechers compete to get content from a seed, the leecher determines the session duration, leading to a Pareto distribution, which reflects the mice and elephants effect.
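For simulation use, session durations can be drawn from a fitted censored mixture by inverse-transform sampling. The sketch below plugs in the measurement-1 parameters from Table 11; the generalised Pareto parameterisation (shape α, scale β, location x̂m), the use of x̂m as the body/tail split, and the rejection step for the body are simplifying assumptions rather than the paper's exact censoring procedure.

```python
# Sketch: sampling session durations from a censored log-normal body
# / generalised Pareto tail mixture, using the measurement-1
# parameters of Table 11. The GPD form (shape ALPHA, scale BETA,
# location X_M) and the body rejection step are assumptions.
import math
import random

MU, SIGMA = 6.26, 0.35                     # log-normal body parameters
P_C = 0.80                                 # probability mass in the body
ALPHA, BETA, X_M = 0.71, 2837.0, 1294.0    # generalised Pareto tail

def draw_duration(rng):
    """One session duration (seconds) from the mixture."""
    if rng.random() < P_C:
        while True:                        # body: log-normal, censored at X_M
            x = math.exp(rng.gauss(MU, SIGMA))
            if x <= X_M:
                return x
    u = rng.random()                       # tail: inverse GPD CDF above X_M
    return X_M + BETA * ((1.0 - u) ** (-ALPHA) - 1.0) / ALPHA

rng = random.Random(42)
durations = [draw_duration(rng) for _ in range(10_000)]
print("share of tail sessions:", sum(d > X_M for d in durations) / len(durations))
```

Roughly a fraction 1 − Pc of the drawn durations should fall in the Pareto tail above x̂m.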
Therefore, the cutoff point value depends on how scarce resources are in the swarm: when there are few seeds and many leechers, log-normal behaviour predominates and the cutoff point is high, while when many seeds and few leechers are present, a Pareto distribution is expected. In the latter case, a log-normal distribution might also be a good fit, probably because of the limited amount of data a peer wants to obtain in a swarm, which sets an upper bound on the session duration. The upload bandwidth of the seed must also be considered, as leechers compete harder for resources from seeds that provide better download rates.

5.4.2 Session upstream size

Although a good result using censored mixture distributions was expected, this proved not to be the case. Instead, we observed that both the single log-normal and the single Pareto distribution are acceptable fits for most cases, the latter being a better fit for the tail and in most cases providing better error estimates, as shown in Table 12. There are three complementary factors that explain why the upstream size results differ from the session duration results and from those previously reported:
Table 12 Fitted single log-normal Pareto parameters for session upstream size

     Log-normal                             Generalised Pareto
#    μ̂ ± σ̂μ        σ̂ ± σ̂σ       E%     α̂ ± σ̂α       β̂ ± σ̂β         x̂m ± σx̂m      E%
1    12.61 ± 0.03   1.44 ± 0.02   3.11   0.80 ± 0.03   284860 ± 272   17791 ± 162   2.98
3    13.25 ± 0.08   1.93 ± 0.06   2.33   1.32 ± 0.09   422080 ± 426   17788 ± 704   3.34
4    12.93 ± 0.06   2.04 ± 0.04   4.39   1.65 ± 0.07   218870 ± 451   17787 ± 213   2.02
7    13.89 ± 0.11   2.43 ± 0.08   3.37   2.22 ± 0.13   480162 ± 623   16611 ± 844   5.21
8    13.52 ± 0.13   2.19 ± 0.09   5.66   1.77 ± 0.15   416606 ± 393   16859 ± 213   2.47
Fig. 8 Comparison of mixture model for upstream sizes during seed phase for measurement 1
• While there was no left censoring of session durations, there is censoring of the upstream session size, as we only consider sessions with more than one piece. In addition, the lower body of the data is not as continuous as the duration data, since only completed pieces can be accounted for and the values are roughly multiples of the piece size. In fact, when the data is further censored to include only session sizes over ten pieces, an excellent fit for the censored mixture model is obtained for measurement 1 (Fig. 8).
• Session durations are tightly linked to the unchoke time of each peer, but the upstream size depends on the bandwidth of the seed-leecher channel. Moreover, leechers with better upload rates are favoured when all the leechers have been unchoked for the same amount of time. This explains why most of the measurements look like single Pareto distributions: peers with a large amount of available bandwidth are able to download more data during the same unchoke time. Measurement 1, in contrast, shows a clear censored mixture distribution, which further supports our assumption: for measurement 1, bandwidth is limited and sessions therefore receive more equal amounts of data.
• Another important factor is that the maximum session size is limited by the content size. The reasoning is as follows. Consider the largest possible upstream size below the threshold set by the content size. In a measurement that contains few sessions and spans a short amount of time, such a size is unlikely to occur. If a large session does occur, the distribution looks much more heavy-tailed than if the same session size occurred in a measurement that lasts longer and contains many more sessions, since in the longer measurement the probability of similarly sized (large) sessions is higher. This would explain why our previous results, whose measurements lasted longer and contained more sessions, showed no heavy-tail behaviour (Fig. 9b), while we do observe that behaviour here. Measurement 7 also supports this assumption: the content size is 540 MB and the measurement spans a day and a half, and it can be observed in Fig. 9a that the Pareto tends to overestimate the tail.

To summarise, the session upstream size distribution shows heavy-tail behaviour as long as the upper limit set by the content size does not come into play. The log-normal behaviour of the body observed for session durations is likely to appear in session sizes when upload bandwidth is a constraint.
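The content-size argument above can be illustrated with a toy simulation (all parameters are illustrative, not fitted values): Pareto-distributed session sizes are capped at the content size, and while a short measurement with few sessions rarely reaches the cap and so looks heavy-tailed, a long measurement accumulates sessions at the cap and the empirical tail is visibly truncated.

```python
# Toy illustration of the content-size ceiling: session sizes follow
# a Pareto distribution but cannot exceed the content size. Short
# measurements rarely reach the cap; long ones pile sessions up at
# the cap, truncating the tail. All parameters are illustrative.
import random

CAP = 540 * 2**20            # 540 MB content size, as in measurement 7

def capped_pareto_sizes(n, rng, alpha=0.8, x_m=17_000):
    """n session sizes from Pareto(alpha, x_m), capped at the content size."""
    return [min(x_m * (1.0 - rng.random()) ** (-1.0 / alpha), CAP)
            for _ in range(n)]

rng = random.Random(1)
short = capped_pareto_sizes(300, rng)       # short measurement, few sessions
long_ = capped_pareto_sizes(30_000, rng)    # long measurement, many sessions

print("capped sessions (short):", sum(x >= CAP for x in short))
print("capped sessions (long): ", sum(x >= CAP for x in long_))
```

With many more sessions, the long measurement is far more likely to contain sessions sitting at the cap, which is where the empirical distribution departs from the fitted Pareto tail.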
Fig. 9 Single Pareto and log-normal fits for upstream size
6 Conclusions

The main motivation for this paper is to corroborate previously obtained models for key BT characteristics in [2, 3]. A different measurement approach was taken in order to verify whether the models hold when these characteristics are inspected at the network layer. This required using specialised packet capture hardware together with the BT parsing software previously developed at BTH. Several measurements were performed on different BT networks with varying intrinsic characteristics. We report results regarding the important BT traffic characteristics of session duration, size and inter-arrival times. Our previous results for session inter-arrival times are confirmed, showing a hyper-exponential distribution. On the other hand, discrepancies were found concerning session sizes and durations, which are mainly attributed to the different implementations of the choke algorithm during the seed phase. A censored mixture of log-normal and Pareto distributions was found to fit session durations and sizes under certain circumstances. The log-normal behaviour of the body is closely related to the new choke algorithm and only appears when leechers compete for seed resources. This behaviour also carries over to session size as long as upload bandwidth has to be shared among leechers. In swarms where seeds outnumber leechers, single long-tailed distribution models are observed. We believe that the absence of true heavy-tail behaviour reported in previous results, and observed in some of our measurements, is due to the content size limit rather than to BT systems behaving well.
References

1. Cohen, B. (2005). BitTorrent protocol specification.
2. Erman, D. (2005). BitTorrent traffic measurements and models. Licentiate thesis, Blekinge Institute of Technology.
3. Erman, D., Ilie, D., & Popescu, A. (2005). BitTorrent session characteristics and models—extended version. To appear in COMCOM special journal issue dedicated to "Performance Modelling and Evaluation of Heterogeneous Networks HET-NETs'05 and HET-NETs'06".
4. Ilie, D., & Erman, D. (2007). Peer-to-peer traffic measurements (Technical Report No. 2007:02). Blekinge Institute of Technology.
5. Izal, M., et al. (2004). Dissecting BitTorrent: Five months in a torrent's lifetime. In Passive and active measurements (PAM 2004).
6. Legout, A., Urvoy-Keller, G., & Michiardi, P. (2005). Understanding BitTorrent: An experimental perspective (Technical report). INRIA Sophia Antipolis/INRIA Rhône-Alpes—PLANETE INRIA France, EURECOM—Institut Eurecom.
7. Paxson, V., & Floyd, S. (1997). Why we don't know how to simulate the Internet. In Winter simulation conference (pp. 1037–1044).
8. Pouwelse, J. A., et al. (2005). The BitTorrent P2P file-sharing system: Measurements and analysis. In 4th international workshop on peer-to-peer systems (IPTPS'05).
9. Qiu, D., & Srikant, R. (2004). Modeling and performance analysis of BitTorrent-like peer-to-peer networks (Technical report). University of Illinois at Urbana-Champaign, USA.
10. Saavedra Juan, D., & Sánchez González, J. Á. (2006). BitTorrent traffic model extensions. Master's thesis, Blekinge Institute of Technology.
11. Saroiu, S., Gummadi, P. K., & Gribble, S. D. (2002). A measurement study of peer-to-peer file sharing systems. In Proceedings of multimedia computing and networking (MMCN).
12. Sen, S., & Wang, J. (2004). Analyzing peer-to-peer traffic across large networks. IEEE/ACM Transactions on Networking, 12(2), 219–232.
David Erman received his B.Sc., M.Sc. and Ph.D. degrees at Blekinge Institute of Technology (BTH). He is currently employed as a research assistant at BTH, where he is working on mobility management and media distribution. His main research interests are distributed systems, IP networking, P2P systems and performance analysis of heterogeneous systems. He is a member of the IEEE, ACM and ACM SIGCOMM.
Daniel Saavedra received his M.Sc. degree in telecommunication engineering at the Technical University of Madrid (UPM). He wrote his Master's thesis at Blekinge Institute of Technology (BTH). He is currently employed as a special projects engineer at Visual Tools, where he is working on video-enhanced control systems. His main research interests are video analytics, IP networking, distributed systems and P2P protocols.
José Á. Sánchez González received his M.Sc. degree in telecommunications engineering at the Polytechnical University of Madrid (UPM), writing his Master's thesis at the Blekinge Institute of Technology (BTH) in 2006. He is currently employed as a research engineer at UPM, working on speech signal processing and software development.
Adrian Popescu received two Ph.D. degrees in electrical engineering, one from the Polytechnical Institute of Bucharest, Romania, in 1985 and another from the Royal Institute of Technology, Stockholm, Sweden, in 1994. He also holds the Docent degree in computer science from the Royal Institute of Technology, Stockholm, Sweden (2002). He is a Professor in the Department of Telecommunication Systems, Blekinge Institute of Technology, Karlskrona, Sweden, and an area editor of the Computer Networks (Elsevier) journal. His main research interests include the Internet, communication architectures and protocols, overlay routing, seamless handover, and traffic measurement, analysis and modelling. He is a member of the IEEE, IEEE CS, ACM and ACM SIGCOMM.