Peer-to-Peer Netw. Appl. (2014) 7:86–99 DOI 10.1007/s12083-012-0185-z

A dynamic coalition formation game for search efficient adaptive overlay construction in unstructured peer-to-peer networks Arezou Soltani Panah & Siavash Khorsandi

Received: 30 January 2012 / Accepted: 7 November 2012 / Published online: 20 December 2012 © Springer Science+Business Media New York 2012

Abstract A great number of recent works deal with improving search in peer-to-peer systems, specifically by clustering peers into semantic groups. When the clustering process is predetermined and static, it cannot adapt to highly dynamic peer-to-peer environments. We model the problem as a non-superadditive coalition game with a non-transferable utility characteristic function, and propose a distributed dynamic coalition formation algorithm, based on a myopic best-reply rule with experimentation, to solve the coalition formation problem. Coalitions are formed by peers with similar interests, taking geographical proximity into account. The overlay network is dynamically reconfigured over time as the interests or locations of individual peers change. The convergence of the proposed algorithm is studied using the “core solution” concept. The simulation results show that the proposed algorithm can efficiently reduce the search time, although the overhead of the overlay adaptation is slightly higher.

Keywords Peer-to-peer network · Overlay network · Coalition formation game · Non-transferable characteristic function · Non-superadditive · Overlapping coalitions

A. Soltani Panah · S. Khorsandi (*)
Department of Computer Engineering and Information Technology, Amirkabir University of Technology, 424 Hafez Ave, 15875, Tehran, Iran
e-mail: [email protected]
A. Soltani Panah e-mail: [email protected]

1 Introduction

Peer-to-peer architectures evolve through organizing the participating nodes in an overlay network, that is, a network in which the nodes represent the processes and the links represent the possible communication channels. Different overlay networks offer different features in searching for data items, routing mechanism, performance and scalability, to name but a few. The most popular unstructured peer-to-peer systems such as Gnutella are not highly scalable or efficient, mainly because of the peers’ random inter-connections. It has been shown that in Gnutella only 7–10 % of the queries are successful in returning useful contents [1]. To solve this problem, structured overlays have been proposed that provide a mapping between contents and the location of peers through a DHT-based mechanism. Although structured peer-to-peer schemes are efficient for locating files, they do not support semantic queries and are hard to maintain in environments where nodes join and leave the network at a high rate. More recently, clustering approaches have been proposed for capturing relations between peers, but most of them, such as [2], use a predefined classification, which leads to a static configuration that can neither adapt to changes in peer interests nor recover from a wrong classification.

In this paper we propose an overlay network that improves search performance by grouping peers with similar interests into coalitions. This grouping mechanism is not a single-shot process: it continues over time and hence can capture changes. For this purpose, we focus on dynamics inspired by evolutionary games. Each peer is represented by an interest vector which is updated based on recent queries. To prevent redundant traffic in the underlying physical network and extra delay in message delivery, coalition formation takes the geographical proximity of peers into account. Our main contributions are as follows:

• Each peer is modeled as a player that seeks to maximize its payoff in the network by establishing connections to similar peers. To quantify the similarity between peers, a non-transferable characteristic function is derived.
• There is no inherent superadditivity assumption in our work. Non-superadditivity means that the coalition including all peers in the network, namely the grand coalition, is not optimal and hence does not emerge.
• Unlike most existing coalition formation models, we allow for overlapping coalitions, which leads to an increase in search performance.
• We have extended the base game formulation proposed in [3] to adopt non-transferable utility, making it suitable for our problem. Previous non-transferable utility coalition games were not customized for overlay construction or search-optimized P2P networks.
• Our algorithm adapts to changes in the environment. Peer-to-peer networks are strongly dynamic: peers may join or leave the network at any time for various reasons, and not only the population of participating peers but also their interests may change over time. As a result, we allow overlay connections to be dynamically reconfigured.
• Finding the optimal coalition structure is generally an NP-hard problem, and the number of possible coalition structures equals the Bell number in the case of disjoint coalitions [4, 5]. Overlapping coalitions further increase the complexity. Heuristic algorithms rely on “random experiments” to approach the core solution, but suffer from slow convergence. We have replaced random experiments with a new systematic approach based on a genetic algorithm for detecting blocking coalitions, which considerably reduces the convergence time.
• Finally, a formal proof is provided that the proposed coalition formation algorithm converges to its “core solution”. So, given that all peers make rational decisions based on this algorithm, convergence to the desired overlay is guaranteed. The most important difference between our work and others’ is that we can guarantee that the optimal solution will be found.

Using dynamic coalition formation game theory, we provide a new formulation of the search problem in P2P networks through an analytical model for finding the optimal clustering.

This paper is organized as follows. In Section 2, we review related works. Section 3 presents the formal statement of the problem and the related game models. The dynamic coalition formation algorithm for overlay construction is described in Section 4. In Section 5, simulation results are presented. Finally, Section 6 draws concluding remarks and outlines future work.

2 Related works

In an effort to decrease search time and network traffic in peer-to-peer networks, DHT-based overlays like Chord have been proposed that provide a mapping between content and the location of peers [6, 7]; however, these overlays are not suitable for environments where nodes join and leave the network at a high rate. Also, hash functions are designed only for exact queries and are not suitable for search based on partial information. Unstructured peer-to-peer networks do not suffer from these problems but demonstrate low search efficiency. There have been continuous efforts to improve the search efficiency in unstructured networks, such as random walk [8], iterative deepening [9], and adaptive probabilistic search [10].

Many other works aim to increase search efficiency by improving overlay construction. Broadly speaking, they can be classified into three categories, namely overlay construction based on a) a logical criterion, b) a semantic criterion, and c) a physical criterion. In the first category, connections between peers are established based on a logical criterion. In [11], links are established among trusted peers. In [12] and [13], peer connections are replaced with more profitable connections according to a defined criterion. In [12], connections are set based on the degree of connectivity among the nodes. In [13], connections are created based on the history of the query hits, which implicitly captures temporal locality.

In the second category, semantic locality is exploited for overlay construction. The Semantic Overlay Network (SON) proposed by Crespo et al. groups peers with similar documents according to a predefined classification hierarchy [2]. In [14] a clustering method using semantic vectors is suggested that first categorizes data items according to a semantic space and then clusters peers based on their data items. The main drawback of these schemes is their static configuration, which cannot adapt to the dynamic nature of peer-to-peer networks and cannot recover from a wrong classification. A few recent studies have addressed this problem.
In [15] a simple dynamic cluster construction is proposed, in which a cluster can be formed when access frequencies for certain data items increase. Although this method responds to the dynamic nature of the network to some extent, it can be shown that an optimal overlay cannot be formed, mainly because neither static nor dynamic clusters are ever deleted.

The third category focuses on the mismatch between the peer-to-peer overlay and the physical layer, which causes a large volume of traffic in the underlying network as well as extra delay in message delivery in the overlay network. Paper [16] is one of the first studies in this area; it partitions the network such that peers falling within a given portion are relatively close to each other in terms of network latency. In [17], a mathematical model for topology awareness of the overlay network is provided.

In recent years, dynamic peer-to-peer systems have attracted the attention of the research community. In [18], an extended version of adaptive probabilistic search, namely SPUN, is proposed that employs reinforcement learning to assign a probability value to a peer’s neighbors based on successful paths, reducing the uncertainty in the peer selection decision compared to APS. The authors in [19] propose an adaptive search algorithm for Gnutella-like networks, in which some peers periodically broadcast their new interest fields and, as a result, other peers change their interest group by recomputing similarity; however, if a peer’s query is not resolved in its groups, a blind search is performed in the external groups of the requesting peer. In [20] a caching protocol for hierarchical P2P networks is proposed that exploits the heterogeneity of nodes. Super-peers maintain semantic caches with indices to files requested by peers with similar interests. Leaf peers are said to be able to find the super-peers that guarantee the highest performance for their searches, but the authors do not provide any analytical justification for this claim. Paper [21] presents a novel synchronization technique for clustering distributed data using the K-Means algorithm with high clustering confidence that can adapt to a dynamic P2P network, although the algorithm has significant scalability constraints. The authors in [22] suggest an ant-inspired, self-organized search algorithm for improving rare information retrieval in P2P systems. There are also a few works which employ evolutionary game theory (EGT) to analyze the dynamic process of P2P networks; for instance, the authors in [23] propose a self-adaptive overlay by merging and splitting clusters and then apply EGT to optimize cluster sizes. Also, [24] presents an adaptive honeycomb P2P overlay with a failure-resistant property in structured networks, which is limited to a small number of nodes.

From a game theory point of view, there are a number of works applying coalition formation games in peer-to-peer networks, most notably the works reported in [25] and [26]. In [25] a coalition formation algorithm has been proposed for minimizing peers’ average download delay that converges to a Nash-stable partition.
In [26], a coalition formation process based on an incentive mechanism is provided with the purpose of discouraging free-loader behavior. We, however, apply a non-transferable utility coalition formation game to overlay construction, which has not been used before for this purpose. We construct an overlay based on interest and physical locality simultaneously, in order to improve the search mechanism in peer-to-peer networks. Unlike the mentioned algorithms, we formulate the coalition formation game to maximize peers’ similarities within the coalitions. We show that this leads to a lower search time. In an earlier work, we presented our formulation as a non-overlapping coalition formation game [27]. This paper includes formulations and results for both overlapping and non-overlapping formations.

3 Problem statement

Search for a requested content is performed through logical connections between peers, which build a network overlay on top of the underlying physical layer. Instead of establishing connections in an arbitrary manner, we aim to connect peers with similar interests. Therefore, overlay construction can be formulated as a maximization problem, i.e. gaining maximum similarity between peers in the constructed overlay. Assume there are $N = \{1, 2, \ldots, n\}$ peers in the network. Each non-empty member of the power set of $N$ is called a coalition, i.e. $S \in 2^N$ is a coalition. We denote the set of all formed coalitions in the network by $Co$, which is called a coalition structure. We assume a general case of non-disjoint coalitions, i.e. the coalition structure can include overlapping coalitions and each peer can join at most $m$ coalitions simultaneously. Let $Co_i = \{S \in Co \mid i \in S\}$ denote all coalitions that peer $i$ belongs to. Hence, we have $Co_i \subset Co$. The set of all possible coalition structures is denoted by $C$. The payoff allocation $\phi(Co) = \{\phi_i(S)\}_i$ is a vector of $\phi_i(S)$’s representing the utility of every peer $i$ in each coalition $S$ that it belongs to. The maximization problem can now be stated as follows:

$$\max_{Co \in C} \sum_{i \in N} \sum_{S \in Co_i} \phi_i(S)$$

Constraint 1 states the possible payoff values. For a peer $i$ and coalition $S$, $\phi_i(S) = 0$ if there is no similarity between peer $i$ and the other peers in $S$, and $\phi_i(S) = \sqrt{|S|}$ if they are perfectly similar. The reason why $\phi_i$ is proportional to the square root of $|S|$ is that we want to set a soft limit on the coalition size to prevent the payoff from growing linearly with it. The maximum value of $|S|$ is $n$ (in the case of the grand coalition); as a result $\phi_i(S) \in \mathbb{R} \cap [0, \sqrt{n}]$. Constraint 2 guarantees fairness of the solution by eventually arriving at a Pareto optimal point. In other words, there is no other solution in which the utility of a peer can be improved without decreasing the utility of another peer. To arrive at the coalition formation algorithm presented in Section 4, we first review relevant concepts from coalition formation game theory.

3.1 Coalition formation games

Coalition game theory is a useful tool for modeling agent cooperation in multi-agent systems. A coalition formation game is generally non-superadditive and comes in two types: static and dynamic coalition formation games. In the former, an external factor imposes a certain coalition structure, and the objective is to study this structure and its stability, so that no player has an incentive to deviate from the resulting coalition structure. The main objectives in dynamic coalition formation games are to analyze the formation of a coalition structure through the players’ interactions, as well as to study the properties of this structure and its adaptability to environmental variables or externalities [5].

From another perspective, coalition games are classified into two types based on how gains are apportioned among the players in a coalition: 1) a transferable utility (TU) game, where the total payoff can be divided in any manner among coalition members, and 2) a non-transferable utility (NTU) game, where there are rigid restrictions on the distribution of the utility. A TU coalition formation game is defined by $(N, v)$, in which $N$ is the set of players and $v$ is a real-valued characteristic function $v: 2^N \to \mathbb{R}$, defined on coalitions $S \subseteq N$. In an NTU game the value of a coalition is no longer a function over the real line, but a set of payoff allocations, i.e. $V(S) \subseteq \mathbb{R}^{|S|}$, where each element $x_i$ of a vector $x \in V(S)$ represents the payoff that player $i \in S$ can obtain within coalition $S$ given a certain strategy selected by $i$. If a group of peers can form a coalition that provides its members with larger similarity, this coalition will block the current payoff allocation. In other words, a coalition $S$ with allocation $y \in \mathbb{R}^{|S|}$ will block an allocation $x$ if $y_i > x_i\ \forall i \in S$. In this case, $S$ is called a blocking coalition.

3.1.1 Overlapping coalition formation games

Current works typically restrict coalition formation to disjoint coalitions. The notion of overlapping coalitions was introduced by Shehory and Kraus [28]. In comparison to standard coalition games with strict membership, overlapping coalition formation games allow agents to join several coalitions. An important class of overlapping coalition games is fuzzy games. In fuzzy coalition games, a player can participate in a coalition at various levels, and the value of a coalition depends on the participation levels of its agents [31].
In a peer-to-peer network, peers may have multiple interests; as a result they can join multiple coalitions simultaneously. We therefore formulate our problem as an overlapping coalitions problem. However, the memberships in our formulation are not fuzzy, which enables us to study the stability of the resulting network overlay using the concept of the core.

Definition 1 For any NTU game $(N, V)$, an allocation $x \in \mathbb{R}^{nm}$ is a core allocation if it is not blocked by any coalition, i.e. there is no $S \subseteq N$ with an allocation $y \in V(S)$ such that $y_i > x_i\ \forall i \in S$.

In other words, the core of a coalition game is the set of payoff allocations that are guaranteed not to be blocked by any coalition of players. This implies that core allocations are stable in the sense that, once a core allocation is achieved, there is no incentive for players to change their coalitions. In the next section, we present our model of the problem as a non-transferable utility coalition game with overlapping coalitions.

4 Dynamic coalition formation

In this section, we extend the dynamic coalition formation game proposed in [3] to NTU coalition games, while also taking overlapping coalitions into account.

4.1 Dynamic coalition formation model

In each period $t$, every player independently takes a random draw from a Bernoulli trial with probability $\gamma$ of the outcome “adjust”. If this happens, the player decides whether to join any existing coalition or to form a singleton coalition, with or without leaving any of its current coalitions. In this model players (peers) are assumed to be only boundedly rational [3]. A player who is selected to move seeks to maximize its expected payoff for the next period. Each player’s strategic variables are its coalition choice ($S_i$) and its payoff ($\phi_i$). Hence the set of possible strategies for player $i$ at time $t$ is given by:

$$\varepsilon_i(Co) = \left\{ (S_i, \phi_i) \;\middle|\; S_i = S_k \cup \{i\}\ \forall S_k \in Co,\ \phi_i \in [0, \sqrt{n}] \right\}_k \qquad (1)$$

In Formula (1), $S_k$ is the $k$th coalition of the current coalition structure $Co$, i.e. $Co = \{S_k\}_k$. In the following, we develop player strategies for finding coalitions to join in two general categories: myopic strategy and farsighted strategy.

4.1.1 Myopic strategy

As a myopic optimizer, the player chooses the coalition which promises him the highest feasible payoff, without decreasing others’ payoffs in that coalition. Let $S_{i,j}(t)$ denote the $j$th coalition that player $i$ belongs to at time $t$. For brevity, we write $S_{i,j}$ for $S_{i,j}(t)$. The chosen coalition $S_{i,j}(t+1)$ is given by rule $\mathcal{F}$:

$$\mathcal{F}: \quad S_{i,j}(t+1) = \arg\max_{S = S_k \cup \{i\}\ \forall k,\ S \succ_i S_{i,j}} \phi_{i,j}(S) \qquad (2)$$
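The myopic rule of Formula (2) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function name `myopic_best_reply`, the `payoff(i, S)` callback and the toy payoff based on interest tags are assumptions made only for this example, and the admissibility test (strict improvement for the mover, no loss for existing members) is one reading of the rule as described in the text.

```python
import math
from typing import Callable, FrozenSet, List

Coalition = FrozenSet[int]
Payoff = Callable[[int, Coalition], float]  # hypothetical signature: payoff(i, S)


def myopic_best_reply(i: int, current: Coalition,
                      structure: List[Coalition], payoff: Payoff) -> Coalition:
    """Pick the coalition S_k | {i} with the highest payoff for peer i.

    A candidate is admissible only if it strictly improves on peer i's payoff
    in `current` and does not lower the payoff of any existing member.
    If no candidate is admissible, the peer keeps `current`.
    """
    best, best_value = current, payoff(i, current)
    for s_k in structure:
        if i in s_k:
            continue  # peer i is already a member of this coalition
        candidate = s_k | {i}
        harms_member = any(payoff(j, candidate) < payoff(j, s_k) for j in s_k)
        value = payoff(i, candidate)
        if not harms_member and value > best_value:
            best, best_value = candidate, value
    return best


# Toy payoff (an assumption for illustration, not the paper's Formula 4):
# sqrt(|S|) times the fraction of other members sharing peer i's interest tag.
INTEREST = {1: "music", 2: "music", 3: "video", 4: "music"}


def toy_payoff(i: int, s: Coalition) -> float:
    others = [j for j in s if j != i]
    if not others:
        return 0.0
    same = sum(INTEREST[j] == INTEREST[i] for j in others)
    return math.sqrt(len(s)) * same / len(others)


structure = [frozenset({1, 2}), frozenset({3}), frozenset({4})]
choice = myopic_best_reply(4, frozenset({4}), structure, toy_payoff)
print(sorted(choice))  # peer 4 joins the two peers that share its interest
```

In this toy setting peer 3 stays alone, because joining peers 1 and 2 would lower their payoffs and joining peer 4 brings no gain.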

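If all peers in the network follow the rule $\mathcal{F}$, we have $C(t) \xrightarrow{\mathcal{F}} C(t+1)$. Rule $\mathcal{F}$ defines a finite Markov chain with state space $\Omega = \{\omega = (Co, \phi(Co)) \mid Co \in C,\ \phi \in [0, \sqrt{n}]^{nm}\}$, which is called the best-reply process. If $N_{\omega\omega'}$ denotes the set of peers which get the chance to change their coalitions, the transition probability from state $\omega$ to $\omega'$ is then:

$$p_{\omega\omega'} = (1 - \gamma)^{\,n - |N_{\omega\omega'}|} \prod_{i \in N_{\omega\omega'}} \gamma\, p_i(\omega' \mid \omega) \qquad (3)$$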
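As a quick numerical illustration of Formula (3), the short Python sketch below multiplies the probability that the non-moving peers stay put by the per-mover terms. The function name and the argument `mover_probs` (the individual best-reply probabilities of the peers that move) are hypothetical names chosen for this example.

```python
from math import prod
from typing import Sequence


def transition_probability(gamma: float, n: int,
                           mover_probs: Sequence[float]) -> float:
    """Evaluate (1 - gamma)^(n - k) * prod(gamma * p_i) for k moving peers.

    gamma       -- probability that a peer gets the chance to adjust
    n           -- total number of peers
    mover_probs -- best-reply probability p_i of each peer that moves
    """
    movers = len(mover_probs)
    if not 0.0 <= gamma <= 1.0:
        raise ValueError("gamma must lie in [0, 1]")
    if movers > n:
        raise ValueError("more movers than peers")
    stay = (1.0 - gamma) ** (n - movers)  # peers that do not adjust
    return stay * prod(gamma * p for p in mover_probs)


# Three peers, gamma = 0.5, exactly one peer moves with a deterministic best reply.
print(transition_probability(0.5, 3, [1.0]))
```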

The function $p_i$ is defined by player $i$’s best-reply rule: $\gamma\, p_i(\omega' \mid \omega)$ is the probability of player $i$ switching from $(S_{i,j}, \phi_i(S_{i,j}))$ associated with state $\omega$ to $(S'_{i,j}, \phi_i(S'_{i,j}))$ associated with state $\omega'$, where $S'_{i,j}$ is the maximizer of (2) and $\phi'_{i,j}$ is the payoff for joining coalition $S'_{i,j}$.

4.1.2 Farsighted strategy

Following the best-reply process, players switch between coalitions over time until a stable coalition structure forms in which no player has an incentive to move. It can be shown that the best-reply process has at least one absorbing state; we, however, are interested in deriving conditions under which the process converges towards states involving core allocations. For example, consider a game with three players. Assume $v(\{i\}) = 2\ \forall i \in N$, $v(S) = 5$ for $|S| = 2$ and $v(S) = 8$ for $|S| = 3$. The state $\omega = (\{N\}, (4, 2, 2))$ is absorbing, but the allocation $\phi = (4, 2, 2)$ is not in the core, since it is blocked by the coalition $\{2, 3\}$: together, players 2 and 3 can obtain 5, which exceeds the 4 they currently receive. Thus, under the best-reply process, absorbing states may include non-equilibrium states.¹ We call these states “deadlock” states. In order to break up deadlock situations, experimentation is introduced in [3]. If we assume that all players know the characteristic function, players are allowed to experiment with suboptimal strategies whenever they are a member of a potentially blocking coalition. Players’ payoffs may decrease in this step, but taking the farsighted strategy will lead to an increase in their payoffs in the future. Note that experimenting differs from the concept of “trembles” or “mutations” in evolutionary games: the players experiment if and only if there exists an outcome that is potentially better, whereas mutations occur in any state with uniform probability.
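The three-player example can be checked mechanically. The sketch below uses the standard transferable-utility test, under which a coalition blocks an allocation when its value exceeds the sum of its members' current payoffs; the function names are chosen for this illustration only.

```python
from itertools import combinations
from typing import Callable, Dict, List, Sequence, Set


def v(coalition: Sequence[int]) -> int:
    """Characteristic function of the example: the value depends only on size."""
    return {1: 2, 2: 5, 3: 8}[len(coalition)]


def blocking_coalitions(players: Sequence[int],
                        allocation: Dict[int, float],
                        value: Callable[[Sequence[int]], float]) -> List[Set[int]]:
    """List every coalition whose value exceeds what its members receive now."""
    blocked = []
    for size in range(1, len(players) + 1):
        for s in combinations(players, size):
            if value(s) > sum(allocation[i] for i in s):
                blocked.append(set(s))
    return blocked


allocation = {1: 4, 2: 2, 3: 2}
print(blocking_coalitions([1, 2, 3], allocation, v))  # only {2, 3} blocks
```

Enumerating all subsets is feasible only for very small games, which is the reason the paper later replaces exhaustive search with a heuristic.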

¹ Readers are referred to [5] for more examples.

4.2 Characteristic function

In order to consider both interest similarity and geographical locality, a characteristic function is derived which assigns a real number to each player as its payoff for cooperation. We use the vector space model (VSM) to obtain the similarity between two peers. In VSM, a peer’s interests are represented by a vector in which each dimension corresponds to a separate interest, so the similarity between two peers is simply the cosine of the angle between their interest vectors: the higher the cosine, the more similar the two peers. As a result, each player’s similarity to the other coalition members is the sum of the pair-wise cosine values between its interest vector and theirs. To preserve physical locality, there are a great number of metrics for estimating the distance between two peers, such as round trip time, delay, autonomous system path and actual geographical distance, and quite a few methods for computing these metrics. Without loss of generality, we use the Global Network Positioning (GNP) method to predict Internet network distance. GNP is based on absolute coordinates obtained by modeling the Internet as a $d$-dimensional geometric space, so we assume each peer can be represented by a point in $d$-dimensional Cartesian space. For each peer $i$ we measure its average distance to the other coalition members. The value of a coalition is an increasing function of interest locality and a decreasing function of geographical distance. Hence, if $\phi_i(S)$ denotes the utility of player $i$ when he belongs to a coalition $S$ (at period $t$), it can be defined as follows:

$$\phi_i(S) = \begin{cases} 0 & |S| = 1 \\[4pt] \sqrt{|S|} \left( \dfrac{\alpha \cdot Sim_i(S)}{|S| - 1} + \dfrac{\beta}{1 + D_i(S)} \right) & |S| \ge 2 \end{cases} \qquad (4)$$

If $\vec{p}_i$ denotes the interest vector of player $i$, $Sim_i(S)$ is the similarity of peer $i$ to the coalition $S$ that he belongs to, and $D_i(S)$ is the average distance of peer $i$ from the other coalition members, we have:

$$Sim_i(S) = \sum_{j \ne i,\, j \in S} Sim(\vec{p}_i, \vec{p}_j) = \sum_{j \ne i,\, j \in S} \cos(\vec{p}_i, \vec{p}_j) = \sum_{j \ne i,\, j \in S} \frac{\vec{p}_i \cdot \vec{p}_j}{\|\vec{p}_i\|\, \|\vec{p}_j\|} \qquad (5)$$

$$D_i(S) = \frac{\sum_{j \ne i,\, j \in S} \sqrt{\sum_{k=1}^{d} (x_{i,k} - y_{j,k})^2}}{|S|} \qquad (6)$$

In Formula 4, $\alpha$ and $\beta$ are constants such that $\alpha + \beta = 1$ and $\alpha > \beta$. We can select them to strike a good balance between similarity and topology awareness. The term $|S| - 1$ is used for normalizing $Sim_i(S)$ to at most 1. Note that $\phi$ is a function of time: the interest vector of the peer (and hence $Sim_i(S)$) may change according to its recent downloads, and its location (and hence $D_i(S)$) may change due to movement. So the value of a coalition can adapt to changes in the network. To illustrate the behavior of the utility function more clearly, consider a scenario of $n$ peers in which peer $i$ wants to estimate its payoff from joining them. While $m$ peers in the network are quite similar to peer $i$, there is no similarity between peer $i$ and the $n - m - 1$ other peers. We have drawn the utility function behavior for different values of $\alpha$, $\beta$, $m$ and $D_i(S)$ in Fig. 1, in which the x axis stands for the coalition size and the y axis shows peer $i$’s payoff. As Fig. 1 shows, there is a maximum payoff value that divides the plot into two parts. The left part indicates the payoff received for forming coalitions including quite similar peers, while the right part shows the payoff for adding dissimilar peers to a coalition including peer $i$ and the $m$ quite similar peers. As we can see, the grand coalition is not an optimal solution for the maximization problem stated in Section 3. For $\alpha = 1$ and $\beta = 0$, the coalition including all $m$ similar peers leads to the maximum payoff for peer $i$ (dotted line in Fig. 1).

Fig. 1 Payoff function behavior

Due to the non-transferable nature of the characteristic function, we define an individual order to compare coalitions as follows:

Definition 2 For each player $i$ preference operator $\succ_i$ is defined such that

4.3 Dynamic coalition formation algorithm

Algorithm 1 shows the proposed coalition formation algorithm (CFA), which dynamically constructs the overlay network with interest locality and topology awareness properties and is executed by each peer in the network. As stated in Subsection 4.1, a player may select the coalition that brings him the strictly highest payoff according to the rule $\mathcal{F}$, or may form a singleton coalition to destabilize deadlock situations. In the coalition formation algorithm proposed in [3], players experiment on a random trigger (with probability ε), which may slow the convergence of the algorithm to the core. For example, assume the current coalition structure is {{1,2}, {3,4,5}, {6,7}} and {1,3,6,7} is a blocking coalition. If player 1 gets the chance to switch its coalition and also gets the chance to experiment, he may form a singleton coalition {1}, but it would be better to join {6,7}, hoping that player 3 joins them in the next period. The modified algorithm is then as follows. A player takes the farsighted strategy (experiment) only when there is a blocking coalition and there is no subset of that blocking coalition in the current coalition structure (step 2 in Algorithm 1). In Algorithm 1, $B_i(t)$ is the set of blocking coalitions including peer $i$ at time $t$, and $Co$ is the current coalition structure. As mentioned, each peer can join several coalitions; it is assumed that each peer can join at most $m$ coalitions. In the first step, the player that gets the chance to revise its strategy must select one of its coalitions ($S_{i,j}(t)$) and evaluate the best strategy against its current payoff in that coalition. In the last step, if the number of its current coalitions is less than $m$, it can join the new coalition without leaving any current one. Otherwise, the new coalition replaces the coalition selected in step 1. To adapt to changes, each peer runs the coalition formation algorithm periodically (every Δ period of time) and decides which other peers to connect to. It is clear from Formula 4 that the payoff of each player is a function of time. So, through the proposed algorithm, peers dynamically reconfigure their connections to the best available peers.
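The payoff that each peer re-evaluates in every period can be sketched in Python as follows. This is a minimal illustration, assuming the reading of Formula 4 in which the square-root factor scales both the similarity term and the distance term; the function names, the dictionaries `interests` and `coords`, and the default weights 0.7 and 0.3 are assumptions for this example (the weights merely satisfy α + β = 1 and α > β).

```python
import math
from typing import Dict, Iterable, Sequence

Vector = Sequence[float]


def cosine(p: Vector, q: Vector) -> float:
    """Cosine of the angle between two interest vectors (0 if either is zero)."""
    dot = sum(a * b for a, b in zip(p, q))
    norm = math.sqrt(sum(a * a for a in p)) * math.sqrt(sum(b * b for b in q))
    return dot / norm if norm else 0.0


def payoff(i: int, coalition: Iterable[int],
           interests: Dict[int, Vector], coords: Dict[int, Vector],
           alpha: float = 0.7, beta: float = 0.3) -> float:
    """Payoff of peer i in a coalition, following Formulas 4 to 6 as read above."""
    members = set(coalition)
    size = len(members)
    if size < 2:
        return 0.0  # a singleton coalition yields no payoff
    others = [j for j in members if j != i]
    # Formula 5: sum of pair-wise cosine similarities.
    sim = sum(cosine(interests[i], interests[j]) for j in others)
    # Formula 6: Euclidean distances to the other members, divided by |S|.
    avg_dist = sum(math.dist(coords[i], coords[j]) for j in others) / size
    return math.sqrt(size) * (alpha * sim / (size - 1) + beta / (1 + avg_dist))


interests = {1: [1.0, 0.0], 2: [1.0, 0.0], 3: [0.0, 1.0]}
coords = {1: [0.0, 0.0], 2: [0.0, 0.0], 3: [3.0, 4.0]}
print(payoff(1, {1, 2}, interests, coords))     # identical, co-located peers
print(payoff(1, {1, 2, 3}, interests, coords))  # adding a dissimilar, distant peer
```

With identical, co-located peers the payoff reaches the square root of the coalition size, which matches the upper bound stated in Section 3.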

The main computational cost of CFA lies in step 2, the search for blocking coalitions, which requires an exhaustive search over $m \cdot 2^{n-1}$ different coalitions including peer $i$. As the number of peers increases, this number grows exponentially. In [29] the authors proposed constructing a random sample from the set of all possible coalitions and examining whether its members are blocking coalitions. This blind search mechanism may increase the convergence time of the algorithm. Instead, we propose a genetic algorithm to search for blocking coalitions. Genetic algorithms (GAs) belong to the larger class of evolutionary algorithms inspired by natural evolution, which have been used as a search technique for many NP-hard problems. We applied a simple genetic algorithm (SGA) in which each chromosome corresponds to a coalition and the fitness function corresponds to the characteristic function. The crossover operator is defined as switching the members of the coalitions, whereas the mutation operator deletes some members of the coalition. The pseudo code for SGA is shown below.

As shown, we start by randomly generating a population of P candidate solutions (coalitions). Given such a population, SGA generates a new coalition by selecting two candidate coalitions as parents, chosen for their higher fitness value as calculated from Formula 4. Given two parents, we generate a coalition by switching members between them, which is called crossover, and then delete some members of that coalition as mutation. The coalitions with higher characteristic function values are then selected for the next iteration. The process continues until a blocking coalition is found or the maximum number of iterations is reached. The exact parameter values, such as the crossover and mutation rates, are provided in Section 5.1.
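The loop described above can be sketched in Python as follows. This is an illustrative sketch, not the paper's Algorithm 2: the function name, the callbacks `fitness` and `is_blocking`, the default parameter values, and the particular crossover (each member of either parent is inherited with probability one half) and mutation (deleting a single member) are all assumptions made for this example.

```python
import random
from typing import Callable, FrozenSet, List, Optional

Coalition = FrozenSet[int]


def find_blocking_coalition(
    peers: List[int],
    fitness: Callable[[Coalition], float],
    is_blocking: Callable[[Coalition], bool],
    pop_size: int = 20,          # assumed default; must be at least 2
    max_iter: int = 50,          # assumed default
    mutation_rate: float = 0.2,  # assumed default
    rng: Optional[random.Random] = None,
) -> Optional[Coalition]:
    """Return a blocking coalition if the search finds one, otherwise None."""
    rng = rng or random.Random()

    def random_coalition() -> Coalition:
        return frozenset(rng.sample(peers, rng.randint(1, len(peers))))

    population = [random_coalition() for _ in range(pop_size)]
    for _ in range(max_iter):
        for coalition in population:
            if is_blocking(coalition):
                return coalition
        # Selection: the fitter half of the population become parents.
        population.sort(key=fitness, reverse=True)
        parents = population[: max(2, pop_size // 2)]
        children = []
        while len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            # Crossover: mix the members of both parents.
            child = {p for p in a | b if rng.random() < 0.5}
            # Mutation: delete one member of the child.
            if len(child) > 1 and rng.random() < mutation_rate:
                child.discard(rng.choice(sorted(child)))
            children.append(frozenset(child) if child else random_coalition())
        # Survivors: keep the pop_size fittest of parents and children.
        population = sorted(parents + children, key=fitness, reverse=True)[:pop_size]
    return next((c for c in population if is_blocking(c)), None)
```

A caller would pass the characteristic function as `fitness` and a predicate that compares a candidate's payoffs with the current allocation as `is_blocking`. Because the search is randomized, a `None` result means only that no blocking coalition was found within the iteration budget.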

4.4 Network architecture

As mentioned in Subsection 4.3, the main computational cost of CFA is detecting the blocking coalitions, which requires information about all peers in the network. Moreover, after peers switch between coalitions, the state of the network, including the new payoffs and the new coalition structure, needs updating. As the number of peers in the network increases, this complexity would make the proposed algorithm inefficient. To solve this problem, a hierarchical architecture is considered in which each coalition contains a peer, named the coalition head (CH), that has all of the information about its coalition members and can communicate with other CHs to obtain information about their coalition members. Therefore, when a peer wants to investigate whether there is a blocking coalition or not, the CH can do it. With this architecture, updates are only sent to CHs. Although coalition formation imposes some kind of structure on the network, this is totally different from structured P2P systems such as Chord, where a predefined position is defined for each node in the ring and which suffer from scalability problems. In our simulation, for every coalition we select the peer with the minimum id as the CH, because we made no assumption about the heterogeneity of peers. In a heterogeneous network, peers with more computational power would be favored to assume the role of CH. If a CH fails, the network can recover quickly, since we define a backup CH, the peer with the next minimum id, to take over in the event that a CH quits or fails. It should be mentioned that in the proposed architecture each peer in a coalition knows the other coalition members. Unlike other hierarchical architectures, in which leaf peers only see their corresponding super-peers, coalition members in the proposed scheme are aware of each other. Therefore, a voting scheme can also be defined to choose the new CH.
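The minimum-id rule for choosing a coalition head and its backup is simple enough to state in a few lines of Python. The function names below are hypothetical, and the failure handling (re-applying the same rule to the surviving members) is an assumption consistent with the description above rather than a documented protocol.

```python
from typing import Iterable, Optional, Tuple


def coalition_heads(members: Iterable[int]) -> Tuple[Optional[int], Optional[int]]:
    """Return (head, backup): the smallest and second-smallest peer ids."""
    ordered = sorted(members)
    head = ordered[0] if ordered else None
    backup = ordered[1] if len(ordered) > 1 else None
    return head, backup


def heads_after_failure(members: Iterable[int],
                        failed: Iterable[int]) -> Tuple[Optional[int], Optional[int]]:
    """Re-apply the minimum-id rule to the peers that are still alive."""
    return coalition_heads(set(members) - set(failed))


print(coalition_heads({7, 3, 5}))           # (3, 5)
print(heads_after_failure({7, 3, 5}, {3}))  # (5, 7): the backup takes over
```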

4.5 Search mechanism

When a query is generated, it is first processed in the requester's local coalitions (intra-coalition routing). If no result is found, the query is routed sequentially to other coalitions in decreasing order of their similarity to the query (inter-coalition routing). For this purpose, the requesting peer sends a message to the CH to compute the similarity of the query to the coalitions' interests. The interest of each coalition is represented by a semantic vector, which is the sum of the semantic vectors of all peers in that coalition and is maintained by the CH. If $\vec{pt}_S$ denotes the semantic vector of coalition $S$, then

$$\vec{pt}_S = \sum_{i \in S} \vec{pt}_i .$$

Finally, based on the computed similarity, the query is forwarded to the appropriate coalitions. The search mechanism is stated in Algorithm 3.

4.6 Convergence of coalition formation algorithm

In this section we prove that the distributed coalition formation algorithm, with a genetic algorithm for searching blocking coalitions, guarantees the maximum similarity between peers in coalitions.

Theorem 1: Suppose the core of the game is not empty. If blocking coalitions are detected with the genetic algorithm and every peer in the network follows the overlapping coalition formation algorithm, the best-reply process with experimentation converges to the core of the game, i.e.

$$\lim_{n \to \infty} P\left(\omega_n = \omega_c \mid \omega_0\right) = 1, \quad \forall \omega_0 \in \Omega,$$

where $\omega_c = (C_c, x_c)$.

Proof: We follow the procedure used in [21]. Suppose there exist $b$ blocking coalitions in state $\omega_t$. The genetic algorithm generates a population of $p$ coalitions (assume $p \gg b$). The probability of detecting $k$ out of the $b$ blocking coalitions is

$$p_k = \frac{\binom{b}{k}\binom{2^n - b}{p - k}}{\binom{2^n}{p}},$$

so the probability of detecting all $b$ blocking coalitions is

$$p_b = \frac{\binom{2^n - b}{p - b}}{\binom{2^n}{p}}.$$

As a result, the probability of detecting blocking coalitions while checking only $p$ sets from the power set of size $2^n$ is positive. Detecting at least one blocking coalition suffices to destabilize a non-equilibrium absorbing state. According to Theorem 2 of [3], the allocation vector associated with an absorbing state of the best-reply process with experimentation coincides with a core allocation of the game. So it is enough to show that the proposed algorithm converges to an absorbing state with probability one as time tends to infinity. This is proved by showing that the process will not get stuck in ergodic sets other than the absorbing states.

We proceed by contradiction. Assume there exists an ergodic set $\Psi \subset \Omega$ such that $|\Psi| \geq 2$. According to Theorem 2 of [3], none of the states in $\Psi$ involves the core, because absorbing states are singleton ergodic sets. Since the core of the game is not empty, for every $\omega \in \Psi$ there exists an $S' \in C$ such that $\forall i \in S',\; S' \succ_i S(i)$. So some peers have an incentive to experiment, and there is a positive probability that one of the peers in that blocking coalition experiments and forms a singleton coalition (say $S_1$), which leads to $\omega_{t+1}$. Since $\omega_{t+1}$ can be reached from $\omega_t$ with positive probability, $\omega_{t+1} \in \Psi$. Now, if the other peers in the blocking coalition $S'$ join $S_1$, an absorbing state $\omega_c = (C_c, x_c)$ can be reached in one step. Therefore, starting from $\omega_t$, there is a positive probability of reaching an absorbing state in two steps. This contradicts the assumption that $\omega_{t+1}$ is an element of an ergodic set, which completes the proof. □

So, by applying the coalition formation algorithm (Algorithm 1) and using the simple genetic algorithm to search for blocking coalitions (Algorithm 2), we obtain a Markov chain $\{\omega_0, \omega_1, \omega_2, \ldots\}$ which converges to $\omega_c$ and remains in that state if no changes happen in the network.
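To make the detection probability in the proof concrete, the hypergeometric expression can be evaluated exactly for a small case. The sketch below assumes, as the proof does, that the population is a uniform sample of $p$ distinct sets out of the $2^n$ subsets; the function names are illustrative.

```python
from fractions import Fraction
from math import comb


def detect_probability(n_peers: int, b: int, p: int, k: int) -> Fraction:
    """Probability that a population of p sets contains exactly k of the b blocking coalitions."""
    total = 2 ** n_peers
    if not (0 <= k <= min(b, p)) or p > total or b > total:
        raise ValueError("parameters out of range")
    return Fraction(comb(b, k) * comb(total - b, p - k), comb(total, p))


def detect_at_least_one(n_peers: int, b: int, p: int) -> Fraction:
    """Probability that the population contains at least one blocking coalition."""
    return 1 - detect_probability(n_peers, b, p, 0)


# Small worked case: n = 4 peers (16 subsets), b = 3 blocking coalitions, population p = 5.
p_all = detect_probability(4, 3, 5, 3)   # all three detected
p_any = detect_at_least_one(4, 3, 5)     # at least one detected
```

For this toy case the probability of detecting all three blocking coalitions is 1/56, while the probability of detecting at least one is 3081/4368 (about 0.71), which illustrates why a strictly positive detection probability is all the proof needs.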

Table 1 Network parameters

| Parameter | Value |
|---|---|
| General parameters | |
| Number of peers | Variable |
| Content distribution | $x \sim \mathrm{Zipf}(s = 0.1, N = 100)$, with $P_X(x; s, N) = \dfrac{1/x^s}{\sum_{n=1}^{N} 1/n^s}$ |
| Query distribution | $x \sim \log\mathcal{N}(\mu = 5, \sigma^2 = 6.25)$, with $f_X(x; \mu, \sigma^2) = \dfrac{1}{x\sqrt{2\pi\sigma^2}}\, e^{-\frac{(\ln x - \mu)^2}{2\sigma^2}}$ |
| Max number of neighbors in Gnutella | 10 |
| TTL | 6 |
| Game's parameters | |
| α, β | α = 0.7, β = 0.3 |
| γ | 0.5 |
| Number of peers with dynamic interests | n/3 |
| Number of peers with dynamic location | n/3 |
| Max number of coalitions of each peer (m) | 3 |
| Genetic algorithm's parameters | |
| Population size (p) | Variable |
| Crossover rate | $\lceil p/4 \rceil$ |
| Mutation rate | $\lceil p/10 \rceil$ |
| Max number of iteration of genetic algorithm | 100 |
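The two workload distributions in Table 1 can be written out as a short sketch. How the simulator maps the sampled values to per-peer document counts and per-session query counts is not specified here, so the two `sample_*` helpers (including the rounding and the lower bound of 1) are assumptions made for illustration.

```python
import math
import random


def zipf_pmf(x: int, s: float = 0.1, n: int = 100) -> float:
    """P_X(x; s, N) = (1 / x^s) / sum_{k=1..N} 1 / k^s, for x in 1..N."""
    norm = sum(1.0 / (k ** s) for k in range(1, n + 1))
    return (1.0 / (x ** s)) / norm


def lognormal_pdf(x: float, mu: float = 5.0, sigma2: float = 6.25) -> float:
    """f_X(x; mu, sigma^2) = exp(-(ln x - mu)^2 / (2 sigma^2)) / (x sqrt(2 pi sigma^2))."""
    return math.exp(-((math.log(x) - mu) ** 2) / (2 * sigma2)) / (x * math.sqrt(2 * math.pi * sigma2))


def sample_content_count(rng: random.Random, s: float = 0.1, n: int = 100) -> int:
    """Draw a value in 1..N with Zipf weights (assumed mapping to a content count)."""
    ranks = range(1, n + 1)
    weights = [zipf_pmf(x, s, n) for x in ranks]
    return rng.choices(ranks, weights=weights, k=1)[0]


def sample_query_count(rng: random.Random, mu: float = 5.0, sigma2: float = 6.25) -> int:
    """Draw a lognormal value and round it to a positive integer (assumed mapping)."""
    # Table 1 gives the variance, so the standard deviation is sqrt(6.25) = 2.5.
    return max(1, round(rng.lognormvariate(mu, math.sqrt(sigma2))))
```

Seeding `random.Random` explicitly keeps the synthetic workload reproducible, which is the stated reason for preferring synthetic data over real traces.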

5 Numerical evaluation

5.1 Simulation environment

To simulate our algorithm in unstructured peer-to-peer networks, we extended the PlanetSim simulator. PlanetSim is a discrete-event simulation framework in Java for both structured and unstructured overlays [32]. Since it implements only two structured overlays, namely Chord and Symphony, we extended it for unstructured overlays. More precisely, we added three extensions to the overlay layer of PlanetSim: Gnutella v0.4, static SON [2], and the proposed overlay. We also extended the network layer to report delay and gather statistics.

An appropriate workload is also required for evaluating our search algorithm. Although real traces would make the system workload more realistic, we opted for a synthetic workload to make the results reproducible and hence easy to compare in future work. Besides, real traces did not seem necessary for comparing the overall network utility of the proposed scheme with that of previous schemes. Nonetheless, to model a realistic situation, we set the environment parameters as shown in Table 1. To model both peer interests and queries in the network, we use the ontology-based ACM classification, which includes 11 categories (A-K) [33]. For simulation, we distribute documents according to the leaf categories of the ACM classification. For the query distribution, a set of 110 queries (10 queries for each of the 11 categories) is generated. According to [6], the number of content items in each peer follows a Zipf distribution, since in a peer-to-peer network a few peers hold a large number of documents whereas the majority of peers share few files. It has also been shown that the number of queries per connected session in a peer-to-peer network follows a lognormal distribution [6]. Hence we use a lognormal distribution for the number of queries.

We compare our algorithm with two overlays: Gnutella 0.4 and SON [2]. We chose Gnutella 0.4 rather than 0.6 because Gnutella 0.6 is a two-tier overlay network consisting of two types of nodes: ultra-peers and leaf-peers. There is no direct connection between any two leaf-peers in the overlay network, and a leaf-peer does not forward a query received from an ultra-peer. Instead, ultra-peers perform query searching on behalf of their leaf-peers. Gnutella 0.6 incorporates dynamic querying as a limited flooding search technique. In dynamic querying, an ultra-peer forwards a query incrementally in three steps: TTL(1), TTL(2), TTL(3). Consequently, dynamic querying uses TTL(3) only for rare searches. Although Gnutella 0.6 employs a hierarchical architecture, it has no formal strategy to guide the system to its "core solution", which is the thrust of this paper. Ultra-peer selection in Gnutella 0.6 is based on system and network-layer information such as available bandwidth and up-time. This is quite different from our approach, which is based on semantic information. In terms of performance comparison, in addition to Gnutella 0.4, we compared our scheme to SON, which is similar to our approach in terms of semantic clustering and is a far superior approach compared to Gnutella 0.6. Therefore, our performance evaluation considers both non-hierarchical (Gnutella 0.4) and semantically hierarchical (SON) alternatives. We expect the performance of Gnutella 0.6 to lie somewhere between Gnutella 0.4 and SON; it was not of particular interest in the context of this paper, which is concerned with dynamic clustering.

Fig. 2 Average search time in static network (search time versus network size; series: Gnutella, SON, CFA)

Fig. 3 Average search time in dynamic network (search time versus network size; series: Gnutella, SON, CFA)

Fig. 4 The hit ratio with TTL = 6 in static scenario (hit ratio (%) versus network size; series: Gnutella, SON, CFA)

Fig. 5 The hit ratio with TTL = 6 in dynamic scenario (hit ratio (%) versus network size; series: Gnutella, SON, CFA)

Fig. 6 Average distance between each peer and its neighbors

Fig. 7 Histogram of average distance between each peer and its neighbors for network size = 500 (number of peers versus average distance; series: Gnutella, Coalitions)

For simulation studies, we design three scenarios as follows:

Scenario one: We consider a static scenario in which peers have static interests, so the established connections do not change once the overlay has been constructed by the coalition formation algorithm. This shows how effective our algorithm is when the overlay does not change.

Scenario two: All nodes are free to leave and join the network, but the network is stable after a query is issued until the result is successfully sent. To verify the impact of dynamics in the network, we induce changes in the environment, including changes in the peers' interests and their locations. For this purpose we allow n/3 of the peers to change their location and n/3 of the peers to change their interests randomly.

Scenario three: To examine the convergence of the proposed game to the core, we construct small networks and record the number of iterations sufficient to reach stability.
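The perturbation used in scenario two can be sketched as below. The paper does not say whether the two groups of n/3 peers overlap, how locations are represented, or how a new interest is drawn, so the independent sampling, the 2-D coordinates in a square of side `side`, and the uniform choice among the other ACM categories are all assumptions of this example, as are the names `Peer` and `perturb`.

```python
import random
from dataclasses import dataclass
from typing import Dict, List, Tuple

CATEGORIES = list("ABCDEFGHIJK")  # the 11 ACM categories A-K


@dataclass
class Peer:
    interest: str
    location: Tuple[float, float]


def perturb(peers: Dict[int, Peer], rng: random.Random, side: float = 100.0) -> Tuple[List[int], List[int]]:
    """Move n/3 randomly chosen peers and change the interest of n/3 randomly chosen peers."""
    ids = sorted(peers)
    k = len(ids) // 3
    movers = rng.sample(ids, k)      # peers that change location
    switchers = rng.sample(ids, k)   # peers that change interest (drawn independently)
    for pid in movers:
        peers[pid].location = (rng.uniform(0, side), rng.uniform(0, side))
    for pid in switchers:
        others = [c for c in CATEGORIES if c != peers[pid].interest]
        peers[pid].interest = rng.choice(others)
    return movers, switchers
```

After each call, the coalition formation algorithm would be run again on the modified peer set so that the overlay can adapt to the new interests and locations.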

For performance evaluation, we measure the average query search time and the query hit ratio, i.e., the percentage of queries that are answered successfully. We are also interested in measuring the convergence time of the proposed coalition formation algorithm.

5.2 Simulation results

In this subsection we describe and analyze the results of the experiments conducted.

(Fig. 6 plots average distance against network size for Gnutella and for coalitions with m = 3, m = 5, and m = 7.)

Table 2 Convergence time of algorithm

| Network size | Iteration number | Population size for genetic algo. | Convergence time |
|---|---|---|---|
| 50 | 45 | 25 | 847 |
| 100 | 75 | 55 | 2531 |
| 150 | 120 | 60 | 7512 |
| 200 | 180 | 70 | 40000 |
| 500 | 200 | 75 | 57603 |
| 1000 | 320 | 75 | 80023 |
| 1500 | 400 | 80 | 85014 |

Figures 2 and 3 show the average search time for the 110 generated queries. The results in Fig. 2 are for a static peer-to-peer network; as we can see, the coalition formation algorithm and SON perform close to each other. More precisely, since we weight topology awareness only partly (β = 0.3), the improvement in search time over static SON is small: CFA groups peers with small network distances, but in doing so it sacrifices some similarity of semantic interests within coalitions. The results in Fig. 3 show that, as expected, the proposed algorithm greatly reduces search time in comparison with Gnutella and static SON in a dynamic peer-to-peer network.

Figs. 4 and 5 compare the query hit ratio with Gnutella and static SON. The plots show that our method achieves a higher query hit ratio than the two other overlays, in particular Gnutella. This is because in our overlay and in static SON queries are routed in an intelligent way instead of being flooded blindly.

Since one of our main contributions is preserving geographical locality, Fig. 6 compares the average distance between each peer and its neighbors in the Gnutella overlay and in the overlay constructed by the coalition formation algorithm, for different network sizes and values of m. For this purpose we set α = 0 and β = 1 in Formula 4. The plot shows that peers in the same coalition are considerably closer to each other than neighbors in the Gnutella overlay. As a result, we expect the average search time to be significantly reduced when closer peers connect to each other in the overlay network. The histogram of average distance for a network size of 500 is shown in Fig. 7. In coalitions the average distance from a peer to its neighbors is 25, whereas in Gnutella most peers have an average distance of 50 from their neighbors.

Finally, we investigate the convergence of the dynamic coalition formation algorithm through simulation. For this purpose, we run the algorithm until it converges to the core, where no peer can find a better coalition to join. Table 2 provides the convergence times for constructing stable overlays. As the simulation results show, our algorithm converges to the core, as stated in Theorem 1, within a reasonable time. Note that all times presented in this section are simulation time, not system time. For example, in Table 2 the simulation time required for convergence in a network with 200 nodes equals 40000, whereas on our system with a 2.53 GHz dual-core processor and 4 GB RAM it takes only 453.2 s.

We claim that applying the genetic algorithm for detecting blocking coalitions reduces the convergence time in comparison with heuristic algorithms that rely on "random experiments" to approach the core solution. To support this claim, we replaced the GA function with the random experiment algorithm proposed in [29] for searching blocking coalitions, and recorded the number of iterations required to reach the core solution. The results are shown in Fig. 8. As expected, applying the GA reduced the time needed to converge to the core solution (by roughly 19 %), since it searches the space in a more intelligent way.

Fig. 8 Convergence time comparison (convergence time versus network size; series: CFA with random experiment, CFA with GA)

As the network size grows, the quality of overlay formation plays a more critical role in network performance. This is visible in the results presented in this paper: the performance gap between CFA and the other schemes widens as the network size grows. We therefore expect CFA to perform considerably better in networks with more than 10000 nodes. However, as the network size grows, the convergence time to arrive at the core solution also grows. The complexity mainly lies in the coalition set calculation. The proposed GA-based coalition set calculation aims to limit protocol complexity by avoiding an exhaustive search. In short, the scalability of the algorithm rests on the scalability of the GA.

Fig. 9 Coalition-based overlay formation process in two sample scenarios, (a) and (b), shown at iterations 0, 10, 20, and 30. Red circles show the coalition heads

For a better perception, Fig. 9 shows the process of overlay construction through the proposed algorithm in a network with 30 peers. The graphs are drawn with the yEd graph editor for different numbers of iterations of the algorithm. In Fig. 9(a) the network is defined such that only two interest spaces exist: nodes with odd IDs have interest 1 and nodes with even IDs have interest 2. In Fig. 9(b) five interests are defined in the network such that the first, seventh, thirteenth, nineteenth, and twenty-fifth peers each share an interest with the five consecutive peers that follow them. One can see that both scenarios have reached their core solutions.
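The two interest assignments used for Fig. 9 are easy to write down explicitly. The sketch below assumes peer IDs run from 1 to 30 and labels the five interests of scenario (b) as 1 to 5; the function names are illustrative.

```python
from collections import defaultdict
from typing import Dict, Set


def interests_two_groups(n: int = 30) -> Dict[int, int]:
    """Scenario (a): odd IDs have interest 1, even IDs have interest 2."""
    return {pid: 1 if pid % 2 == 1 else 2 for pid in range(1, n + 1)}


def interests_consecutive_blocks(n: int = 30, groups: int = 5) -> Dict[int, int]:
    """Scenario (b): peers 1, 7, 13, 19, 25 each share an interest with the five peers after them."""
    size = n // groups
    return {pid: (pid - 1) // size + 1 for pid in range(1, n + 1)}


def group_by_interest(interests: Dict[int, int]) -> Dict[int, Set[int]]:
    """Collect the peers that hold each interest."""
    groups: Dict[int, Set[int]] = defaultdict(set)
    for pid, interest in interests.items():
        groups[interest].add(pid)
    return dict(groups)
```

Grouping the peers by interest gives the partition one would expect to see emerge when similarity of interests dominates the payoff; with a nonzero weight on proximity the coalitions actually formed may differ from this partition.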

6 Conclusion and future work

This research applied coalition formation game theory to improve the performance of search in peer-to-peer networks through overlay construction. For this purpose, we developed a dynamic coalition formation algorithm in which peers with similar interests form coalitions while taking topology awareness into account. The proposed algorithm assumes two strategies for players: a myopic strategy, where the player tries to maximize its payoff for the next period, and a farsighted strategy, where the player chooses a coalition that brings it more payoff in the future, but not necessarily in the next period. We also proved the convergence of the proposed algorithm to the core of the game, which corresponds to an equilibrium overlay. We evaluated the proposed distributed algorithm in both static and dynamic environments. Simulation results show a significant performance benefit in dynamic peer-to-peer environments, in the form of a reduced average search time.

In this work we focused only on interest locality and geographical proximity, whereas considering other criteria would also be advantageous. For instance, the problem of malicious peers or free-riders in peer-to-peer networks could be studied by accounting for them in the characteristic function. For example, [30] studies selfish behavior in peer-to-peer networks, in particular the price of anarchy of peer-to-peer overlay construction, and shows that under some conditions the resulting topology may never stabilize. Furthermore, an interesting extension of our work would be to model the problem as a coalition game in partition form. In these games, unlike in the characteristic form, the value of a coalition S depends strongly on how the players in N\S are structured [2]. Modeling the problem as a coalition game in partition form would likely allow a more precise study of peers' behavior in the network.

References

1. Zaharia MA, Chandel A, Saroiu S, Keshav S (2007) Finding content in file-sharing networks when you can’t even spell. In IPTPS’07: International Workshop on Peer-to-Peer Systems
2. Crespo A, Garcia-Molina H (2002) Semantic overlay networks for P2P systems. Computer Science Department, Stanford University, Technical report
3. Arnold T, Schwalbe U (2002) Dynamic coalition formation and the core. J Econ Behav Organ 49(3):363–380
4. Branze R, Tijs S, Dimitrov D (2005) Models in cooperative game theory: crisp, fuzzy, and multi-choice games. Springer, Berlin Heidelberg
5. Saad W, Han Z, Debbah M, Hjørungnes A, Basa T (2009) Coalitional game theory for communication networks: a tutorial. IEEE Signal Proc Mag Spec Issue Game Theory 26(5):77–97
6. Androutsellisand S, Spinellis D (2004) Survey of peer-to-peer content distribution technologies. ACM Comput Surv 36(4):335–371
7. Stoica I, Morris R et al (2003) Chord: a scalable peer-to-peer lookup protocol for internet applications. IEEE/ACM Trans Networking 11(1):17–32
8. Lv Q, Cao P, Cohen E, Shenker S (2002) Search and replication in unstructured peer-to-peer networks. In Proc. of 16th International Conference on Supercomputing, pp 84–95
9. Yang B, Garcia-Molina H (2002) Improving search in peer-to-peer networks. In Proc. of the 22nd IEEE International Conference on Distributed Computing (IEEE ICDCS’02), pp 5–14
10. Tsoumakos D, Roussopoulos N (2003) Adaptive probabilistic search for peer-to-peer networks. In Proc. of the 3rd IEEE International Conference on P2P Computing, Sweden
11. Condie T, Kamvar SD, Garcia-Molina H (2004) Adaptive p2p topology. In International Conference on Peer-to-Peer Computing, pp 53–62
12. Hsu C, Chou C, Hsu C, Chen S (2008) On improving message passing in unstructured peer-to-peer overlay networks. In Proc. of International Conference on Grid and Pervasive Computing Workshop, pp 358–363
13. Li M, Lee WC, Sivasubramaniam A (2004) Small world: an overlay network for peer-to-peer search. In Proc. of International Conference on Network Protocol (ICNP 2004), pp 228–238
14. Kobayashi H, Takizawa H, Inaba T (2005) A self-organizing overlay network to exploit the locality of interests for effective resource discovery in P2P systems. In Proc. of IEEE Symposium on Applications and the Internet, pp 246–255
15. Kobayashi Y, Watanabe T, Kanzaki A, Yoshihisa T, Hara T, Nishio S (2009) A dynamic cluster construction method based on query characteristics in Peer-to-Peer networks. In AP2PS ’09 Proc. of the 2009 First International Conference on Advances in P2P Systems
16. Ratnasamy S, Handley M, Karp R, Shenker S (2002) Topologically-aware overlay construction and server selection. In Proc. of INFOCOM
17. Rostamiand H, Habibi J (2005) A mathematical foundation for topology awareness of P2P overlay networks. In Proc. of 4th International Conference On Grid and Cooperative Computing (GCC2005), pp 906–918
18. Rasanjalee Himali DM, Prasad SK (2011) SPUN: a P2P probabilistic search algorithm based on successful paths in unstructured networks. In Parallel and Distributed Processing Workshops and Phd Forum (IPDPSW), pp 1610–1617
19. Lei Y, Hao Y, Can W, ZhiGuang Q (2010) An adaptive search method based on the interest in Gnutella-like network. In Computer Application and System Modeling (ICCASM), Oct 2010, pp V11-302–V11-306
20. Garbacki P, Epema DHJ, van Steen M (2010) The design and evaluation of a self-organizing superpeer network. IEEE Trans Comput 59(3):317–331
21. Datta S, Giannella CR, Kargupta H (2009) Approximate distributed K-means clustering over a peer-to-peer network. IEEE Trans Knowl Data Eng 21(10):1372–1388
22. Armetta F, Haddad M, Hassas S, Kheddouci H (2010) Self-organized routing for unstructured peer-to-peer networks. In 4th IEEE International Conference on Self-Adaptive and Self-Organizing Systems (SASO), pp 273–274
23. Xu M, Liu G (2011) Building self-adaptive peer-to-peer overlay networks with dynamic cluster structure. In 13th International Conference on Communication Technology (ICCT), pp 520–525
24. Ghit B, Pop F, Cristea V (2011) Using bio-inspired models to design peer-to-peer overlays. In P2P, Parallel, Grid, Cloud and Internet Computing (3PGCIC), pp 248–252
25. Saad W, Han Z, Basar T, Debbah M, Hjorungnes A (2010) A coalition formation game in partition form for peer-to-peer file sharing network. In Proc. for IEEE Globe Communication Conference, Miami
26. Belmonte MV, Conejo R, Díaz M, Pérez-de-la-Cruz JL (2006) Coalition formation in P2P file sharing system. In Lecture Notes in Artificial Intelligence, CAEPIA’05, 4177, pp 153–162
27. Soltani Panah A, Khorsandi S (2011) Overlay construction based on dynamic coalition formation game in P2P networks. 5th International Conference on Signal Processing and Communication Systems (ICSPCS), Hawaii
28. Shehory O, Kraus S (1996) Formation of overlapping coalitions for precedence-ordered task-execution among autonomous agents. ICMAS-96, pp 330–337, December
29. Namvarand O, Krishnamurth V (2010) Coalition formation for bearings-only localization in sensor networks: a cooperation game approach. IEEE Trans Signal Process 58(8):432–4338
30. Moscibroda T, Schmid S, Wattenhofer R (2006) On the topologies formed by selfish peers. In Proc. of the twenty-fifth annual ACM symposium on Principles of distributed computing, pp 133–142
31. Chalkiadakis G, Elkind E, Markakis E, Polukarov M, Jennings N (2009) Stability of overlapping coalitions. ACM SIGecom Exchanges 8(1)
32. http://planet.urv.es/planetsim
33. http://www.acm.org/about/class/1998

Arezou Soltani Panah received her BSc and MSc degrees, both in Computer Engineering and Information Technology, from Amirkabir University of Technology, Tehran, Iran, in 2008 and 2011 respectively. During her tenure as a graduate student at Amirkabir University, she worked on peer-to-peer systems, especially in the area of self-organization. In this period she built an emulation-based test-bed to evaluate distributed algorithms in a multi-agent environment in the High-speed Networking Laboratory. She graduated with honours in Jan. 2011.

Siavash Khorsandi received his PhD degree in Electrical and Computer Engineering from the University of Toronto in 1996. From 1996 to 2000 he worked for Nortel Networks as a senior engineer on advanced network architectures, and since then he has been a faculty member of the Department of Computer Engineering at Amirkabir University of Technology, Tehran, Iran. He served as a member of the board of directors of the Computer Society of Iran from 1999 to 2003, and since 2007 he has been the head of the Amirkabir ITC research center. Dr. Khorsandi has published more than 80 articles in journals and international conferences and has directed several national projects on next generation networking technologies. His research interests are peer-to-peer network algorithms, computer network security, and modeling and evaluation of computer networks.
