Network overlays for efficient control of large scale dynamic groups
George V. Popescu, Zhen Liu
[email protected], zhenl@us.ibm.com

Abstract
Scalable data distribution in large-scale dynamic collaborative systems requires efficient, low-overhead communication control. We propose here efficient algorithms for clustering network nodes dynamically based on their communication interest. Several group communication architectures have been proposed to date without considering the constraints imposed by the communication infrastructure. Among these, distributed hash tables (DHTs) are scalable and resilient data structures used for data dissemination control. However, DHTs are not optimized for highly dynamic network node interest or for real-time end-to-end performance requirements. This paper proposes efficient control algorithms for large-scale collaborative systems optimized for scalability as well as end-to-end data dissemination. Network node communication interest is modeled as a multi-dimensional attribute space partitioned into interest cells mapped to multicast communication groups. The proposed control algorithms use proximity-based clustering of network nodes and hierarchical communication interest aggregation. We show that network overlay control algorithms achieve scalability and low overhead with a controlled degradation of end-to-end data path performance.
1. Introduction
Scalability is an important design consideration in large-scale group communication. Scalable data distribution in large-scale distributed applications, such as distributed simulations and interactive networked applications, requires efficient data path control algorithms. Two models of group management are prevalent: 1) clients are grouped according to their communication interest propagated in the control hierarchy; 2) clients are grouped by matching their interest to a fixed partitioning of the communication semantic space. Communication interest is modeled as a multi-dimensional space partitioned into cells mapped to communication groups. Advanced features such as dynamic partitioning of the communication semantic space into variable size cells, transparent migration of users between cells and dynamic repartitioning of the application space can be supported by the control architecture given an adequate communication semantics abstraction. The design of scalable data distribution networks for large-scale interactive group communication therefore has multiple objectives: 1) grouping participants according to their communication interest, 2) organizing the data path to guarantee end-to-end network latency and 3) reducing the signaling overhead generated by frequent changes in client multicast group membership. We define here the “communication interest space” as the union of interests of clients participating in a group communication application. Efficient group communication requires minimizing the wasted communication capacity when multicasting messages to groups of clients with similar interest. Various methods for efficient interest-based communication have been proposed [2, 3] without considering the constraints imposed by the communication infrastructure. Recent work [3] modeled the communication interest as a topic-based publish/subscribe relation where the receivers specify their interest in sub-domains (cells) of the communication semantic space. The solution proposed in [4], however, requires that the communication space information be distributed to all receivers, which incurs a large overhead when the network partitioning is dynamic. Others [5] propose a decomposition of the attribute space such that each attribute is filtered independently; this introduces an overhead that scales linearly with the number of attributes. The interactive group communication model proposed here represents the group participants as static as well as dynamic objects. State updates generated by each modification of attributes of dynamic objects are disseminated to receivers whose interest matches that of the cell containing the updated object. 
Attribute changes are recorded in a global application state, which is managed independently of the communication interest semantic space. In order to maintain a consistent state of the application, state updates (changes of object attributes) are distributed to all replicas controlled by the clients. Multiple objects may change state simultaneously, requiring distribution of the state updates to non-overlapping groups of receivers. Several overlay-based communication architectures for large-scale group communication have been proposed to date [5, 6]. Scalability of large-scale group communication applications requires clustering of participants according to their communication interest. In addition, real-time collaborative applications have quality of service constraints (end-to-end delay, frequency of state updates), which in turn require an efficient design of the supporting communication infrastructure. These requirements translate into optimal clustering of clients, to reduce the amount of wasted bandwidth for group communication, and into the construction of optimized distribution trees that satisfy real-time (end-to-end delay) and node forwarding capacity constraints. While application and network constraints can be addressed independently, more efficient communication networks can be built using combined network and application level (communication interest) information. The group communication control architecture considered here uses an abstraction of communication services through hierarchical mapping of participating nodes in a multidimensional communication semantic space and algorithms for efficient control of network communication interest. The communication interest information is organized in a hierarchical structure; receivers are grouped dynamically according to their communication interest. Multiple large-scale interactive sessions can be supported by the same communication infrastructure through hierarchical indexing of communication interest. The session indexing structure is replicated at all forwarding nodes and aggregated at higher levels in the hierarchy; dynamic changes in membership trigger modifications at the forwarding nodes participating in the modified session. Control of communication interest is symmetrical with respect to receive/send operations. 
The send and receive primitives are decoupled by the communication space indexing structure: senders and receivers need only register their group communication interest with a parent node in the hierarchy; the session control structure aggregates receiver interest and dynamically updates the replicas at control nodes according to changes in registered receiver interest. Our network model assumes that round-trip time (RTT) measurements between pairs of network nodes, node forwarding capacity and packet loss ratio information are available to group communication controllers. Each control node indexes a subset of multicast groups and controls only the network nodes in its proximity. The paper is organized as follows: the next section presents the communication interest and network measurement model. Section three presents the algorithms for communication interest clustering. Section four describes the distributed group membership control architecture. Section five evaluates the performance of the proposed control algorithms. Section six discusses related work and section seven concludes the paper.
2. Modeling communication interest and network overlay
The communication semantic space is constructed and partitioned at a root control node and subsequently distributed to all cluster leader nodes participating in the control of data distribution. The dynamic control of session information is performed by propagating messages in the control structure for each change of session information recorded at cluster leader/control nodes. The dynamics of receivers’ communication interest, as well as the dynamics of session information, require frequent distribution of control information to dynamic sets of overlay nodes. Changes in user communication interest trigger propagation of join/leave session membership control messages. Since communication interest information is aggregated at control nodes, the control messages are propagated only in the control hierarchy (they do not reach other clients in the session). Furthermore, the distribution of control messages is restricted to the control nodes involved in the same group and to higher levels of the control hierarchy. Each cell in the communication semantic interest space maps to a dynamic group communication session. A session groups all receiver and sender nodes whose communication interest overlaps with the multidimensional cell. Dynamic changes in node interest result in changes in the session membership, propagated on the control path of the data dissemination architecture.

2.1 Network overlay topology
The network overlay topology is organized as a two-level hierarchy, with sender/receiver nodes at the leaf level and the control nodes at the top level. The clients (leaf level nodes) are grouped based on network proximity and connected to the closest cluster leader control node. Senders and receivers forward data directly to the cluster leader nodes in the hierarchy. Data forwarded to a control node includes a session identifier that indicates the communication interest domain. Each cluster leader node has a message processor, which implements algorithms for forwarding and tree construction/modification. The controllers forward data to sibling nodes that participate in the same session, which forward it further to receiver nodes participating in the same session. The data distribution path is modified without controlling the state of group membership at other overlay nodes, using a stateless group communication protocol. State-based control requires signaling between control nodes, which can amount to a large overhead when the data path changes often. The stateless multicast protocol proposed in [10] has the advantage of reducing the signaling overhead at the expense of per-message processing overhead at each forwarding node. A dynamic change of the data path requires little overhead: encoding and decoding of application-level headers at all data forwarding control nodes. Group membership control requires that the sender (cluster leader) node has full knowledge of the receivers’ addresses and their network information. The cluster leader acquires this information when the node joins the overlay. Based on this information about the receiver space, controllers construct communication trees optimized to satisfy end-to-end delay and bandwidth constraints [10]. For each group of nodes with the same communication interest, the communication graph is constructed in a distributed manner by nodes in the control hierarchy, each node controlling the part of the communication graph containing child and sibling nodes only; control nodes optimize the construction of the communication path independently, using the network and communication preference data of overlay nodes.

2.2 Modeling communication interest
We propose here a new model of communication semantics in collaborative group applications, which consists of representing clients’ communication interest as a multi-dimensional space partitioned into partially overlapping communication interest cells (e.g. 
networked virtual environments use two-dimensional maps partitioned in rectangular cells of variable size). The communication interest descriptor is a multi-dimensional feature vector containing non-numerical attributes (e.g. object type) hashed into numerical values and coordinates mapping the virtual representation of clients’ interest. The client communication interest is represented as a point, a domain or a union of domains in the multidimensional communication interest attribute space:
- a communication interest point: $i = [i_0, \ldots, i_n]$ (2.1);
- a communication interest domain: $c = [i_1, i_2]$ (2.2), where $[i_1, i_2] = [i_{11}, i_{12}] \times [i_{21}, i_{22}] \times \ldots \times [i_{n1}, i_{n2}]$ is the notation for the Cartesian product;
- a union of multiple interest domains representing the communication interest of a single client: $mc = \bigcup_k [i_1^k, i_2^k]$ (2.3).
The attribute space contains static as well as dynamic objects. State updates generated for each modification of attributes of dynamic objects are disseminated to clients with a matching interest. Attribute changes are recorded in a global application state, which is managed independently of the communication interest space. In order to maintain a consistent state of the application, state updates (changes of object attributes) are distributed to all replicas controlled by the clients. Multiple objects may change state simultaneously, requiring dissemination of the state updates to non-overlapping groups of receivers. To measure the similarity between client’s interests, we define non-linear distance functions based on the above definitions of communication interest domains. The communication interest feature vector is a union of several domains scattered in the interest space; non-linear distance functions - such as the measure of overlap between multiple cells - are used to measure the similarity between clients’ communication interests. Following are several definitions of the distance function in the communication interest space. A distance function based on the degree of overlapping of multiple communication interest domains can be computed as follows:
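$$o(c_k, c_p) = \begin{cases} 1, & \text{if } \left\| \frac{i_1^k + i_2^k}{2},\ \frac{i_1^p + i_2^p}{2} \right\| < \left\| \frac{i_1^k - i_2^k}{2} \right\| \text{ or } \left\| \frac{i_1^p - i_2^p}{2} \right\| \\ 0, & \text{otherwise} \end{cases} \quad (2.4)$$

$$d(mc_x, mc_y) = \sum_{k}^{card(mc_x)} \sum_{p}^{card(mc_y)} o(c_k, c_p) \quad (2.5)$$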
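The overlap-based distance of (2.4) and (2.5) can be illustrated with a short Python sketch. It assumes axis-aligned rectangular domains and reads the condition in (2.4) as "the distance between the two cell centers is smaller than the half-diagonal of either cell"; that reading, the Euclidean norm, and the names `Domain`, `overlap` and `interest_overlap` are assumptions made for illustration, not definitions taken from the paper.

```python
import math
from dataclasses import dataclass
from typing import Sequence, Tuple


@dataclass(frozen=True)
class Domain:
    """Axis-aligned interest domain c = [i1, i2] as in (2.2) (hypothetical type)."""
    lo: Tuple[float, ...]  # corner i1
    hi: Tuple[float, ...]  # corner i2

    def center(self) -> Tuple[float, ...]:
        return tuple((a + b) / 2 for a, b in zip(self.lo, self.hi))

    def half_diagonal(self) -> float:
        # Norm of (i1 - i2) / 2, assuming the Euclidean norm.
        return math.sqrt(sum(((b - a) / 2) ** 2 for a, b in zip(self.lo, self.hi)))


def overlap(ck: Domain, cp: Domain) -> int:
    """Indicator o(ck, cp): 1 if the centers are closer than either half-diagonal."""
    dist = math.dist(ck.center(), cp.center())
    return 1 if dist < ck.half_diagonal() or dist < cp.half_diagonal() else 0


def interest_overlap(mcx: Sequence[Domain], mcy: Sequence[Domain]) -> int:
    """d(mcx, mcy) of (2.5): number of overlapping cell pairs of two clients."""
    return sum(overlap(ck, cp) for ck in mcx for cp in mcy)
```

A client interest is then simply a list of `Domain` values, and the same function applies to a cluster if the cluster is represented by the concatenated list of its members' domains.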
Using this definition, the distance between a client’s multiple-cell communication interest and a cluster is simply the overlap between the node’s communication interest and the union of the interest domains of all nodes in the cluster. The distance between two clusters is the overlap between the sets of cells representing the communication interest of each cluster. For discrete sets of communication interest topics, an efficient representation is the matrix of client communication interest. The matrix entry $r(i, j)$ represents in this case the interest of node $n_i$ in subject $t_j$. The partition domain membership is composed of all nodes with an interest in $t_j$:

$$m(c_i) = \{\forall N_j,\ mc(N_j) \cap c_i \neq \emptyset\} \quad (2.6)$$

The communication waste for grouping two partition domains $c_i$ and $c_j$ is:

$$W_d(c_i, c_j) = \sum_{m=0}^{card(c_i)} \sum_{n=0}^{card(c_j)} (1 - \delta(m - n)), \quad N_m \in c_i,\ N_n \in c_j \quad (2.7)$$

where $\delta(i)$ is the discrete (Kronecker) delta function. The increase in communication waste when adding partition domain $c_i$ to a cluster of partition domains $CL$ is:

$$W_d(c_i, CL) = \sum_{j=0}^{card(CL)} W_d(c_i, c_j), \quad c_j \in CL \quad (2.8)$$

The network delay sub-space uses round-trip time measurements to assign coordinates in a Euclidean space. Using network round-trip time measurements, a multidimensional network delay space is constructed as in [9]. The Euclidean distance in the network delay space is then used to approximate the distances between overlay network nodes:

$$d(n_1, n_2) = \left[ \sum_{k=0}^{p} (x_{n_1}^k - x_{n_2}^k)^2 \right]^{1/2} \quad (2.9)$$

where $n_i$ is the N-dimensional position vector representing the overlay node $N_i$ in the network delay Euclidean space. The distance from a node to a cluster of nodes is defined as the average distance to the nodes within the cluster. When nodes are represented by their network position vectors, this corresponds to the distance between the node and the center of the cluster; when network maps containing the delays between pairs of nodes are used, this distance is computed by simply averaging the delays to the nodes within the cluster. Another network delay metric used in data path optimization is the network path distance on a tree constructed using the nodes in the cluster. The distance to the cluster of nodes is evaluated on the topology constructed with the nodes in the cluster.

Another metric for clustering nodes with topology constraints is the maximum delay on the tree constructed with the cluster nodes:

$$m\_delay(c) = \max_k \left( \sum_{i=0}^{card(Tr(k)) - 2} d(n_i^k, n_{i+1}^k) \right), \quad Tr(k) = [n_0^k, \ldots, n_{card(Tr(k))-1}^k] \quad (2.10)$$

where $Tr(k)$ is the path of the $k$-th traversal of the tree $TC$. This metric is used to construct delay minimized multicast trees for group communication [10].
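As an illustration of the delay-space distance (2.9) and the maximum tree-path delay (2.10), the following Python sketch assumes that each overlay node already has a coordinate vector in the network delay space and that the tree is supplied as a list of root-to-leaf paths. The function names and the path-list representation are assumptions made for this example only.

```python
import math
from typing import List, Sequence

Coord = Sequence[float]  # position of an overlay node in the delay space


def delay_distance(x1: Coord, x2: Coord) -> float:
    """Euclidean distance between two nodes in the delay space, as in (2.9)."""
    return math.dist(x1, x2)


def path_delay(path: List[Coord]) -> float:
    """Sum of hop distances along one traversal Tr(k) of the tree."""
    return sum(delay_distance(a, b) for a, b in zip(path, path[1:]))


def max_tree_delay(paths: List[List[Coord]]) -> float:
    """m_delay(c) of (2.10): the largest delay over all traversals of the tree."""
    return max(path_delay(p) for p in paths)
```

Because the coordinates only approximate measured round-trip times, the values returned here are estimates of the delay, not measurements.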
3. Algorithms for communication interest clustering
We consider here two alternatives for clustering clients in a multi-dimensional communication interest and network attribute space:
I. Clustering clients’ communication interest.
II. Clustering communication interest partitions, which consists of partitioning the interest space followed by clustering the partitions according to the similarity of their node membership lists.
We describe in the following an algorithm for clustering receivers using a generalized distance function of communication interest and network QoS parameters. As described in the previous section, the nodes to be clustered have network attributes and communication interest attributes. The network attribute vector consists of the node out-degree and the network delay space position (the set of coordinates that describe the position of the overlay node in a Euclidean space approximating the network overlay). The network position attributes are used to approximate the distance between any two nodes in the overlay within an error bound that depends on the dimensionality of the space. The distance between a node and a cluster in the network attribute space is computed as the average shortest path distance between a node and the set of nodes in the cluster:

$$D_n(n, CL) = \sum_{k \in CL} \| n - n_k \| \quad (3.1)$$
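The network attribute vector and the node-to-cluster distance (3.1) can be sketched in Python as follows. The `OverlayNode` type and the function names are hypothetical. Dividing the sum by the cluster size is an assumption taken from the word "average" in the text, since (3.1) as printed shows only the sum; the choice of the closest cluster corresponds to a single assignment step of a proximity-based clustering pass.

```python
import math
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class OverlayNode:
    """Network attributes of a node (hypothetical type): out-degree and delay-space position."""
    node_id: str
    out_degree: int
    position: Tuple[float, ...]


def network_distance(node: OverlayNode, cluster: List[OverlayNode]) -> float:
    """Dn(n, CL): distance from a node to a cluster, normalized by cluster size (assumption)."""
    if not cluster:
        raise ValueError("cluster must contain at least one node")
    total = sum(math.dist(node.position, other.position) for other in cluster)
    return total / len(cluster)


def closest_cluster(node: OverlayNode, clusters: Dict[str, List[OverlayNode]]) -> str:
    """Name of the cluster with the smallest network distance to the node."""
    return min(clusters, key=lambda name: network_distance(node, clusters[name]))
```

Repeating `closest_cluster` over all nodes and updating the cluster membership after each pass gives a simple iterative clustering loop; a stopping rule would still have to be chosen.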
The k-Means algorithm [7] for clustering overlay nodes according to their proximity proceeds by iterating through the set of nodes and assigning each node to the closest cluster until the stopping criterion is met. The clustering algorithm selects the cluster $C_L$ which corresponds to

$$\min_L D(n, C_L) \quad (3.2)$$

It can be shown that for this distance function definition the iterative clustering algorithm also converges to a minimum of the average distance between nodes and the corresponding clusters:

$$\min \frac{1}{N} \sum_{k=0}^{N-1} D(n_k, C_L), \quad n_k \in C_L \quad (3.3)$$

The distance in the communication interest space is defined as the similarity between node communication interest and the aggregated communication interest of cells grouped in cluster $CL$. The measure of communication efficiency is the total wasted communication bandwidth (under the assumption that all nodes in the group transmit at the same rate):

$$w(n, C_L) = \sum_{n_i \in C_L} (1 - r(n, n_i)) \quad (3.4)$$

The preference-based grouping algorithm assigns a node to the cluster $C_L$ corresponding to

$$\min_L \left( w(n, C_L) \right) \quad (3.5)$$

updating the cluster membership after each iteration; it can be shown that the iterative algorithm converges to a solution that minimizes the overall waste:

$$\min \sum_{k=0}^{N-1} w(n_k, C_L), \quad n_k \in C_L \quad (3.6)$$

The average node degree of partition domain member nodes is defined as:

$$F(c_i) = \frac{1}{card(c_i)} \sum_{n_j \in c_i} f(n_j) \quad (3.7)$$

The distance between a partition domain and a cluster of partition domains is then:

$$D_f(n, CL) = \exp\left( \beta \left( - \left| F(n) - F(CL) \right| \right) \right) \quad (3.8)$$

This distance function favors the addition of partitions with high degree per node to clusters with degree deficit per node. The exponential function stabilizes the clustering algorithm, as small variations in node degree do not change cluster membership. When clustering partitions, the distances are modified as follows:
1) the distance in the network attribute space is the sum of the distances from all members of the partition to the nodes in the cluster:

$$D_n(c_i, CL) = \frac{1}{card(c_i) \cdot card(CL)} \sum_{n_j \in c_i} \sum_{k \in CL} \| n_j - n_k \| \quad (3.9)$$

2) the communication interest distance between a partition domain and a cluster of partition domains is $W_d(c_i, CL)$, using (2.8); and
3) the node degree distance is defined as:

$$D_f(c_i, CL) = \exp\left( \beta \left( - \left| F(c_i) - F(CL) \right| \right) \right) \quad (3.10)$$

The generalized distance function is defined as follows:

$$MD(n, CL) = w_1 \cdot D_f(n, CL) + w_2 \cdot w(n, CL) + w_3 \cdot D_n(n, CL) \quad (3.11)$$

for clustering node interest. When clustering partitions, the partition distance functions (3.9, 2.8, 3.10) are substituted in (3.11). Using this weighted distance function, the k-Means algorithm clusters the overlay nodes so as to jointly minimize the average network distance, the wasted communication bandwidth and the average difference between cluster degrees. A variant of this clustering method is detailed in [4].

4. Scalable control of large scale group communication
A data dissemination solution needs to scale with the number of sessions, since the same communication infrastructure simultaneously supports multiple sessions, and with the number of clients, since sessions may involve thousands of participants simultaneously. Minimizing the end-to-end delay between participants reduces the communication capacity and the communication control needs in comparison with a solution based on a fixed partitioning of the virtual space with dedicated control nodes managing each partition. We propose here algorithms for dynamic management of communication interest at cluster leader (control) nodes. Each control node has a message processor, which implements algorithms for forwarding and data distribution tree processing. Messages received at the cluster leader nodes are matched to a communication interest cell, which indexes all siblings participating in the same interest group. Cluster leader nodes forward messages to siblings with similar communication interest.
Dynamic changes in user communication interest trigger membership changes at control nodes. Since communication interest information is aggregated at control nodes, messages are propagated only in the control hierarchy, reducing the communication capacity required for control messaging. Furthermore, the distribution of control messages is restricted to control nodes involved in the same group and to higher levels of the control hierarchy. The hierarchical communication interest aggregation filters the control messages required for dynamic changes in receiver interest. To change group membership from cell 1 to cell 2, the peer node identified by nodeID and clustered at the parent parentID performs a leave of cell 1 and a join of cell 2. The algorithms for dynamic join and leave are presented in Figures 1 and 2. The join algorithm finds the closest control node that manages the required communication interest group; this node returns the list of all current members; subsequently the proximal control node inserts the new group and the corresponding membership list and notifies all interested controllers of the addition of a new member. The algorithm has three steps: I. control message propagation, II. index processing at the proximal control node and III. index processing at control nodes participating in the same group.

I. Message propagation:
1. Insert nodeID in the cell 2 entry at the parentID node; trailID = parentID
2. If cell 2 list == empty
   a. Propagate up:
   b. Send (trailID, parentID, cell 2) control message to the parent node;
   c. Insert the trailID in parent’s cell 2 list
   d. If cell 2 list has more elements goto f.
   e. If current node is not root: substitute the trailID with the ID of current node and repeat from a.
   f. Propagate down:
   g. Propagate the (parentID, cell 2) to a node selected at random in the cell 2 entry of the current node;
   h. If control leaf level: insert parentID in the cell 2 entry; reply to the nodeID’s parent with the list of nodes in cell 2 entry; stop
   i. Else repeat from f
II. Index processing at originator control node: insert the control node IDs from the reply list in the cell 2 entry; notify all nodes in this list of the addition of parentID
III. Index processing at control nodes in cell 2 list: insert parentID
Figure 1: Distributed node join algorithm
I. Message propagation:
1. Delete nodeID from cell 1 entry at parentID node; trailID = parentID;
2. Notify control nodes in the cell 1 entry of parentID leave;
3. Query to retrieve the index of cell 1
4. If leaf_node entry list not empty stop;
5. Propagate leave(trailID, cell 1) (substituting trailID with current node ID) message to the parent node; repeat from 3.
II. Session tree processing: at each control node receiving a leave(parentID, cell 1) message: remove the parentID entry from the cell 1 entry in the indexing structure.
Figure 2: Distributed node leave algorithm

The distributed algorithm executed when nodeID leaves cell 1 is presented in Figure 2. The leave message is propagated in the control hierarchy only when there are no other cell 1 group members registered at nodeID’s proxy. The leave message is sent to all controllers that are members of cell 1 and is forwarded to control nodes higher in the hierarchy as long as the cell 1 list does not contain additional entries. Nodes receiving the leave message delete the ID specified in the leave message from their cell 1 membership list.
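The filtering effect of interest aggregation in the join and leave procedures can be illustrated with a deliberately simplified Python sketch. It assumes a two-level hierarchy in which the root directly returns the list of cluster leaders registered for a cell, instead of the random downward propagation of Figure 1; the classes `Root` and `Leader`, the shared `registry` dictionary standing in for the network, and the message counts are assumptions made for illustration only.

```python
from collections import defaultdict
from typing import Dict


class Root:
    """Root control node: aggregates, per cell, the leaders that have members."""
    def __init__(self) -> None:
        self.cell_leaders = defaultdict(set)  # cell -> set of leader ids


class Leader:
    """Cluster leader (control node) holding the per-cell index for its cluster."""
    def __init__(self, leader_id: str, root: Root, registry: Dict[str, "Leader"]) -> None:
        self.id = leader_id
        self.root = root
        self.registry = registry          # leader id -> Leader, stands in for the network
        self.local = defaultdict(set)     # cell -> client ids registered at this leader
        self.peers = defaultdict(set)     # cell -> other leaders participating in the cell
        registry[leader_id] = self

    def join(self, node_id: str, cell: str) -> int:
        """Register a client; return the control messages sent beyond the client's request."""
        first_member = not self.local[cell]
        self.local[cell].add(node_id)
        if not first_member:
            return 0                      # interest already aggregated: nothing propagates
        messages = 1                      # registration with the root
        others = set(self.root.cell_leaders[cell])
        self.root.cell_leaders[cell].add(self.id)
        self.peers[cell] = others
        for peer_id in others:            # notify leaders already in the cell
            self.registry[peer_id].peers[cell].add(self.id)
            messages += 1
        return messages

    def leave(self, node_id: str, cell: str) -> int:
        """Deregister a client; propagate only if no local member of the cell remains."""
        if node_id not in self.local[cell]:
            return 0
        self.local[cell].discard(node_id)
        if self.local[cell]:
            return 0
        messages = 1                      # deregistration at the root
        for peer_id in self.peers[cell]:
            self.registry[peer_id].peers[cell].discard(self.id)
            messages += 1
        self.peers[cell].clear()
        self.root.cell_leaders[cell].discard(self.id)
        return messages
```

In this sketch a join or leave generates control traffic only when it changes whether the cluster as a whole participates in the cell, which is the aggregation property the algorithms above rely on.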
5. Performance evaluation of distributed control algorithms
To compare the reduction in signaling among several distributed data structures, let us consider an overlay of $N$ nodes organized in clusters of $m$, $1 < m < N$, nodes per control node. The group membership is uniformly distributed among $k$, $1 < k < N$, multicast groups. Let $P$ be the probability that at least one out of the $m$ nodes of cluster $C$ participates in the multicast group $T$. Assuming that the participation of any two nodes $n_1$ and $n_2$ in group $T$ is independent, this probability can be computed as $P = 1 - (1 - 1/k)^m$. A node changing its group membership sends a single message, to its cluster control node, with probability $(1 - P)$; with probability $P$ the cluster leader propagates the control message to other control nodes in the hierarchy.

The number of messages propagated in the control architecture depends in this case on the number of control nodes that have receivers interested in group $T$ and on the structure of the control hierarchy. We compare here two hierarchical control solutions: control flooding, where control nodes broadcast the control messages, and control hierarchy, where control messages propagate in a control tree hierarchy as described in the previous section. The simplest such control structure is the two-level control hierarchy, where a root node aggregates the group membership of all cluster leader control nodes. Let $f(p)$ be the number of control messages needed to reach $p$ control nodes. In the general case of an arbitrary hierarchical control structure, $f(p)$ depends on the fan-out of the control nodes in the hierarchy and on the probability $P$. For a two-level control hierarchy, $f(p) = p + 1$ (5.1). The probability that the control message propagates to $p$ out of the remaining $N/m - 1$ control nodes of the base level control hierarchy is

$$\binom{N/m - 1}{p} P^{N/m - p - 1} (1 - P)^p \quad (5.2)$$

The average number of control messages for the two-level hierarchy is then:

$$m_c = 1 \cdot (1 - P) + P \cdot \sum_{p=0}^{N/m - 1} (f(p) + 1) \binom{N/m - 1}{p} P^{N/m - p - 1} (1 - P)^p \quad (5.3)$$

The average number of messages for control flooding is:

$$m_f = 1 \cdot (1 - P) + P \cdot \frac{N}{m} \quad (5.4)$$

For a flat structure, where a single control node at the root sends control messages to all other group members, the average number of control messages is $m_p = N/k$. The control signaling ratio, that is, the ratio between the average numbers of control messages, is

$$m_f / m_p = \left( 1 \cdot (1 - P) + P \cdot \frac{N}{m} \right) \frac{k}{N} \quad (5.5)$$

when comparing control flooding with a flat control structure. The ratio increases as

$$\approx c \cdot (1 - 1/k)^{N/c} \quad (5.6)$$

with the number of clusters $c = N/m$, and as

$$\approx k \cdot \left[ \mathit{ct} + (1 - 1/k)^{N/m} \right] \quad (5.7)$$

with the number of multicast groups. The control signaling $m_c$ for the two-level hierarchy decreases to one message when the number of groups and the number of clusters are large. The largest reduction in control signaling versus control flooding is obtained when the number of clusters is large. The average number of messages propagated in the control hierarchy increases linearly with the number of clusters (a centralized architecture requires only one message per group operation); however, the number of messages per node increases linearly with the number of nodes per cluster, resulting in overload for centralized architectures. The control hierarchy structure reduces the control signaling exponentially, compared to control flooding, when the number of clusters is large. Therefore the two-level control hierarchy with interest aggregation is the most efficient from a signaling perspective for large-scale dynamic group applications.
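The signaling estimates of this section can be evaluated numerically with a small Python sketch. It transcribes $P$, (5.1), (5.3), (5.4), $m_p$ and (5.5) as printed, and additionally assumes that $m$ divides $N$ so that the number of clusters $N/m$ is an integer; the function names are hypothetical.

```python
from math import comb


def participation_probability(k: int, m: int) -> float:
    """P = 1 - (1 - 1/k)^m: at least one of the m cluster nodes is in group T."""
    return 1 - (1 - 1 / k) ** m


def two_level_messages(N: int, m: int, k: int) -> float:
    """m_c of (5.3), with f(p) = p + 1 from (5.1); assumes m divides N."""
    P = participation_probability(k, m)
    n = N // m - 1  # remaining control nodes at the base level
    expected = 0.0
    for p in range(n + 1):
        f_p = p + 1
        # Weight of (5.2), with the exponents as printed in the text.
        weight = comb(n, p) * P ** (n - p) * (1 - P) ** p
        expected += (f_p + 1) * weight
    return (1 - P) + P * expected


def flooding_messages(N: int, m: int, k: int) -> float:
    """m_f of (5.4): the leader floods all N/m control nodes with probability P."""
    P = participation_probability(k, m)
    return (1 - P) + P * N / m


def flat_messages(N: int, k: int) -> float:
    """m_p = N/k for the flat structure."""
    return N / k


def flooding_to_flat_ratio(N: int, m: int, k: int) -> float:
    """Control signaling ratio m_f / m_p of (5.5)."""
    return flooding_messages(N, m, k) * k / N
```

Sweeping `m` for a fixed `N` and `k` with these functions is one way to compare the flooding and hierarchical schemes numerically under the stated independence assumption.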
6. Discussion
Recent work on improving the scalability of p2p systems has resulted in efficient, decentralized control architectures for collaboration and distributed file look-up [5, 11]. The query language used in these distributed applications is comparable to the communication semantic space model proposed here. However, since p2p systems are optimized for distributed look-up, data path issues are not addressed. The look-up in p2p systems uses different heuristics to optimize query forwarding and does not exploit several features of collaborative systems, such as clustering of the communication interest space for efficient message distribution. The content-based publish/subscribe paradigm [1] also proposes a flexible method for indexing receiver interest. A publish/subscribe system is composed of a matching engine and a data distribution architecture. The matching engine selects for each published data packet the list of receivers, based on the subscription filters, which aggregate receiver interest. Various data structures have been proposed to perform efficient filtering/matching. The data distribution architecture is composed of nodes that forward data packets to lists of recipients indicated by the matching engine. While designed for efficient filtering at forwarding nodes, the publish/subscribe architectures do not optimize the data path according to the end-to-end performance requirements imposed by distributed interactive applications. Also, when the number of publishers is very large and the subscription process is dynamic, the control signaling between filtering/matching nodes becomes a bottleneck. In contrast, a solution based on DHTs [12, 13] provides scalability but is less flexible in supporting multi-attribute communication semantics. Multiple attributes can be mapped using disjoint DHTs, which do not scale well with the size of the attribute space. In addition, the end-to-end delay of propagating the queries in such a structure is increased (the query has to be propagated along each dimension, and the results aggregated). The overhead of DHT management is also large due to the optimizations introduced to reduce the search time to logarithmic complexity. Each addition of a new DHT node involves the migration of indices from the new predecessor and the update of all finger tables of nodes that point to the added node. In contrast, the addition of a new node in the hierarchical structure described in this paper involves only updates at a limited set of nodes in the control hierarchy.
7. Conclusion
We propose in this paper an abstraction of communication semantics for large-scale dynamic applications through hierarchical mapping of overlay nodes in a multidimensional communication space. The communication interest semantics are modeled as a multi-dimensional space partitioned into interest cells mapped to communication interest groups. We propose several algorithms for clustering the communication interest space and for controlling data distribution. Multiple large-scale interactive sessions can be supported on the same distributed information dissemination architecture by the proposed hierarchical indexing of communication semantics. We contrast this approach with a solution consisting of partitioning the communication space on a set of control and forwarding nodes. We show that the proposed solution achieves scalability with low control signaling overhead.
8. References
[1] G. Banavar, T.D. Chandra, B. Mukerjee, J. Nagarajarao, R.E. Strom, and D.C. Sturman, “An efficient multicast protocol for content-based publish-subscribe systems”, In Proceedings of ICDCS 1999, Austin, Texas, May 1999, pp. 262-272.
[2] S. Banerjee, C. Kommareddy, K. Kar, B. Bhattacharjee, S. Khuller, “Construction of an Efficient Overlay Multicast Infrastructure for Real-time Applications”, In Proceedings of IEEE Infocom 2003, April 2003.
[3] T. Chang, G. Popescu, C. Codella, “Scalable and Efficient Update Dissemination for Interactive Distributed Applications”, In Proceedings of ICDCS 2002, Vienna, 2002, pp. 143-151.
[4] T. Chang, J. Fan, M. Ahamad, G. Popescu, Z. Liu, “Preference-aware overlay topologies for group communication”, In Proceedings of Globecom 2005.
[5] Yang-Hua Chu, Sanjay G. Rao, S. Seshan, and Hui Zhang, “Enabling conferencing applications on the Internet using an overlay multicast architecture”, In Proceedings of SIGCOMM, San Diego, CA, August 2001.
[6] Y. Chawathe, S. McCanne, and E. Brewer, “RMX: Reliable Multicast for Heterogeneous Networks”, In Proceedings of IEEE Infocom ’00, Tel-Aviv, Israel, March 2000.
[7] A. K. Jain, R. Dubes, Algorithms for Clustering Data, Prentice Hall, 1988.
[8] J. Liebeherr, M. Nahas, W. Si, “Application Layer Multicasting with Delaunay Triangulation Overlays”, IEEE Journal on Selected Areas in Communications, Vol. 20, No. 8.
[9] E. Ng, H. Zhang, “Predicting Internet Network Distance with Coordinates-based Approaches”, In Proceedings of INFOCOM 2002.
[10] G. Popescu, Z. Liu, “Stateless group communication for dynamic group communication”, In Proceedings of DSRT 2004, pp. 20-28.
[11] S. Ratnasamy, P. Francis, M. Handley, R. Karp, and S. Shenker, “A scalable content addressable network”, In Proceedings of ACM SIGCOMM, 2001, pp. 161-172.
[12] I. Stoica, R. Morris, D. Karger, F. Kaashoek, and H. Balakrishnan, “Chord: A Scalable Peer-to-peer Lookup Service for Internet Applications”, In Proceedings of the ACM SIGCOMM 2001 Technical Conference, San Diego, CA, USA, August 2001.
[13] I. Stoica, D. Adkins, S. Zhuang, S. Shenker, S. Surana, “Internet Indirection Infrastructure”, In Proceedings of SIGCOMM 2002.