A Distributed Topology Control Algorithm for P2P Based Simulations
Behnoosh Hariri (1), Shervin Shirmohammadi (2), and Mohammad Reza Pakravan (1)
(1) Department of Electrical Engineering, Sharif University of Technology, Tehran, Iran
[hariri | pakravan]@ee.sharif.edu
(2) Distributed Collaborative Virtual Environment Research Laboratory, University of Ottawa, Ottawa, Canada
Abstract
Although collaborative distributed simulations and virtual environments (VE) have been an active area of research in the past few years, they have recently gained even more attention due to the emergence of online gaming, emergency simulation and planning systems, and disaster management applications. Such environments combine graphics, haptics, animations and networking to create interactive multimodal worlds that allow participants to collaborate in real time. Massively Multiplayer Online Gaming (MMOG), perhaps the most widely deployed practical application of distributed virtual environments, allows players to act together concurrently in a virtual world over the Internet. IP Multicasting would be an optimal solution for the dissemination of updates among participants, but it is not available to home users on the Internet for a number of technological, practical, and business reasons. In light of the lack of availability of IP Multicasting on the global Internet, researchers have recently tended to shift multicasting from the networking layer to the application layer. This approach, known as Application Layer Multicasting (ALM), effectively constructs an overlay network among the participants of the distributed simulation in which the end hosts themselves participate in the dissemination of update messages. In this paper, we propose a topology control architecture to support P2P based collaborative distributed simulations over the Internet by using ALM. We present our networking model and its rationale, theoretical proof, and simulation measurements in comparison with other methods as proof of concept.
1. Introduction
The popularity of distributed simulations can be attributed in part to advances in computer graphics, artificial intelligence, and the availability of high speed networks. Massively Multiuser Online Games (MMOG) are currently the most widely deployed type of such distributed collaborative environments. Thousands of users interact with each other in online games such as Sony’s Everquest, Valve’s Half-Life, and Blizzard’s World of Warcraft. According to general reports from game companies, between 6,000 and 10,000 users are playing online games at any given moment, which led to around US$1 billion in subscription revenue in 2004, a figure projected to more than double by 2009 [1]. One of the main challenges in such environments is synchronous communication and proper coordination among the parties, especially considering the strict end-to-end delay requirement of no more than 200 ms [2]. In these distributed simulations, the interactions among parties are highly reactive; therefore, the requirement for frequent updates with a moderate end-to-end delay imposes hard time constraints, especially on the Internet where Quality of Service is limited. Network lag therefore becomes a serious problem for such systems since it affects the performance of the distributed collaboration. When a user participates in a simulation, its interactions with other nodes must be sent to all the participants over the network so that all entities involved in the collaboration can update their states. Because of networking limitations and traffic conditions, some of these “updates” are lost or delayed. Much research has been conducted to overcome the networking limitations and provide better distributed simulators. Some of these studies provide receiver-initiated and selectively-reliable transport protocols [3] that can be used to deliver important messages with a high degree of reliability, while others use sender-initiated approaches to transmit key updates with guaranteed reliability [4]. The IEEE DIS standard [5] has also been successfully used in controlled environments with vast resources, mostly for military simulations.
These approaches are all based on IP multicasting and, although they achieve good results in Intranet environments, they are not readily deployable on the Internet. The lack of applicability of IP multicasting on the Internet, even for IPv6, has been well documented [6][7]. The reasons include scalability (IP Multicasting is designed for a hierarchical routing infrastructure and does not scale well to large numbers of concurrent groups), the deployment hurdles caused by manual configuration at routers and Internet Service Providers’ unwillingness to implement IP multicasting, marketing reasons due to the undefined billing at the source (content provider) and receivers, and many other obstacles. Due to the practical lack of multicasting infrastructure on the Internet, an alternative has been proposed that shifts multicast support from the networking layer to the end systems and the proxies. This is Application Layer Multicasting (ALM). In ALM, data packets are replicated at end-hosts instead of routers. The end-hosts form an overlay network, where the end-hosts (or proxies) themselves relay the data to one another. ALM does come with tradeoffs: it uses more bandwidth and incurs more delay than IP multicasting in exchange for scalability to more nodes. Nevertheless, it has been shown that ALM-based algorithms can have “acceptable” performance penalties with respect to IP Multicasting and other practical solutions [8]. In our previous work [9], we presented a distributed routing algorithm that can be used in distributed simulations with large numbers of users. In this paper, we propose a distributed topology control architecture to support collaborative distributed VEs over the Internet by using ALM. We present our networking model and its rationale, our performance model, theoretical proof, and simulation results as proof of concept.
2. Proposed Approach
Topology control aims at providing the VEs with an optimum overlay network (the ALM network on top of the existing Internet) over which users may exchange their update messages. The optimality lies in both getting the most capacity and the best QoS from the underlying network. Therefore the resulting overlay topology should consider the routing efficiency as well as the robustness required to offer a level of QoS required for real-time data transfer. The goal of topology control may be considered as the derivation of a minimum cost subgraph over the full-mesh connected underlying network. In a full-mesh network where each node has the possibility of being directly connected to the other peers, we aim at reducing the node degrees. This implies reducing the number of nodes’ neighbors and choosing a group of neighbors among all those nodes
that can be potential neighbors of a specific node. Decreasing the node degrees and the routing redundancy results in a network that is more susceptible to node and link failures, which are very likely in a VE where nodes are volatile and users come and go as they please. Therefore, we require k-vertex connectivity in the overlay graph to obtain a (k-1) fault-tolerant network, in which message exchange paths are not interrupted when up to k-1 nodes fail or leave the network, or up to k-1 connections face a problem. At the same time, we wish to achieve a reasonable delay for data transmission to meet the real-time data transfer requirement of the VE. To this end, we propose a way to minimize a cost function defined as the sum of the products of path delay and traffic rate, as given in (1):
$$C = \sum_{i,j} \sum_{m,n} (X_{ijmn} \, d_{mn}) \, P_{ij} \qquad (1)$$
where Xijmn=1 indicates that the link between m and n lies on the path from source node i to destination node j, Pij is a value representing the traffic demand on the path between i and j, and dmn is the delay of the link between m and n. Our problem therefore reduces to proposing a (k-1) fault-tolerant distributed topology control algorithm that minimizes the cost function described in (1). To reach this goal, we first propose a centralized algorithm and afterwards extend it to a local (and therefore distributed) one. Let G=(V,E) denote the initial arrangement of the virtual environment, where V={v1,v2,…,vN} denotes the finite set of nodes and E denotes the set of communication links. E includes all one-to-one connections, since the underlying network is assumed to be the Internet. We can now consider the output of a topology control protocol P as a spanning subgraph GP=(V,EP) of G, where GP is required to be k-connected. This implies that for any two vertices v1 and v2 in V(G), there exist k disjoint paths from v1 to v2, and the removal of up to k-1 nodes or links does not partition the overlay network.
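As an illustration of how the cost function in (1) can be evaluated, the following minimal Python sketch sums, for every source-destination pair, the traffic demand multiplied by the delay accumulated along that pair's path. The function name `total_cost` and the dictionary-based input format are assumptions made for this example only; they do not come from the paper. Listing the links of each path explicitly plays the role of the indicator Xijmn.

```python
def total_cost(paths, delay, demand):
    """Evaluate C = sum_{i,j} P_ij * sum_{(m,n) on path i->j} d_mn.

    paths  : dict mapping (i, j) -> list of nodes on the path from i to j
             (hypothetical format; the links on the list are those with X_ijmn = 1)
    delay  : dict mapping (m, n) -> link delay d_mn, stored once per undirected link
    demand : dict mapping (i, j) -> traffic demand P_ij
    """
    total = 0.0
    for (i, j), route in paths.items():
        path_delay = 0.0
        # Walk consecutive node pairs (m, n) along the route.
        for m, n in zip(route, route[1:]):
            # Links are undirected, so accept either orientation of the key.
            path_delay += delay[(m, n)] if (m, n) in delay else delay[(n, m)]
        # Pairs without a recorded demand contribute nothing.
        total += demand.get((i, j), 0.0) * path_delay
    return total


if __name__ == "__main__":
    # Toy overlay: 0 -- 1 -- 2, delays in milliseconds (made-up values).
    delay = {(0, 1): 10.0, (1, 2): 20.0}
    paths = {(0, 1): [0, 1], (0, 2): [0, 1, 2]}
    demand = {(0, 1): 1.0, (0, 2): 2.0}
    print(total_cost(paths, delay, demand))
```

In this toy case the pair (0, 2) is routed over two links, so its demand is weighted by the sum of both link delays, which is exactly the inner sum over m, n in (1).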
2.1. Global Fault-Tolerant Overlay Network (GFTON)
Here we describe the above-mentioned centralized algorithm for overlay network generation. This algorithm is partly based on Kruskal’s algorithm [10] for minimum spanning tree generation. The idea of adding safe edges to the graph is retained, but the algorithm differs in several ways, including the change in concept from safe edge to safe path and the generation of a k-connected spanning subgraph instead of a minimum spanning tree. A safe edge is defined as an edge that minimizes the maximum of the path costs, where the cost of a path between nodes i and j is defined in (2):
$$C_{ij} = \sum_{m,n} (X_{ijmn} \, d_{mn}) \, P_{ij} \qquad (2)$$
where the parameters are defined as in equation (1). Note that when several paths exist between i and j due to the fault-tolerance requirement of the problem, we consider the main path in the above equation. The main path is defined to be the one with the smallest delay. However, if two main paths have equal delay and we cannot decide based on the main path cost, we base the decision on the reserve path with the next smallest delay, and this procedure may be continued for all other reserve paths. The algorithm is described in the following steps:
1. Create a forest Gp from graph G where each vertex in the graph is a separate tree.
2. Create a set S containing all the edges in the graph.
3. Select an edge e from S that minimizes the maximum of Cij defined in (2) over all values of i and j. Cij is the cost of the main path as described before, and alternate paths may be considered if two edges from set S are equal with regard to minimizing the maximum main path cost.
4. Remove the edge e from the set S and add it to the subgraph Gp.
5. Search the graph Gp and remove any edges that have become redundant for the k-connectedness of the subgraph in the presence of the newly added edge e.
6. Continue with step 3 if set S is non-empty and the subgraph Gp is not yet k-connected. Terminate the algorithm otherwise.
Proof of Optimality
Let G be a connected, weighted graph and let Gp be the subgraph of G produced by the algorithm. Furthermore, let Gp’ be an optimal k-connected spanning subgraph. If Gp = Gp’ then Gp is the optimal spanning subgraph. Otherwise, let e be the first edge considered by the algorithm that is in Gp but not in Gp’. The subgraph Gp’ ∪ {e} has at least one redundant edge, because Gp’ is already a k-connected spanning subgraph without e.
Therefore there exists another edge f in Gp’ which, at the stage of the algorithm where e was added to Gp, had not yet been considered. Otherwise e would have been redundant with respect to the k-connectedness condition. Then Gp” = (Gp’ ∪ {e}) \ {f} is also a k-connected spanning subgraph. Because the algorithm visits e before f, we have C(e) ≤ C(f), and therefore the cost of Gp” according to equation (2) is less than or equal to the cost of Gp’. If the costs are equal, we consider the next edge e which is in Gp but not in Gp’. If there is no such edge left, the cost of Gp is equal to the cost of Gp’ although they consist of different edge sets, and Gp is also an optimal spanning subgraph. In the case where the cost of Gp” is less than the cost of Gp’, we would conclude that Gp’ is not an optimal k-connected spanning subgraph, which contradicts the assumption. Therefore Gp is an optimal k-connected spanning subgraph (equal to Gp’, or with a different edge set but the same cost).
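The overall shape of the GFTON procedure can be sketched in Python as follows. This is a deliberately simplified illustration under stated assumptions, not the exact algorithm above: each edge carries a single precomputed cost (standing in for the main-path cost of (2)), edges are added in ascending cost order until the subgraph is k-connected, and redundant edges are pruned once at the end rather than after every insertion. The k-connectivity test is a brute-force check suitable only for small graphs. All function names and the input format are hypothetical.

```python
from itertools import combinations


def is_connected(nodes, edges):
    """Return True if the undirected graph induced on `nodes` is connected."""
    nodes = set(nodes)
    if not nodes:
        return True
    adj = {v: set() for v in nodes}
    for a, b in edges:
        if a in nodes and b in nodes:  # ignore edges touching removed nodes
            adj[a].add(b)
            adj[b].add(a)
    start = next(iter(nodes))
    seen, stack = {start}, [start]
    while stack:  # iterative depth-first search
        v = stack.pop()
        for w in adj[v] - seen:
            seen.add(w)
            stack.append(w)
    return seen == nodes


def is_k_connected(nodes, edges, k):
    """Brute force: more than k vertices, and still connected after
    deleting any set of fewer than k vertices."""
    nodes = set(nodes)
    if len(nodes) <= k:
        return False
    for r in range(k):
        for removed in combinations(nodes, r):
            if not is_connected(nodes - set(removed), edges):
                return False
    return True


def gfton_sketch(nodes, cost, k):
    """Simplified greedy construction of a k-connected spanning subgraph.

    cost : dict mapping an edge (u, v) with u < v to its cost
           (assumed to be precomputed; hypothetical input format).
    """
    chosen = []
    # Steps 3-4 (simplified): take the cheapest remaining edge.
    for e in sorted(cost, key=lambda e: (cost[e], e)):
        chosen.append(e)
        if is_k_connected(nodes, chosen, k):  # step 6: stop when k-connected
            break
    else:
        raise ValueError("the input graph is not k-connected")
    # Step 5 (simplified): drop edges, most expensive first, that are
    # not needed to keep the subgraph k-connected.
    for e in sorted(chosen, key=lambda e: cost[e], reverse=True):
        trial = [x for x in chosen if x != e]
        if is_k_connected(nodes, trial, k):
            chosen = trial
    return chosen


if __name__ == "__main__":
    nodes = range(5)
    # Full mesh with made-up costs |u - v|.
    cost = {(u, v): abs(u - v) for u, v in combinations(nodes, 2)}
    print(gfton_sketch(nodes, cost, k=2))
```

The brute-force check enumerates every vertex subset of size below k, so its running time grows quickly with the number of nodes; a max-flow based connectivity test would be the usual choice for larger overlays.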
2.2. Localized Fault-Tolerant Overlay Network (LFTON)
In the previous section we proposed a centralized algorithm that results in an overlay network minimizing the cost function described in (1). However, a centralized algorithm may not be of much practical use in VEs with a large number of participants, since a central entity (such as a server) becomes a bottleneck. A distributed algorithm is therefore more desirable for a better and more scalable architecture for virtual environments. The algorithm that we propose here is somewhat related to XTC [11], which is defined as a distributed algorithm for minimum spanning tree generation. However, there are significant differences: our approach uses a different cost function and produces a k-connected subgraph, a property that was partly discussed in [12] under the name K-XTC. Let us introduce a new concept known as request profiling, which lets the nodes obtain an approximation of the traffic matrix without actually getting it from a central server. When a node forwards a request toward a final destination, it monitors and profiles the destination ID of the request. Each node i is thus able to create a traffic vector for its own use that describes the traffic demand between itself and the other nodes. Together these vectors form a matrix R whose elements r(i,j) each describe the traffic demand between nodes i and j. The elements of this traffic matrix are not exactly the same as the Pij described before, because node i measures both forwarded and generated traffic, while matrix P describes the traffic demand. The approximation is nevertheless useful for the distributed algorithm. The nodes also try to measure the delay between themselves and their neighbors, for example by using ping requests, and each node computes a cost for each of its connections as shown in (3):
$$C_{ij} = r_{ij} \cdot d_{ij} \qquad (3)$$
where Cij is the cost that node i computes for its neighbor j. The algorithm can be summarized in the following steps:
1. Each node i measures a cost Cij for all its neighbors, where Cij is defined in (3).
2. The node i establishes order