Cluster Comput (2011) 14: 145–163 DOI 10.1007/s10586-010-0140-9
Design of distributed microcell-based MMOG hosting platforms: impact study of dynamic relocations Bruno Van Den Bossche · Bart De Vleeschauwer · Filip De Turck · Bart Dhoedt · Piet Demeester
Received: 9 March 2009 / Accepted: 24 August 2010 / Published online: 17 September 2010 © Springer Science+Business Media, LLC 2010
Abstract Networked Virtual Environments (NVEs), and Massively Multiplayer Online Games (MMOGs) in particular, offer huge digital environments characterized by tens of thousands of simultaneous users. To maintain a high Quality of Experience (QoE), these applications are typically hosted on dedicated server clusters and require custom management software which relies on knowledge of the inner workings of the application. We propose the concept of microcells in the design of an MMOG hosting platform capable of hosting contiguous virtual worlds. Microcells are small parts of the virtual world which can be relocated, which makes it possible to dynamically distribute the load over multiple servers and to use generic management software. In order to evaluate the impact of microcell relocations, a platform prototype has been designed and evaluated with three microcell specific load balancing algorithms. The obtained evaluation results are presented in this paper and the impact of the microcell relocations is characterized in detail.
Keywords Networked virtual environment · Massively multiplayer online games · Service hosting · Microcell · Load distribution algorithms · Dynamic relocations
B. Van Den Bossche · B. De Vleeschauwer · F. De Turck · B. Dhoedt · P. Demeester Department of Information Technology (INTEC), Ghent University, IBBT, Gaston Crommenlaan 8, Bus 201, 9050 Ghent, Belgium
1 Introduction
With the advance of technology and the availability of broadband Internet access, interactive multimedia applications are becoming increasingly popular. One class of these applications is online games, especially Massively Multiplayer Online Games (MMOGs). These applications offer their users a huge virtual world in which they can interact with tens of thousands of other players and build virtual identities. Since their appearance in the eighties, the customer base for MMOGs has grown from a few thousand to tens of millions of paying clients. For one instance of such an application, the level of concurrency has also increased from a few dozen to tens of thousands of simultaneously online clients. Examples of these applications include World of Warcraft [1] and Second Life [2], the former currently having over 11,500,000 paying users. An efficient platform is needed to manage the virtual worlds and to provide a continuous service to their players. One of the traditional approaches, as used by World of Warcraft, is to divide the virtual world into independent realms. By not allowing realms to interact, the load is divided over a number of independent server clusters that may even be geographically dispersed. The approach used by Second Life is to divide the virtual world into several cells. A cell is a part of the virtual world and each of these cells is managed by a single server. Players can move freely from one cell to another. Neither of these approaches is able to cope with a highly uneven or dynamically changing player distribution. When a lot of players are concentrated in one realm or cell, the responsible server or cluster gets overloaded. We propose to split the virtual world into several small pieces, called microcells [3–5]. These microcells can then be distributed at runtime over a number of servers in such a way that the objectives of the application are realized. An important advantage of the microcell approach is that it can be
used in service hosting platforms operated by a third party. This does imply that the MMOG developers make use of the microcell approach by adhering to a number of interfaces. Currently the typical MMOG infrastructure is a dedicated platform for a single game. This requires significant investments in hardware, software and maintenance for MMOG vendors. Current service hosting platforms focus on distributing parallel or independent service components to optimize the load distribution. The use of microcells enables an MMOG platform targeted toward MMOG developers and third party service hosting providers alike: a platform capable of hosting multiple independent MMOGs and of dynamically optimizing resource usage. It allows the developers to outsource and reduce the infrastructure costs and risks, while the hosting providers can make efficient use of the available resources and optimize costs through scale increase. A key requirement of an autonomic MMOG hosting platform is that it must be able to manage the virtual world in such a way that decreases in Quality of Experience (QoE) due to server overload are avoided or resolved as quickly as possible. We define a server as overloaded if an additional increase in load would result in undesirable behavior of the platform, for example an unacceptable delay experienced by the end users. The most important challenge in doing this is to react efficiently to sudden changes in player density in parts of the virtual world. These changes in the player distribution may occur suddenly, are unpredictable and need to be responded to in a timely fashion. The main contributions of this paper are: the design and implementation of a platform prototype and microcell relocation algorithms, and the experimental validation of the platform and the algorithms for performing microcell relocations and microcell load balancing. This includes a detailed characterization of the impact of microcell relocations.
This paper is complementary to the work reported in [3–5]. In these papers the focus was on the actual load balancing algorithms for an MMOG and the performance of those algorithms was validated using a simulation environment. The outline of this paper is as follows: First an overview of related work in the area of MMOGs, distributed architectures and autonomic load balancing is presented in Sect. 2. Next the use of microcells is discussed in Sect. 3 which gives an overview of the characteristics of this approach and the requirements it puts on the underlying architecture. This is followed by a description of the software architecture we propose for hosting microcell based applications in Sect. 4 and the associated microcell relocation algorithms in Sect. 5. In Sect. 6 the prototype implementation of the architecture is discussed. Three MMOG load balancing algorithms are proposed in Sect. 7, followed by the evaluation setup and the obtained results in Sects. 8 and 9 respectively. Finally, the conclusions are presented in Sect. 10.
2 Related work
A number of different approaches to optimize the deployment of NVEs and MMOGs have been developed in recent years. In this section we give an overview of the current state of the art. Bigworld [6] and EVE Online [7] attempt to solve load balancing problems by dynamically shifting processing resources to those regions of the virtual world which are most loaded. This allows a centralized data model which simplifies the application development but could potentially become a bottleneck as well. In the microcell approach the data is migrated to the server that hosts the microcell. The authors of [8] propose an architecture where players are dynamically assigned to the servers based on their location in the virtual world in an attempt to minimize the inter-server communications. This approach is best suited for applications which, unlike typical MMOGs, do not need to maintain a consistent game state. Therefore, the central aim of that approach differs from our platform. Another technique, which partitions the virtual world and allocates server resources using quadtrees, is proposed in [9]. In [10] another cell-based distribution technique for an MMOG platform is proposed. The focus of that research is on a load shedding algorithm to solve overload situations in the virtual world only when they occur. In previous work we already provided an analysis of the algorithm presented in [10] in the context of our microcell based platform; this can be found in [5]. The algorithms we present in this paper focus on continuous load balancing rather than on solving overload situations when they occur. In [11] the authors discuss a microcell based approach for peer-to-peer management of massively multiplayer online games. The problem that is studied is determining a set of microcells that alleviate an overloaded server.
One of the differences with the work presented there is that our algorithms are keeping a globally good state over the whole server set, without waiting for overload to occur. Additionally the focus in our paper is to present the full architecture for realizing a centrally managed MMOG platform whereas in [11] the results are based on simulations. Recently, with Project Darkstar [12] a middleware architecture targeted to large scale online games was introduced. This open source project focuses on a distributed, fault tolerant communication and event processing framework which can be used in the back end implementation of online games and serve as a Rapid Application Development Framework. In [13–15] communication architectures for MMOGs are presented. All leverage the interest region or aura of the players to filter out only the relevant messages, i.e. which have influence on their direct environment. The authors of [16] employ active networks to minimize the number of required communication channels. The architecture is self-adjusting and can be deployed on both local and
wide area networks. Likewise, the authors of [17] propose the use of Steiner trees to enable application level multicast. An in-depth overview of the current state of the art of network and communication level optimizations for multiplayer computer games can be found in [18]. Such techniques could be considered complementary to the platform that we present here and could be used to handle the server-client communication in the microcell system and to manage the network bandwidth. To simplify the problem of resource assignments one could use virtualization of hardware resources to dynamically assign extra resources to an application as suggested by [19, 20]. However, that approach still requires support from the application for deployment on a cluster, be it virtual or not. Distributing the load over cluster nodes with different processing capabilities is described by [21], which proposes a solution for distributed systems with heterogeneous resources and different classes of processing tasks. The focus of that work is to optimize and speed up the relocation of tasks by reassigning batches of tasks to optimize the resource negotiations. To determine the most appropriate load thresholds to initiate the distribution of tasks or application components, the authors of [22] propose a solution which takes into account an estimation of the actual load and the analysis of response times. Traditional load balancing techniques such as round robin or redirecting requests to the least loaded server have been applied to middleware platforms [23]. Similarly, the migration and mirroring of application components or complete applications to available processing resources has been included in middleware as well [24, 25]. Comparable solutions are applied to online games to create mirrored game servers and as such accommodate more parallel independent game sessions [26] or to provide better latencies to geographically dispersed players [27, 28].
The microcell approach is not strictly limited to MMOGs and could be used for all applications which rely on complex and large virtual worlds such as Virtual Reality Systems used for complex simulations with a large number of participants [29]. The advantage of using the microcell based algorithms is that they describe the logic of the application and use this to make an intelligent decision.
3 Microcell concept
The use of microcells consists of dividing the virtual world into a large number of smaller parts which can be assigned to a number of servers. This extends the more traditional approach of dividing the world into basic cells of equal size, as shown in Fig. 1 and as applied by Second Life [2]. These cells communicate with their neighbors and the players populating them and manage all objects located within
Fig. 1 Distribution of the virtual world into multiple cells in order to distribute the load. Each server is statically assigned an equally sized part of the world
Fig. 2 Distribution of the virtual world into a large number of smaller microcells which can be reassigned to the available servers in order to distribute the load
their boundaries. An important requirement of the cell based architecture is that the cell concept is completely transparent. Players do not need to know in which cell they currently reside or whether they are crossing a border. This implies that players are able to see across borders and even interact with players in another cell. Dividing the world into microcells allows a much more balanced load distribution among the available servers. Figure 2 shows a microcell assignment where servers responsible for highly populated microcells manage fewer microcells than servers managing lightly populated microcells. While using a microcell based approach offers advantages for load balancing, a certain processing overhead is introduced as well. Smaller microcells require more player migrations across microcells and thus result in a higher cost. The results of actions performed by players need to be forwarded to neighboring microcells. Whether a result needs to be forwarded depends on the viewing distance in the virtual world. Both these parameters determine a significant part of the overhead introduced by the microcell approach. Tightly coupled with the microcell size is the movement speed of the players in the virtual world. A detailed analysis of the optimal microcell size for a given player speed is performed in [4] and is not repeated in this article as the focus here lies on the hosting platform. Although the focus of the microcell approach lies in the fine grained load balancing, another advantage is that it allows the data of an NVE to be distributed. Each microcell is responsible for the data contained in its own part of the virtual
world and the data related to the players contained in the microcell. As such, there is no need for a complex, centralized, high end and high cost data store. However, extra care is required to make sure the data remains consistent at all times, especially when migrating a microcell from one server to another. When using the microcell approach a number of requirements need to be fulfilled by the platform and its users. We distinguish three types of users related to the platform: the end users or players, the developers and the hosting provider. A major requirement imposed on the platform by the end users is that the delay during the gameplay is minimized to provide an optimal gaming experience. Furthermore, the end user should be completely unaware and independent of the underlying platform architecture. To be able to balance the load, parts of the virtual world, i.e. the microcells, need to be relocated from one processing server to another. Relocating microcells needs to be performed at runtime and is required to be completely transparent to the end users. To enable this capability the platform needs to define interfaces which the microcells need to implement. A microcell needs to be able to serialize itself into a format suitable for transmission over the network and deserialize itself to restore the original microcell. An MMOG developer is required to use the microcell paradigm and each microcell must implement the interfaces to support microcell migration. This implies that the communication between microcells must occur using the communication facilities of the platform, as a microcell itself is unaware of the actual deployment location of the other microcells. Likewise, the platform is required to implement the necessary communication facilities as well as the required logic to relocate microcells without the loss of information or messages arriving during the relocation process.
The relocation algorithms and the logic for gathering the data used in the decision making are to be provided by the MMOG platform as well. However, allowing the microcells to provide additional statistical data about the generated load can improve the efficiency of the relocation algorithms. Allowing microcells to report statistics themselves does carry certain risks as well. In a scenario where the hosting platform and infrastructure and the actual game implementation are provided by different vendors, both vendors must trust each other's application components, as reporting incorrect statistics could disturb the correct working of the platform. This is especially the case in a shared hosting environment.
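The interface requirements of this section can be illustrated with a minimal sketch. It assumes plain Java object serialization as the wire format; the names `RelocatableMicroCell`, `SimpleMicroCell` and `MicroCellCodec` are hypothetical and are not taken from the prototype, and `playerCount()` merely stands in for an optional self-reported load statistic.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical contract a hosting platform could impose on every microcell.
interface RelocatableMicroCell extends Serializable {
    String id();
    int playerCount(); // optional, self-reported load statistic
}

// Toy microcell whose only state is the list of players it contains.
class SimpleMicroCell implements RelocatableMicroCell {
    private static final long serialVersionUID = 1L;
    private final String id;
    private final List<String> players = new ArrayList<>();

    SimpleMicroCell(String id) { this.id = id; }
    void addPlayer(String name) { players.add(name); }
    public String id() { return id; }
    public int playerCount() { return players.size(); }
}

// Turns a microcell into bytes for transmission and back again.
final class MicroCellCodec {
    private MicroCellCodec() {}

    static byte[] serialize(RelocatableMicroCell cell) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(cell);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    static RelocatableMicroCell deserialize(byte[] data) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return (RelocatableMicroCell) in.readObject();
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("cannot restore microcell", e);
        }
    }
}
```

In such a design the platform only ever handles the byte array, so it needs no knowledge of the game-specific state inside the microcell.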
4 Platform description
This section introduces the assumptions made when creating the software architecture for the MMOG platform and gives a high level overview of the individual components.
4.1 Infrastructure assumptions
For the remainder of the paper, we assume the target deployment environment to be a cluster-like environment consisting of processing nodes, or a group of clusters connected through high speed network links with Quality of Service guarantees. In such an environment it is feasible to have nodes communicate with a minimal delay or processing overhead and through high bandwidth links. Additionally, we assume that all players communicate directly with the processing servers hosting the virtual world. This is however undesirable in a production environment, where an additional abstraction layer is required. For example, transparency to the end users about the platform can be achieved by introducing intermediate proxy nodes through which all client traffic is directed. These proxy nodes then redirect the client traffic toward the actual processing node containing the microcell where the player is located. This way, the network connection redirections due to player or microcell migrations remain hidden within the platform and transparency to the end users is guaranteed. These proxies can be located close to the processing cluster, or closer to the clients to improve performance and reduce the network delay by using a high speed reserved bandwidth network link between the proxy and the remote processing cluster.
4.2 Platform architecture overview
An important goal of the platform is to separate the MMOG application logic, the underlying hosting platform and the load balancing component. To obtain this separation of business logic a component based approach is chosen. The platform management components take care of all the game agnostic tasks such as the internal communication and execute any microcell relocations initiated by the load balancing component. The third type of components is related to the actual game logic. An initial version of these components was introduced in [30].
Figure 3 gives an overview of the infrastructure and the platform components. The individual microcells, containing the game data, are distributed across the processing nodes and each individual microcell is deployed only once in the platform. The MicroCellController and ActorController, which contain the platform management logic, are deployed on every processing node and interact with the microcells and players respectively. Finally, the global load balancing component, called the MicroCellManager, initiates and executes the actual load balancing algorithms. This component resides on a separate node and initiates any microcell relocations.
4.3 Platform component description
A description of the functionalities of the components is given below:
Fig. 3 Global overview of the platform architecture and the communication. The application components are hosted on a server cluster in a datacenter. The clients connect to the game servers over the Internet. Groups of microcells are deployed on the available servers and are managed by per server components (MicroCellController) and one centralized component (MicroCellManager). All communication between the clients and the MMOG platform occurs between the client software and the ActorController
MicroCell is the component representing a small part of the virtual world which can be migrated. It contains all the data associated with a microcell, such as the items that can be found in that part of the world, the environment conditions and all players and their respective information, such as their current position, the items they carry etc. Including all player information in the microcell results in an automatic distribution of all data together with the load distribution. On the downside, this does require higher amounts of data to be transferred upon microcell relocations. In the remainder of the text we use the term MicroCell (with capital M and C) for the application component and microcell (no capitals) for the logical unit of the virtual world.
MicroCellController is a per server component, i.e. on each server in the cluster, one instance of the MicroCellController is active. This key component in the architecture is responsible for the management of the MicroCell components on the server and acts as the central access point for all actions MicroCells participate in. The relocation of a MicroCell is, once initiated, under the total control of the MicroCellController. The MicroCellController contacts the destination server, relocates the data and takes all the necessary steps to arrange the redirection of message flows and clients. This requires that all communication amongst MicroCells and between MicroCells and clients passes through the MicroCellController.
ActorController is a per server component as well. This component offers an interface to the clients and allows them to send events, such as movements, actions or interactions with the environment or other players. The ActorController then forwards all these messages to the MicroCellController which relays them to the appropriate MicroCells. Likewise, the results of all events occurring in the virtual world are relayed back to the players through the ActorController.
MicroCellManager is the global manager of the virtual world and is responsible for initiating the relocation of the MicroCells. It monitors the application components and the generated system load, and calculates improvements to the
MicroCell distributions if necessary. This component is not strictly necessary for the MMOG to function properly. Removing this component disables the redeployment of the MicroCell components, resulting in a situation similar to current MMOGs with a static deployment.
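The division of responsibilities between the per server components can be sketched as follows. This is an illustrative, single-process simplification and not the prototype code: messages are plain strings, the network is replaced by direct method calls, and the method names `deploy`, `place`, `dispatch` and `onClientEvent` are assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Logical unit of the world: processes an action and returns its effect.
class MicroCell {
    final String id;
    MicroCell(String id) { this.id = id; }

    String process(String player, String action) {
        return player + " " + action + " in " + id;
    }
}

// Per server component: the only access point to the MicroCells it hosts.
class MicroCellController {
    private final Map<String, MicroCell> cells = new HashMap<>();
    private final Map<String, String> playerToCell = new HashMap<>();

    void deploy(MicroCell cell) { cells.put(cell.id, cell); }
    void place(String player, String cellId) { playerToCell.put(player, cellId); }

    // Relays a client event to the MicroCell the player currently resides in.
    String dispatch(String player, String action) {
        String cellId = playerToCell.get(player);
        MicroCell cell = cellId == null ? null : cells.get(cellId);
        if (cell == null) {
            throw new IllegalStateException("player not hosted here: " + player);
        }
        return cell.process(player, action);
    }
}

// Per server component facing the clients.
class ActorController {
    private final MicroCellController controller;
    final List<String> sentToClients = new ArrayList<>(); // stand-in for network sends

    ActorController(MicroCellController controller) { this.controller = controller; }

    void onClientEvent(String player, String action) {
        String effect = controller.dispatch(player, action);
        sentToClients.add(effect);
    }
}
```

Because a MicroCell is never addressed directly, only the lookup tables inside the MicroCellController would have to change when a MicroCell or a player moves.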
5 Platform component interaction Depending on the situation at hand, the components in the platform interact in different ways. This section gives an overview of the main scenarios that occur while players are active in the virtual world. Next, the algorithms associated with migrating a microcell between servers are discussed in detail.
5.1 Scenario descriptions
Consider the default scenario where a player is moving around in the world. For every movement or action the player takes, a message is sent to the game server. On the game server, this message is handled by the ActorController and then forwarded to the MicroCellController which forwards the message to the actual MicroCell the player resides in. The MicroCell component then processes the message and sends back a response which is routed from the MicroCell through the MicroCellController to the ActorController and back to all players in the scope of the initial action. The UML [31] sequence diagram describing this process is shown in Fig. 4. The result of a move or action, i.e. an event that has been processed by the platform, is referred to as an effect of that action. Figure 5 illustrates the decision flow when a player crosses a microcell border in the virtual world. The architecture components interact as follows: the MicroCell detects the player crossing the border and instructs the MicroCellController to initiate the migration of the player. If the destination MicroCell is located on the same physical server as the current MicroCell, no other changes are required and the MicroCell simply redirects the messages from this player to the new MicroCell and vice versa. The scenario where a player migrates between microcells located on different processing nodes is similar but requires a redirection of the player to the new server. To accomplish this, the MicroCellController signals it to the ActorController which then sends an event to the player's client instructing it to connect to the server containing the destination MicroCell. As soon as the player is connected to the new server, the communication with the old server is canceled.
Fig. 4 The default component interaction when a player moves within the borders of a single microcell in the virtual world
Fig. 5 When a player crosses a microcell border, the flow diagram shown is used to determine the appropriate actions
The final and most complex scenario is the relocation of a microcell from one server to another. Such a migration is initiated by the MicroCellManager to improve the load distribution. This scenario includes two phases:
1. Migrate the MicroCell data from the source server to the destination server. This allows the execution environment for the microcell to be recreated on the destination server.
2. Redirect all the players in the source microcell to the destination server. This finalizes the relocation and, as all processing then occurs on the destination server, the generated load is actually transferred from the source server to the destination server.
This process needs to be performed without hindering the gameplay and while maintaining the game state. A detailed description of the two relocation phases and associated algorithms is provided in the following subsections.
5.2 The migration phase
The first part of relocating a microcell from one server to another is moving the MicroCell data (Fig. 6). This is done by creating a new MicroCell on the destination server, and loading that new MicroCell with the data of the MicroCell on the source server. Meanwhile, the current copy of the MicroCell on the source server can continue operating, ensuring uninterrupted gameplay. While the MicroCell and its data are being copied to the destination server, new player actions keep arriving at the MicroCell. When the MicroCell processes these actions, its internal state changes, e.g. when the player cuts down a tree or picks up a rock, and the copy in transit becomes outdated. To solve this problem, the new MicroCell copy must be updated again after it has been created. This is performed as follows: when the MicroCell relocation starts, the MicroCellController informs the MicroCell that it is being relocated. From that moment on, each time the MicroCell processes an action, the result of the action (its effect) is sent to the MicroCellController, which forwards it to the destination MicroCellController. These effects are processed by the destination MicroCell once its creation is finished, bringing the new copy up to date with the "old", still functioning, copy. Note that it is important to forward the results of player actions to the new MicroCell copy, instead of the actions themselves. If the actions are forwarded directly, and thus processed both by the old and the new MicroCell copy, inconsistencies between both versions might arise without additional synchronization. Once the backlog of effects is processed and the new MicroCell is up to date, the source MicroCellController sends a message to the destination MicroCellController, asking for the next phase of the relocation to be started.
Fig. 6 The migration phase of the microcell relocation consists of 2 parallel tasks: transfer the MicroCell from the source server to the destination server and forward effects of the processed actions. These effects are then used to bring the copy up to date. Components on the source and destination server are denoted with the src and dst prefix respectively
5.3 The redirection phase
The goal of the second phase of the MicroCell relocation is to start using the new MicroCell copy. At that point, the source copy becomes redundant and can be discarded, which was the goal of relocating the MicroCell. This can be done simply by making sure that all further actions and other messages sent to the source MicroCell are forwarded to the destination MicroCell. Since messages are sent by players and neighboring microcells (and their MicroCellControllers), these all need to be informed of the new location of the MicroCell. The new MicroCell copy does not need to be activated explicitly. Once it is up to date, it is ready to process further player actions, and can thus immediately take over from the source copy. Figure 7 gives an overview of the process of redirecting all clients to the destination server hosting the new MicroCell copy. Initially all messages for the source MicroCell are forwarded by the source MicroCellController to the destination MicroCellController and back. In parallel, the source MicroCellController requests the source ActorController to inform all player clients in the microcell of the new MicroCell location. After a client receives this message, it sends further actions to the new location, and thus to the new MicroCell copy (via the destination ActorController and the destination MicroCellController). As soon as a player event has arrived at the destination MicroCell, the communication with the player client occurs through the destination ActorController. When all clients are connected to the destination ActorController, none of the communication needs to be forwarded through the source server and the redirection phase is complete. Note that clients, after connecting to the new server, still need to listen to the connection with the old copy for a moment. It is possible that a message to the client is underway via the source server and it would get lost otherwise.
Fig. 7 The redirection phase of the microcell relocation consists of two parallel tasks. All existing communication is redirected to the new MicroCell copy while in parallel all players are redirected to the destination server. Components on the source and destination server are denoted with the src and dst prefix respectively
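The two relocation phases of Sects. 5.2 and 5.3 can be condensed into a small sketch. It is a deliberately simplified, single-threaded model and not the prototype's algorithm: the cell state is a map of named counters, the data transfer is an instantaneous snapshot, and the class names `Effect`, `CellCopy` and `RelocationSketch` are hypothetical. What it does illustrate is why effects (results) rather than actions are forwarded to the new copy.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// An effect is the result of an action: "this key now has this value".
class Effect {
    final String key;
    final int value;
    Effect(String key, int value) { this.key = key; this.value = value; }
}

// Minimal cell state: named counters, e.g. the number of trees left.
class CellCopy {
    final Map<String, Integer> state = new HashMap<>();

    // Processing an action changes the state and yields its effect.
    Effect decrement(String key) {
        int updated = state.getOrDefault(key, 0) - 1;
        state.put(key, updated);
        return new Effect(key, updated);
    }

    // Applying an effect overwrites the value, so the copy cannot diverge.
    void apply(Effect effect) { state.put(effect.key, effect.value); }
}

class RelocationSketch {
    private CellCopy active = new CellCopy();   // copy currently serving players
    private CellCopy pending;                   // copy being built at the destination
    private final List<Effect> backlog = new ArrayList<>();

    CellCopy active() { return active; }

    // Migration phase, start: copy the data and begin recording effects.
    void startMigration() {
        pending = new CellCopy();
        pending.state.putAll(active.state);
        backlog.clear();
    }

    // Player actions keep being served by the active (source) copy.
    void onAction(String key) {
        Effect effect = active.decrement(key);
        if (pending != null) {
            backlog.add(effect);
        }
    }

    // Migration phase, end, followed by the redirection phase.
    void finishAndRedirect() {
        if (pending == null) {
            throw new IllegalStateException("no migration in progress");
        }
        for (Effect effect : backlog) {
            pending.apply(effect);
        }
        backlog.clear();
        active = pending;   // all further actions go to the new copy
        pending = null;
    }
}
```

Had the sketch replayed the `decrement` actions on the new copy while the snapshot already contained some of them, the counter could end up lower than on the source; overwriting with the recorded effect avoids that.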
6 Implementation details
To facilitate the development of a platform prototype, JavaEE [32] was chosen as a base platform for the implementation. The JavaEE application server provides support for dynamically deploying and undeploying components, logging, database abstractions and asynchronous messaging through the use of the Java Messaging Service (JMS). This section gives an overview of the practical Application Server deployment on the processing nodes, followed by a detailed description of the component implementation using JavaEE.
6.1 Software configuration of processing node
The software configuration of the processing node consists of the Debian GNU/Linux [33] operating system with kernel 2.6.25.10 and the JBoss JavaEE Application Server version 4.0.5.GA [34], which was the latest stable release at the start of the implementation. JBoss is configured to use MySQL 5.0 [35] as the underlying database and is run using the Sun Java Virtual Machine version 1.6.0_7. This setup is used for the remainder of the evaluation section. However, the only restriction for hosting the platform is that it requires a Java EE environment capable of hosting Enterprise Java Beans implemented according to the EJB 2.1 specification. Any current relational database or Java virtual machine should be sufficient, and any operating system for which these packages are available is capable of hosting the platform.
153
6.2 Component implementation details The logical components or building blocks described in Sect. 4.2 are implemented using standard Enterprise Java Beans (EJB). Data storage is implemented using Entities, execution logic using Session Beans and the asynchronous communication using Message-Driven Beans. An overview of the implementation of all logical building blocks is given below:
MicroCell: the implementation consists of an Entity which contains the data of all objects in the microcell, a list of players and the data of the players themselves. The relevant business logic is implemented using a Stateless Session Bean which interacts with the Entity. For example, when a player picks up an item in the game, the ownership of the item is changed from the microcell to the player.
MicroCellController: a combination of a Message-Driven Bean, which is responsible for the asynchronous processing of incoming messages, and a Stateless Session Bean, which allows the other components to generate the messages and send them to the correct destination. This design pattern is also known as the Message Facade [36]. This approach allows asynchronous processing and creates an abstraction layer that hides the use of JMS or any other messaging system. A parallel implementation of these components is used for the relocation of MicroCells. This allows the messages concerning MicroCell relocations to be handled separately, which reduces the delays that a MicroCell relocation causes. For example, a very large message containing the MicroCell data (higher latency allowed) does not interfere with the player messages (lower latency required).
ActorController: similar to the MicroCellController, it is composed of a Stateless Session Bean and a Message-Driven Bean. The client events arrive at the Message-Driven Bean component and are transformed to messages suitable for the MicroCellController. This transformation of messages, instead of calling business methods directly on the MicroCellController, reduces the need for locking and prevents deadlock situations. If messages need to be sent to the clients, for example when another player enters the visible range, the ActorController's Stateless Session Bean is used to create and send the messages, as with the MicroCellController.
MicroCellManager: a Stateless Session Bean used to orchestrate the MicroCell relocations. A timer is used to initiate a deployment optimization round. For each iteration, the performance data is fetched from the processing nodes; using this data, the deployment optimizing algorithm is executed, after which the MicroCell relocations are executed.
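The idea of a Message Facade with a separate channel for relocation traffic can be sketched in a few lines. This is an illustration only and not the prototype's EJB code: an in-memory `queue.Queue` stands in for JMS, and the names `MessageFacade`, `send_event` and `drain` are hypothetical.

```python
import queue


class MessageFacade:
    """Hides the messaging system behind plain method calls (hypothetical sketch)."""

    def __init__(self):
        # Two independent channels, so large relocation messages never queue
        # up in front of latency-sensitive player events.
        self._channels = {"player": queue.Queue(), "relocation": queue.Queue()}

    def send_event(self, channel, payload):
        # Callers only use the facade; they never touch the queue directly.
        self._channels[channel].put(payload)

    def drain(self, channel, handler):
        # Plays the role of the message-driven consumer for one channel.
        handled = 0
        pending = self._channels[channel]
        while not pending.empty():
            handler(pending.get())
            handled += 1
        return handled


facade = MessageFacade()
facade.send_event("relocation", {"microcell": 17, "bytes": 1_000_000})
facade.send_event("player", {"player": "p1", "action": "move"})

seen = []
# The player event is processed without waiting for the relocation payload.
facade.drain("player", seen.append)
```

Because callers depend only on `send_event`, the queue could be swapped for another messaging system without touching them, which is the abstraction the pattern is meant to provide.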
7 Dynamic microcell-specific load optimizing algorithms The MMOG hosting platform monitors the CPU load on the processing nodes and the MicroCellManager periodically recalculates the optimal assignment. The CPU load metric was chosen since this will have a great impact on the total delay that will be experienced at the client side. The goal of the algorithms is to provide a new assignment that has a better load balancing and should obtain this assignment with the least number of MicroCell relocations as relocating a MicroCell is an expensive operation. The algorithms used by the MicroCellManager to adjust the MicroCell deployments are discussed in the remainder of this section. 7.1 Player Route Adjustment (PRA) This algorithm is based on the assumption that if one player follows a certain itinerary, other players will do the same. Therefore, it selects a number of players randomly and follows them around as they move through the virtual world. As crossing a microcell border is an expensive operation when the neighboring microcell is located on another server, the algorithm tries to reassign all neighboring microcells to the same server as the one the tracked player is currently located on. When the algorithm is called, it tries to do this for all the players it tracks. Hence the player and all others following the same path do not need to be redirected to another server when crossing the microcell border. However, a MicroCell is only reassigned if this results in an improvement of the server loads. The following changes are considered a global improvement: the highest server load is reduced, the average server load is reduced, the overload percentage of all servers combined is reduced. A graphical overview of the algorithm steps is shown in Fig. 8.
Fig. 8 Illustration of the PRA algorithm: A monitored player is located in the world (a), the neighboring cells are considered for relocation to the same server (b) and if this results in an improved load distribution, the MicroCells are relocated (c)
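The three criteria listed for a global improvement can be expressed as a small predicate. This is a sketch under stated assumptions: the text does not say whether one or all criteria must hold, so the sketch accepts a reassignment when any one of them improves; the overload measure (load above a limit, summed over all servers) and the limit of 0.8 are also assumptions made for illustration.

```python
def overload(loads, limit):
    """Load above the limit, summed over all servers (assumed definition)."""
    return sum(max(0.0, load - limit) for load in loads)


def is_global_improvement(before, after, limit=0.8):
    """True if any of the three criteria improves (assumption: one suffices)."""
    return (
        max(after) < max(before)                                   # highest load reduced
        or sum(after) / len(after) < sum(before) / len(before)     # average load reduced
        or overload(after, limit) < overload(before, limit)        # combined overload reduced
    )


# Moving load from the busy server to the idle one lowers the maximum.
improved = is_global_improvement([0.9, 0.3], [0.7, 0.5])
# Raising one server's load without lowering anything is rejected.
rejected = not is_global_improvement([0.5, 0.5], [0.6, 0.5])
```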
The focus of this algorithm is to keep the load distributed as evenly as possible and to prevent overload situations instead of solving them after they occur. In order to obtain the desired behavior of this algorithm, it should be executed at least once before a player can move out of a cell and its adjacent neighbors. To guarantee this, the shortest possible time, i.e. the time required for a player to cross a microcell from one side to the other, is chosen as the execution interval. If a severe overload occurs that cannot be solved using the default approach, a random player is chosen on the highest loaded server and the associated MicroCell is reassigned to the least loaded server in an attempt to restore the balance between the server loads. To prevent unnecessary swapping of MicroCells, a temporary lock is placed on a MicroCell to prevent it from being swapped in consecutive iterations. An important parameter of this algorithm is the number of players to be monitored. The number of players that should be tracked correlates with the number of hotspots and their mobility. In the evaluation section, 25% of the total number of players was monitored. This value was chosen because this algorithm requires enough critical mass. A more detailed analysis of the configuration of this parameter can be found in [5], where it is shown that a smaller total number of players requires a larger percentage to be monitored for the algorithm to be successful. 7.2 Continuous Strong Locality Adjustment (CSLA) Continuous Strong Locality Adjustment is an algorithm that tries to exploit microcell locality. It is based on the idea that it is better to put two microcells that have much interaction on the same server. In terms of resource usage, the largest difference is expected between player transfers between microcells on the same server and player transfers between microcells located on different servers.
Therefore, the algorithm tries to form clusters of microcells that have a large number of player transfers between them and to assign all the microcells in a single cluster to the same server. The algorithm starts by taking the highest loaded server and chooses the microcell with the highest player transfer cost to a single neighboring server. This microcell is relocated to this neighbor if this results in a better deployment, i.e. a significantly lower maximum server load. This process continues until no more microcells are relocated. It causes cascades of microcell moves across the servers. It is possible that in this optimization phase a microcell is relocated more than once; in that case, the algorithm will only execute the final relocation. A graphical overview of the algorithm steps is shown in Fig. 9. In this example, the dark grey server has the lowest load and the light grey server the highest load. The communication costs are also shown in steps (b) and (c). Step (a) shows the deployment of 16 cells on 2 servers. The CSLA algorithm evaluates the communication costs of the cells and moves the cell with the highest cost to the neighbouring server; this step is then repeated. In steps (b) and (c) the cells that will be relocated are striped. This algorithm optimizes the microcell allocation and does not just solve the overload situations. It is executed continuously at regular intervals instead of only in overload situations, thus proactively preventing servers from becoming overloaded. This does incur more microcell moves, as the algorithm might perform optimizations that are unnecessary, but the relocations are more likely to be executed at a time when they do not cause any additional overload on the servers, and the load remains balanced across all servers during the entire runtime of the game. An alternative would be to execute the algorithms only when overload occurs, but this would result in a costly operation when the system is already overloaded.
Fig. 9 Illustration of the CSLA algorithm: All microcells neighboring to the least loaded server are considered for relocation. The cell with the highest communication cost (shown in the circles) is chosen (b) and relocated (c). This step repeats until no more relocations occur
7.3 Load Attraction (LA) The design goal of the Load Attraction (LA) algorithm is to provide an algorithm with characteristics similar to the CSLA algorithm but with a lower algorithm complexity and thus a shorter execution time. This algorithm tries to organize microcells into clusters, but does not explicitly take relocations or interactions between microcells into account. In the first step the algorithm determines the least loaded server and all microcells deployed on another server that are neighboring a microcell hosted on the least loaded server. We call this set the neighbor loads. Next, the microcell with the highest load of all microcells in the neighbor loads is relocated to the least loaded server. Then the least loaded server is determined again, and the process is repeated until the difference between the highest loaded server and the least loaded server is small enough, for example 5%. A graphical overview of the algorithm steps is shown in Fig. 10. The dark grey server has the lowest load and the light grey server the highest load; the load generated by the individual cells is also depicted.
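The LA steps described above can be sketched as a short greedy loop. This is a minimal illustration, not the prototype's implementation: the function name, the data layout (dictionaries keyed by cell id) and the integer percentage loads are assumptions, and the guard that lets each cell move at most once per run is one possible safeguard against endless back-and-forth moves, chosen here for simplicity.

```python
def load_attraction(assignment, cell_load, neighbors, servers, threshold=5):
    """Greedy LA sketch. Returns (new assignment, list of (cell, server) moves)."""
    assignment = dict(assignment)
    moved = set()   # assumed guard: a cell is relocated at most once per run
    moves = []
    while True:
        load = {s: 0 for s in servers}
        for cell, server in assignment.items():
            load[server] += cell_load[cell]
        least = min(servers, key=load.get)
        most = max(servers, key=load.get)
        if load[most] - load[least] <= threshold:
            break   # load difference is small enough
        # Neighbor loads: cells on other servers adjacent to the least loaded server.
        candidates = {
            n
            for cell, server in assignment.items()
            if server == least
            for n in neighbors[cell]
            if assignment[n] != least and n not in moved
        }
        if not candidates:
            break   # nothing left that may be attracted
        pick = max(sorted(candidates), key=cell_load.get)
        assignment[pick] = least
        moved.add(pick)
        moves.append((pick, least))
    return assignment, moves


# Four cells in a row (0-1-2-3); server A hosts three of them, B hosts one.
final, moves = load_attraction(
    assignment={0: "A", 1: "A", 2: "A", 3: "B"},
    cell_load={0: 30, 1: 10, 2: 15, 3: 10},
    neighbors={0: [1], 1: [0, 2], 2: [1, 3], 3: [2]},
    servers=["A", "B"],
)
```

In this example server B first attracts cell 2 and then cell 1, after which the load difference (30 versus 35) is within the 5-point threshold and the loop stops.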
In the consecutive steps of Fig. 10, the microcell with the highest load is chosen to relocate to the least loaded server. This process is repeated until the difference in load between the highest and least loaded server is below a threshold. An implementation as described can result in an infinite loop, as a MicroCell might be continuously relocated between the same two servers. Therefore, additional checks were included to prevent this from occurring. Like the other algorithms, this algorithm optimizes the load distribution of all servers proactively and needs to be executed at regular intervals. During the execution the algorithm does try to keep clusters of microcells deployed on the same server, thus implicitly minimizing the number of relocations between servers.
Fig. 10 Illustration of the LA algorithm: All microcells neighboring the least loaded server are considered for relocation (b). Next the highest loaded neighbor is relocated to this server (c) and this step is repeated until the load difference between the highest and least loaded server is minimized
8 Evaluation setup details In order to evaluate the dynamic hosting platform a testing environment was set up on which the platform can be deployed and installed. This section describes the hardware platform used and the evaluation parameters of the virtual world used in the evaluation. 8.1 Hardware setup The hardware platform consists of 4 to 16 processing nodes in cluster configuration with the software configuration described in Sect. 6.1. All nodes are equipped with two dual-core Opteron 2212 processors and 4 GB RAM. The nodes are interconnected using Gigabit LAN and are part of an Emulab testbed [37], which allows automated testing. 8.2 Virtual world configuration The virtual world is divided into 1024 equally sized rectangular microcells arranged in a 32 by 32 grid. The worlds used in the simulation are 1024 units wide, resulting in a microcell of 32 by 32 units. The world used was 5 km wide and has a toroidal shape, which means that players can move freely in any direction without reaching the end of the world. The effects of all actions occurring in a microcell are forwarded to all neighbors. This results in an effective viewing distance of the size of a microcell. The world contains a number of hotspots or points of interest. These hotspots represent important game elements, such as quest locations, resources, important buildings, etc.
Table 1 Overview of the configuration parameters used in the evaluations
| Parameter | Value |
|---|---|
| Microcell count | 1024 |
| Microcell size (data) | 1 MB |
| Player count (4 servers) | 100 |
| Player count (16 servers) | 300 |
| Player size (data) | 50 KB |
| Events a player generates | 2/s |
| Events size (data) | 1 KB |
| Hotspot count | 16 |
| Hotspot distribution | grid |
| Hotspot mobility | 1 relocation/hour |
| Testcase runtime | 12 hours |
Hotspot locations are allowed to change during the evaluation to represent temporary points of interest disappearing and new ones appearing. These could be, for instance, large battles or parties. At the start of the evaluation 16 hotspots are evenly distributed across the virtual world in a balanced 4 by 4 grid configuration. As the evaluation progresses, every hour one of the hotspots is randomly relocated in the virtual world. These evaluations were run for 12 hours, after which the experiment was terminated. As such, 75% of the load is relocated during a single evaluation. The data size of one MicroCell was set to 1 MB. This value is based on the size on disk of microcell sized game maps of the popular multiplayer game Counter Strike [38]. Note that this only contains a blueprint of the microcell layout. Additional data such as textures, which may require large amounts of disk space, are assumed to be installed on the game client and do not need to be installed on the hosting platform. A summary of world configuration parameters is included in Table 1. 8.3 Player behavior model Player movement follows the model outlined in [8], which is based on the Random Waypoint Model originally described in [39]. A player selects a random hotspot in its vicinity and walks to the hotspot in a straight line. After arriving at the hotspot, he stays in the neighborhood for 2 minutes while walking around randomly. Next, the player selects a new hotspot which is located within a range of one third of the width of the virtual world. Additionally, players moving to their new destination have a small chance to change their mind and select a new random hotspot. As the virtual world is toroidal, the proposed model does not suffer from the problem that hotspots in the middle of the world are more popular than others. At the start of the evaluation the players are all distributed randomly in the world. The speed of the players is chosen according to the results obtained in [4, 5]. The worlds used in the evaluation are 5 km wide and every player moves at 18 km/h, similar to the running speed experienced in various multiplayer games. To model the interactivity of the players, a player entity generates an event in the virtual world every 500 ms. This is a realistic value and would not result in sluggish gameplay in a real MMOG, because previous research has shown that a round-trip latency of up to 1 s is not considered to influence the gameplay of typical MMOGs [40]. A typical event includes the location of the player, the direction of the movement and a possible action the user executed. This was measured to be on average 1 KB, including the JMS overhead. The data that is associated with each player and needs to be transferred with a player that moves from one microcell to another was set to 50 KB. This data includes the inventory items of the player and only contains references to the actual item properties, which are, like world textures, installed on the client side. A summary of these parameters is included in Table 1. 8.4 Load balancing parameters All proposed algorithms need to be invoked at regular intervals to optimize the load distribution. In the evaluations presented in this article, the algorithms are invoked every three minutes. If an algorithm requires more than 2 minutes, its execution is aborted and the algorithm returns the best solution at that time. The MicroCell relocations are initiated immediately after the algorithm execution has terminated. However, as it is not feasible to have all relocations occur at the same time, the number of parallel relocations a single processing node can participate in is limited to 4. Every 10 seconds a new batch of MicroCell relocations can be started.
If not all relocations are performed by the start of the next algorithm iteration, the pending relocations are discarded and the current assignment is used as an input parameter for the algorithms. Any relocations that were still being processed are continued and assumed to be completed.
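The batching rules of Sect. 8.4 can be sketched as a simple greedy scheduler. This is an illustration under stated assumptions, not the platform's scheduler: the function name and tuple layout are hypothetical, every relocation is assumed to finish within one 10 s batch interval, and the time consumed by the algorithm itself is ignored, so the full 180 s iteration interval is treated as available.

```python
from collections import Counter


def schedule_batches(relocations, max_per_node=4, batch_interval=10,
                     iteration_interval=180):
    """Returns ({start time in s: batch}, relocations discarded at the next iteration)."""
    pending = list(relocations)
    batches = {}
    start = 0
    while pending and start < iteration_interval:
        busy = Counter()          # relocations each node takes part in, this batch
        batch, rest = [], []
        for cell, src, dst in pending:
            if busy[src] < max_per_node and busy[dst] < max_per_node:
                busy[src] += 1
                busy[dst] += 1
                batch.append((cell, src, dst))
            else:
                rest.append((cell, src, dst))
        batches[start] = batch
        pending = rest
        start += batch_interval
    return batches, pending


# Six microcells must move from node A to node B.
batches, discarded = schedule_batches([(cell, "A", "B") for cell in range(6)])
```

With a limit of 4 parallel relocations per node, the six moves are split into a batch of four at t = 0 s and a batch of two at t = 10 s, and nothing has to be discarded.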
9 Evaluation results In this section we discuss the obtained evaluation results in detail. The delays experienced by the clients are compared for different loads in the platform and when the player is subjected to MicroCell relocations. Additionally, the efficiency of the MMOG load balancing algorithms is evaluated by validating their ability to distribute the load and the time required to stabilize an unevenly distributed world. For the results presented here, the CPU load was monitored using mpstat under Linux, and the response time was obtained by measuring the round trip delay of a message sent by the player.
9.1 Player experienced delays One of the key factors influencing the gaming experience for the end-user is the delay experienced when interacting with the virtual world and other players. This experienced delay is determined by the network delay and the processing delay of the platform. The network delay is mainly influenced by the processing time in the hosting platform. The actual network delay due to the Internet connection of the player is not included in the presented results. Due to the asynchronous messaging, no additional delays are introduced by the microcell paradigm. As such, the load on a processing node is the determining factor for the delay introduced by the MMOG platform. To evaluate the influence of the load on the measured delay, the load on a single processing node was gradually increased by adding extra players. For each number of players, a stable situation was maintained for an interval of 10 minutes to gather the required delay measurements. The influence of the increasing load on the measured delay, for all players connected to this node, is shown in Fig. 11. As the extra connected players result in an increasing number of user generated events, the load on the processing node increases as well. Up to approximately 20 players, the average response time remains stable; beyond that, it increases gradually as the number of players keeps growing. A steeper increase can be observed for the 95th and 99th percentile. When the number of players is increased beyond 55, the server is no longer capable of processing all events within the time constraints due to locking in the database. The current platform implementation locks the MicroCell data on each transaction. Due to this limitation in the implementation, the total CPU load does not increase above 60% on a quad-core system. 9.2 Load distribution Distributing the generated load over multiple processing nodes requires efficient load balancing algorithms.
Figure 12 shows the CPU loads of the MMOG platform when no load balancing is performed (Fig. 12(a)) and when the LA algorithm is active (Fig. 12(b)). The variation in the server loads is significantly smaller when load balancing is performed. If significant changes in the load distribution occur due to large numbers of players moving from one spot in the virtual world to another, the load balancing prevents a node from becoming overloaded and prevents a decrease in the QoE of the players residing in the microcells around the new hotspots. We evaluate the performance of the proposed MMOG load balancing algorithms by comparing the average load of all the processing nodes, the average maximum load and the average minimum load during the evaluations. Figure 13 shows the obtained results in the case of an MMOG deployed on 4 and 16 processing nodes, respectively. The closer the maximum and minimum load are together, the better the algorithm balances the workload across the different processing nodes. The LA algorithm clearly scores best in this case. However, the average load is the highest for the LA algorithm in the case of 16 processing nodes. This indicates that the overhead for the deployments proposed by the LA algorithm is higher than for the other algorithms. When using more servers, this is more prevalent: the number of microcells remains the same but the number of servers increases, and thus the ratio of inter-server communication to intra-server communication is higher. The PRA and especially the CSLA algorithm perform better in minimizing the average load, and thus the overhead introduced by the microcell assignment, for a higher number of nodes. The PRA and CSLA algorithms are more complex and require more processing, which allows them to obtain a more efficient deployment. The lower maximum loads in the case of the LA algorithm show that it responds faster to the variations in the load that occur in the evaluation.
Fig. 11 On the left axis the delays experienced by the end user are shown for an increasing number of players. On the right axis the increase of the load on the processing node is shown as the number of players increases
Fig. 12 When a load balancing algorithm is used, the CPU loads of the 4 processing nodes are more balanced than when no load balancing is used at all
Fig. 13 The measured loads of the processing nodes for the evaluated algorithms. The error bars indicate the standard deviation
To evaluate the response time of the algorithm we evaluate the scenario of adding additional processing nodes to the platform. For example, if the load on the processing nodes increases above a certain threshold, additional nodes can be assigned to maintain the QoE of the players. Figure 14 shows the load of the nodes when adding an extra server to the configuration. The algorithm used in this example is the PRA algorithm and initially 30 players were present in the world. For each increase, 30 players were added to the server and a second server was added 15 minutes after the initial load increase and 45 minutes after the second load increase to clearly show the behavior of the platform. Initially the entire virtual world is assigned to a single processing node and the number of players is doubled. After 10 minutes, a second processing node is added to the configuration and the load is distributed across the two available nodes. At this point a small spike in the total load over all nodes can be seen, as a large part of the virtual world is relocated and a fast-track relocation mechanism is used. To speed up the load balancing process, MicroCells are temporarily allowed to be relocated at a higher rate. In a second phase the number of players is increased again by the same amount and after 45 minutes, a third server is added to the configuration, which again results in a temporary spike in the total load due to the extra MicroCell relocations.
Fig. 14 The load per server when player numbers are increased and servers are added at runtime
In Table 2 the time required to stabilize the load across two servers for each algorithm is presented as the number of algorithm iterations. The iteration interval is 240 seconds, and the microcell lock for the PRA algorithm was configured at 480 seconds. The table also lists the execution time of the algorithm, the average number of MicroCell relocations performed after each iteration while the world is unstable and the average number of relocations when the world is stable. The results show that the PRA algorithm requires the most iterations, which can be explained by the fact that it is highly dependent on the monitored players in the virtual world. These need to reside in microcells with neighbors deployed on a different server to generate MicroCell relocations. The CSLA algorithm is limited by the fact that it is the most complex algorithm and requires the full 2 minutes of processing time. This leaves only one minute to perform the actual cell relocations. The LA algorithm manages to stabilize the deployment in only three algorithm iterations but performs the highest number of cellmoves per iteration. All algorithms successfully obtain a stable state and are capable of maintaining this stability over time. During this stable state the algorithms are still executed and perform additional cellmoves to improve the load distribution. At this time the LA algorithm performs the fewest cellmoves, followed closely by the CSLA algorithm. The PRA algorithm still performs more than twice as many cellmoves.
Table 2 Overview of the algorithm behavior when adding a second processing resource to the platform (max indicates the maximum allowed time)

| Algorithm | Execution time (s) | Cellmoves unstable | Cellmoves stable | Iterations |
|---|---|---|---|---|
| CSLA | 120 (max) | 5.21 | 3.13 | 6 |
| PRA | 3.23 | 9.81 | 6.25 | 7 |
| LA | 10.17 | 20.74 | 2.25 | 3 |
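The iteration counts in Table 2 can be translated into approximate wall-clock stabilization times. The sketch below assumes that stabilization time is simply the number of iterations multiplied by the stated 240 s iteration interval; this ignores where within an interval the load actually settled, so the values are upper-bound estimates rather than measurements.

```python
ITERATION_INTERVAL_S = 240  # iteration interval stated in the text

# Iterations needed to stabilize, taken from Table 2.
iterations = {"CSLA": 6, "PRA": 7, "LA": 3}

# Estimated stabilization time per algorithm, in seconds.
stabilization_s = {name: n * ITERATION_INTERVAL_S for name, n in iterations.items()}
fastest = min(stabilization_s, key=stabilization_s.get)
```

Under this assumption LA settles after roughly 12 minutes, CSLA after 24 minutes and PRA after 28 minutes.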
9.3 MicroCell relocation delays One of the main features of the proposed MMOG platform is the capability to relocate MicroCells at runtime. The influence of relocating a MicroCell when a player is not residing in the microcell itself is very small. However, if the player resides in the microcell that is relocated, this is expected to have an influence on the measured response times. Figure 15(a) shows the expected influence on the processing nodes when the MicroCell is relocated. On the source server, the MicroCell is responsible for a certain part of the load generated. When the MicroCell needs to be relocated, all data is gathered and serialized. This is a relatively resource intensive task, especially on the database, which introduces additional delays due to extra locking. As the MicroCell is transmitted over the network, all effects need to be forwarded, to be applied after reconstruction of the MicroCell at the receiving server. This reconstruction, combined with applying the accumulated effects, causes another short spike in the load and database locking. Once the migration phase is completed, the players are redirected. Before this redirection actually occurs, the players experience a slower response time, as their messages are rerouted through the source node to the receiving node. As soon as a player is redirected, the response time experienced by the player goes back to normal. Figure 15(b) shows the actual measured response times experienced by the players during a microcell relocation, which indicate that the experienced response times correspond to the proposed model. An overview of the time spent during a microcell relocation is given in Table 3. From the measurement results, it is clear that a significant part of the relocation time is spent in the network transmission of the data. However, the increase of the transmission time is limited as the number of players increases, due to the minimal extra data required for a player. The same applies to the serialization and deserialization of the data. The time for processing the updates increases significantly, as the number of updates that need to be processed increases for each player that needs to be transferred.
Table 4 gives an overview of the delay measurements during a MicroCell relocation. The delays measured by the players during the MicroCell relocation process increase as the number of players in the microcell increases. The table includes the maximum delays measured during the relocation and redirection phases of the MicroCell relocation process (i.e. both spikes visible in Fig. 15(b)). The delays in the second phase are significantly longer than during the first phase. The cause of this higher delay is that all messages from and to the players need to pass through two server nodes instead of just one as is normally the case. It is important to note that these higher delays are only experienced by the players in the microcell that is being relocated. The most significant part of the delay is caused by database locks; only a small part is attributed to higher load on the processing nodes, and it is hardly measurable in the delays experienced by the other clients. Taking into account that the delay threshold up to which the QoE for MMOGs remains acceptable is typically about 1 s, the measured delays are still very acceptable for this type of game [41, 42].
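The per-phase times reported in Table 3 can be cross-checked with a few lines of arithmetic. The sketch below only re-uses the published values; the variable names are illustrative, and the rounded percentage share of network transmission is derived here rather than taken from the paper.

```python
players = [5, 10, 15, 20]

# Average time per relocation phase in ms, copied from Table 3.
phases_ms = {
    "serialization": [253, 289, 330, 347],
    "network transmission": [900, 1123, 1280, 1367],
    "deserialization": [124, 130, 143, 152],
    "update processing": [202, 308, 583, 735],
    "player redirection": [204, 399, 585, 792],
}

# Column-wise sums reproduce the "Total (ms)" row of the table.
totals = [sum(column) for column in zip(*phases_ms.values())]

# Share of the total spent on network transmission, rounded to whole percent.
network_share = [
    round(100 * net / total)
    for net, total in zip(phases_ms["network transmission"], totals)
]
```

The share of network transmission falls as the microcell fills up, while update processing and player redirection grow, which matches the observation that per-player data adds little to the transmission time.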
Fig. 15 Comparison of the expected delays during a microcell relocation and the measured delays by players residing in the microcell being relocated
Table 3 The average time spent in each phase of the microcell relocation

| Players in microcell | 5 | 10 | 15 | 20 |
|---|---|---|---|---|
| Serialization (ms) | 253 | 289 | 330 | 347 |
| Network transmission (ms) | 900 | 1123 | 1280 | 1367 |
| Deserialization (ms) | 124 | 130 | 143 | 152 |
| Update processing (ms) | 202 | 308 | 583 | 735 |
| Player redirection (ms) | 204 | 399 | 585 | 792 |
| Total (ms) | 1683 | 2249 | 2921 | 3393 |

Table 4 The measured client-side delays when relocating a microcell for an increasing number of players in the relocated microcell

| Players in cell | Serialization peak (ms) | Redirection peak (ms) |
|---|---|---|
| 1 | 29 | 56 |
| 5 | 33 | 143 |
| 10 | 51 | 152 |
| 15 | 87 | 233 |
| 20 | 110 | 337 |

10 Conclusions In this article we propose the architecture and implementation of an MMOG hosting platform capable of hosting contiguous online virtual environments. By introducing the microcell concept, the platform allows the load generated by an MMOG to be distributed across multiple processing nodes and extra processing nodes to be assigned dynamically when required. In order to gather detailed information on the impact of the microcell relocations and the player transfers on the overall gameplay, a prototype was implemented. The JavaEE based prototype implementation is evaluated through performance evaluations with input parameters based on extensive simulations. Three MMOG load balancing algorithms are proposed and compared in terms of the responsiveness of the algorithms to load variations in the virtual world and the efficiency of the load distribution. The obtained results show that the microcell approach and the MMOG hosting platform can provide the required functionality to efficiently distribute the load across the available processing nodes and perform microcell relocations at runtime without affecting the gameplay.
Acknowledgements Filip De Turck acknowledges the F.W.O.-V. (Fund for Scientific Research-Flanders) for their support through a postdoctoral fellowship.
References 1. Blizzard Entertainment: World of warcraft subscriber base reaches 11.5 million worldwide. In: [Online] http://www.blizzard.com/ us/press/081121.html (2008) 2. Rosedale, P., Ondrejka, C.: Enabling player-created online worlds with grid computing and streaming. Gamasutra (2003) 3. De Vleeschauwer, B., Van Den Bossche, B., Verdickt, T., De Turck, F., Dhoedt, B., Demeester, P.: Dynamic microcell assignment for massively multiplayer online gaming. In: Proceedings of Netgames 2005: 4th Workshop on Network and Systems Support for Games. New York, USA (2005)
4. Verdickt, T., De Vleeschauwer, B., Van Den Bossche, B., De Turck, F., Dhoedt, B., Demeester, P.: Adaptive microcell assignment in massively multiplayer online games. In: Proceedings of CGAMES, the 10th International Conference on Computer Games; AI, Animation, Mobile, Educational and Serious Games, pp. 92–99. Louisville, Kentucky, USA (2007)
5. Van Den Bossche, B., De Vleeschauwer, B., Verdickt, T., De Turck, F., Dhoedt, B., Demeester, P.: Autonomic microcell assignment in massively distributed online virtual environments. J. Netw. Comput. Appl. 32(6), 1242–1256 (2009). doi:10.1016/j.jnca.2009.04.001
6. BigWorld Pty Ltd.: Bigworld [online] (2008). http://www.bigworldtech.com
7. CCP: Eve online—a massive multiplayer online roleplaying space game—mmorpg [online] (2008). http://www.eve-online.com/
8. Chertov, R., Fahmy, S.: Optimistic load balancing in a distributed virtual environment. In: Proc. of the 16th ACM International Workshop on Network and Operating Systems Support for Digital Audio and Video (NOSSDAV), pp. 74–79 (2006)
9. Restrepo, A., Montoya, A., Trefftz, H.: Dynamic server allocation in virtual environments, using quadtrees for dynamic space partition. In: IASTED International Conference on Computer Science and Technology, pp. 364–368. Cancun, Mexico (2003)
10. Chen, J., Wu, B., Delap, M., Knutsson, B., Lu, H., Amza, C.: Locality aware dynamic load management for massively multiplayer games. In: Proc. of Principles and Practice of Parallel Programming (PPoPP), pp. 289–300 (2005)
11. Ahmed, D., Shirmohammadi, S.: A microcell oriented load balancing model for collaborative virtual environments. In: Virtual Environments, Human-Computer Interfaces and Measurement Systems, 2008. VECIMS 2008. IEEE Conference on, pp. 86–91 (2008). doi:10.1109/VECIMS.2008.4592758
12. Sun Microsystems, I.: Project darkstar [online] (2008). http://www.projectdarkstar.com/
13. Fiedler, S., Wallner, M., Weber, M.: A communication architecture for massive multiplayer games. In: NetGames ’02: Proceedings of the 1st Workshop on Network and System Support for Games, pp. 14–22. ACM, New York (2002). doi:10.1145/566500.566503
14. Morgan, G., Lu, F., Storey, K.: Interest management middleware for networked games. In: I3D ’05: Proceedings of the 2005 Symposium on Interactive 3D Graphics and Games, pp. 57–64. ACM, New York (2005). doi:10.1145/1053427.1053436
15. Minson, R., Theodoropoulos, G.: Adaptive support of range queries via push-pull algorithms. In: PADS ’07: Proceedings of the 21st International Workshop on Principles of Advanced and Distributed Simulation, pp. 53–60. IEEE Computer Society, Washington (2007). doi:10.1109/PADS.2007.11
16. Ramakrishna, V., Robinson, M., Eustice, K., Reiher, P.: An active self-optimizing multiplayer gaming architecture. Clust. Comput. 9(2), 201–215 (2006). doi:10.1007/s10586-006-7564-2
17. Vik, K.H., Halvorsen, P., Griwodz, C.: Evaluating steiner-tree heuristics and diameter variations for application layer multicast. Comput. Netw. 52(15), 2872–2893 (2008). doi:10.1016/j.comnet.2008.06.003. URL http://www.sciencedirect.com/science/article/B6VRG-4SRCJWH-2/2/6b35f71601406090d2c150e113b4c744. Complex Computer and Communication Networks
18. Smed, J., Kaukoranta, T., Hakonen, H., Ab, O.L.M.E.: A review on networking and multiplayer computer games. Tech. rep., Turku Centre for Computer Science (2002)
19. Jiang, X., Xu, D.: Soda: A service-on-demand architecture for application service hosting utility platforms. In: HPDC ’03: Proceedings of the 12th IEEE International Symposium on High Performance Distributed Computing, p. 174. IEEE Computer Society, Washington (2003)
20. Wang, X., Du, Z., Chen, Y., Li, S., Lan, D., Wang, G., Chen, Y.: An autonomic provisioning framework for outsourcing data center based on virtual appliances. Clust. Comput. 11(3), 229–245 (2008). doi:10.1007/s10586-008-0053-z
21. Lau, S.M., Lu, Q., Leung, K.S.: Adaptive load distribution algorithms for heterogeneous distributed systems with multiple task classes. J. Parallel Distrib. Comput. 66(2), 163–180 (2006). doi:10.1016/j.jpdc.2004.01.007
22. Appleby, K., Goldszmidt, G.: Using automatically derived load thresholds to manage compute resources on-demand. In: 9th IFIP/IEEE Symposium on Integrated Management (IM2005), pp. 747–760. Nice, France (2005)
23. Balasubramanian, J., Schmidt, D.C., Dowdy, L., Othman, O.: Evaluating the performance of middleware load balancing strategies. In: EDOC ’04: Proceedings of the Enterprise Distributed Object Computing Conference, Eighth IEEE International, pp. 135–146. IEEE Computer Society, Washington (2004). doi:10.1109/EDOC.2004.11
24. Broberg, J., Tari, Z., Zeephongsekul, P.: Task assignment with work-conserving migration. Parallel Comput. 32(11–12), 808–830 (2006). doi:10.1016/j.parco.2006.09.005
25. Adam, C., Stadler, R.: Service middleware for self-managing large-scale systems. IEEE Trans. Netw. Serv. Manag. 4(3), 50–64 (2007)
26. Shaikh, A., Sahu, S., Rosu, M., Shea, M., Saha, D.: Implementation of a service platform for online games. In: NetGames ’04: Proceedings of 3rd ACM SIGCOMM Workshop on Network and System Support for Games, pp. 106–110. ACM, New York (2004). doi:10.1145/1016540.1016547
27. Ferretti, S., Roccetti, M., Palazzi, C.E.: An optimistic obsolescence-based approach to event synchronization for massive multiplayer online games. Int. J. Comput. Appl. 29(1), 33–43 (2007)
28. Ferretti, S., Roccetti, M., Palazzi, C.E.: Intelligent synchronization for mirrored game servers: a real case study. J. Adv. Comput. Intell. Intell. Inform. 12(2), 132–141 (2008)
29. Renambot, L., Bal, H.E., Germans, D., Spoelder, H.J.W.: Cavestudy: an infrastructure for computational steering and measuring in virtual reality environments. Clust. Comput. 4(1), 79–87 (2001). doi:10.1023/A:1011420511667
30. Van Den Bossche, B., Verdickt, T., De Vleeschauwer, B., Desmet, S., De Mulder, S., De Turck, F., Dhoedt, B., Demeester, P.: A platform for dynamic microcell redeployment in massively multiplayer online games. In: Proceedings of NOSSDAV2006: The 16th International Workshop on Network and Operating Systems Support for Digital Audio and Video, pp. 14–19. Rhode Island, USA (2006)
31. Object Management Group (OMG): Unified modeling language specification [online]. http://www.uml.org
32. Sun Microsystems: Java EE at a Glance [online]. http://java.sun.com/javaee/
33. Debian: The universal operating system [online] (2008). http://www.debian.org/
34. JBoss Inc.: JBoss Application Server [online]. http://www.jboss.com/products/platforms/application
35. MySQL AB Sun Microsystems, I.: Mysql: The world’s most popular open source database [online] (2008). http://www.mysql.com/
36. Marinescu, F.: EJB Design Patterns: Advanced Patterns, Processes and Idioms. Wiley Computer, New York (2002)
37. White, B., Lepreau, J., Stoller, L., Ricci, R., Guruprasad, S., Newbold, M., Hibler, M., Barb, C., Joglekar, A.: An integrated experimental environment for distributed systems and networks. In: Proc. of the Fifth Symposium on Operating Systems Design and Implementation, pp. 255–270. USENIX Association, Boston (2002)
38. Valve Corporation: Counter-strike: Source [online] (2004). http://counter-strike.net/
39. Johnson, D.B., Maltz, D.A.: Dynamic source routing in ad hoc wireless networks. In: Imielinski, T., Korth, H. (eds.) Mobile Computing, vol. 353. Kluwer Academic, Amsterdam (1996)
40. Sheldon, N., Girard, E., Borg, S., Claypool, M., Agu, E.: The effect of latency on user performance in Warcraft III. In: Proc. of ACM Network and System Support for Games (NetGames), pp. 3–14 (2003)
41. Fritsch, T., Ritter, H., Schiller, J.: The effect of latency and network limitations on mmorpgs: a field study of everquest2. In: NetGames ’05: Proceedings of 4th ACM SIGCOMM Workshop on Network and System Support for Games, pp. 1–9. ACM, New York (2005). doi:10.1145/1103599.1103623
42. Claypool, M., Claypool, K.: Latency and player actions in online games. Commun. ACM 49(11), 40–45 (2006). doi:10.1145/1167838.1167860
Bruno Van Den Bossche graduated as a Master in Computer Science at Ghent University in 2004. He joined the Department of Information Technology at Ghent University where his research included scalable software architectures, distributed software and the automatic optimal distribution of component based software applications. His research was published in over 20 publications in journals and conference proceedings. In October 2009 he obtained a PhD in computer science. He joined Comsof, a spinoff company from the Department of Information Technology at Ghent University, where he is currently working on the design and development of Media Asset Management platforms and the design and optimization of Fiber to the Home networks.
Bart De Vleeschauwer graduated as a Master in Computer Science at Ghent University in 2003. Subsequently, he joined the Department of Information Technology at Ghent University, where he performed research on overlay networks, autonomic networking, multimedia service delivery and network monitoring. His research was published in over 40 publications in journals and international conferences. In September 2008 he obtained a PhD in computer science. He joined the Alcatel-Lucent Bell Labs Fixed Access team in October 2009 where he is currently working on design, control and management of multimedia and broadband service delivery networks.
Filip De Turck received his M.Sc. degree in Electronic Engineering from Ghent University, Belgium, in June 1997. In May 2002, he obtained the PhD degree in Electronic Engineering from the same university. During his PhD research he was funded by the F.W.O.-V., the Fund for Scientific Research Flanders. From October 2002 until September 2008, he was a post-doctoral fellow of the F.W.O.-V. and part time professor, affiliated with the Department of Information Technology of Ghent University. At the moment, he is a full-time professor affiliated with the Department of Information Technology of Ghent University and the IBBT (Interdisciplinary Institute of Broadband Technology Flanders) in the area of telecommunication and software engineering. Filip De Turck is author or co-author of approximately 250 papers published in international journals or in the proceedings of international conferences. His main research interests include scalable software architectures for telecommunication network and service management, performance evaluation and design of new telecommunication and eHealth services.
Bart Dhoedt received a Master's degree in Electro-technical Engineering (1990) from Ghent University. His research, addressing the use of micro-optics to realize parallel free space optical interconnects, resulted in a PhD degree in 1995. After a 2-year post-doc in opto-electronics, he became Professor at the Department of Information Technology. Bart Dhoedt is responsible for various courses on algorithms, advanced programming, software development and distributed systems. His research interests include software engineering, distributed systems, mobile and ubiquitous computing, smart clients, middleware, cloud computing and autonomic systems. He is author or co-author of more than 300 publications in international journals or conference proceedings.
Piet Demeester is professor in the faculty of Engineering at Ghent University. He is head of the research group “Intec Broadband Communication Networks” (IBCN) that is part of the Department of Information Technology (INTEC) of Ghent University and that also belongs to the Interdisciplinary Institute for Broadband Technology (IBBT). He is Fellow of the IEEE. After finishing a PhD on Metal Organic Vapor Phase Epitaxy for photonic devices in 1988, he established a research group in this area working on different material systems (AlGaAs, InGaAsP, GaN). This research was successfully transferred to IMEC in 2002 and resulted in 12 PhDs and 300 publications in international journals and conference proceedings. In 1992 he started research on communication networks and established the IBCN research group. The group is focusing on several advanced research topics: Network Modeling, Design & Evaluation; Mobile & Wireless Networking; High Performance Multimedia Processing; Autonomic Computing & Networking; Service Engineering; Content & Search Management and Data Analysis & Machine Learning. The research of IBCN resulted in about 50 PhDs, 800 publications in international journals and conference proceedings, more than 20 international awards and 3 spin-off companies.