Using Autonomous Avatars to Simulate a Large-Scale Multi-User Networked Virtual Environment
Peter Quax, Patrick Monsieurs, Tom Jehaes, Wim Lamotte
Expertise Center Digital Media, Limburgs Universitair Centrum, Universitaire Campus, B-3590 Diepenbeek, Belgium
Abstract
This paper presents our approach to testing the scalability of large-scale multi-user networked virtual environments. Emphasis is placed on both the number of users supported by the architecture and the resulting network traffic, at both client and server side. Instead of using a limited number of actual human users and extrapolating the results to larger user bases, we have opted for a system in which autonomous avatars are employed. These autonomous avatars are programmed with a limited number of fast algorithms that determine their behavior. These algorithms make the avatars aware of the structure of the world and let them react to events that happen in it. Because the autonomous avatars act as actual users of our Networked Virtual Environment, they generate traffic that is representative of their human counterparts. This testing methodology is applied to our custom-developed networked virtual environment framework ALVIC (Architecture for Large Scale Virtual Interactive Communities). It enables us to demonstrate scalability of the system to at least 1000 clients, using results acquired by actually capturing network traffic. This method of scalability testing eliminates the need for large numbers of human users and provides accurate results using only a limited number of computers.
Keywords: Autonomous Avatars, Networked Virtual Environments, Scalability
1 Introduction and related work
Networked Virtual Environment (NVE) applications started out as military simulations [Macedonia et al. 1994], later evolving into dedicated research projects at universities and institutions [Anderson 1996; Funkhouser 1995]. Recently, the entertainment industry has adopted the findings in this field of research and developed new types of applications based on NVE technology. Examples of these are the multi-player games [Mythic; Sony Online Entertainment] and the virtual communities [There Inc.; Linden Research Inc.]. While some of these only support a limited number of users due to their nature (e.g. first-person shooters, party games), others have evolved into applications that support very large user bases (e.g. Massively Multiplayer Online Role-Playing Games, Large Scale Virtual Communities). This last category has proven to be quite popular and a way of generating revenue for application and content providers. Scalability testing for these games is mostly done purely by means of large groups of beta testers. This methodology exhibits some problems that are described below.
Military simulations are mostly run on high-bandwidth dedicated Local Area Networks, employ very powerful machines and are limited in the number of users they support. The same cannot be said for multiplayer games and virtual communities. The Internet is composed of a collection of heterogeneous networks and all kinds of end-user equipment. It is therefore practically impossible for a provider that wants to target massive audiences with a single application to rely on state-of-the-art high-bandwidth connections and end-user equipment.
The ALVIC (Architecture for Large Scale Virtual Interactive Communities) system was developed with these limitations in mind and presents a number of solutions to these problems. The main feature of the ALVIC architecture is the use of multicasting to distribute data among the users of the application. Each client in this architecture is able to adapt the network traffic and required processing power to its own preference. While not yet widely available on current hardware in the Internet, multicasting will play an important role in the distribution of broadcast-quality video in the near future. It is therefore very likely that this technology will be implemented in nearly all future networking hardware (routers, switches, etc.).
Testing scalability is difficult, and several methodologies can be suggested. A trivial approach is to gather the required number of human users in one location and have them test the actual application in a controlled environment. In terms of both cost and time, this is often prohibitively expensive and only practical for applications that target a small user base. A second approach is to test the application with a small user base and extrapolate the results to predict the behavior of the system with larger user bases. This, however, does not account for unpredictable behavior of the application when it is used simultaneously by large numbers of users. A third approach is to test the application in practice with a large group of beta testers. This is time-consuming to set up, it is difficult to have large numbers of testers use the application for prolonged periods, and the associated costs can be high. This method is better suited to gameplay testing and debugging than to server and network load testing.
There is little previous work that uses agents to test the scalability of NVE-like applications. In [Tveit 2002], Tveit uses autonomous agents to test the effects of various factors on the total processing time of a simulation. This was done to test the scalability of the simulator. There were also some tests using NPSNET V [Capps et al. 2000] to simulate network traffic using adaptable Areas-of-Interest. However, these results are specific to the NPSNET architecture and cannot easily be mapped onto a generic system.
We present a methodology that uses autonomous avatars to simulate large numbers of users. These avatars exhibit behavior that is comparable to that of human users in terms of movement. They can therefore be used to generate network traffic that is representative of real users. By implementing these autonomous avatars, we are able to demonstrate scalability of our ALVIC architecture to at least 1000 simultaneous users, as described in the following sections.
Figure 1: architecture. (a) General Architecture. (b) Areas of Interest.
2 General World Setup
For reasons of clarity and to ease understanding of the presented test results, we summarize the ALVIC architecture in this section. For more detailed information, we refer to [Liesenborgs et al. 2002a] and [Liesenborgs et al. 2002b]. The architecture was built from the ground up with scalability in mind. In a later step, it was extended to include simple video-based communication [Quax et al. 2003], which is outside the scope of this paper. The world as a whole is divided into a number of regions, each of which is associated with a unique multicast address. A client only sends data to the multicast group of the region in which it is located at that time. The data that is sent is made up of high-level actions that comprise both low-level information, such as positions and orientations, and high-level information, such as animation information. Each client has an Area-Of-Interest (AOI) associated with it (Fig. 1(b)), which is determined by e.g. its line of sight. Depending on the regions that are in a client's AOI, multicast groups are joined and left as the client moves around the world. This way, we achieve a highly scalable system in which clients can easily adapt the incoming data flow to the available bandwidth or processing power. Downstream bandwidth is reduced by scaling down the number of subscribed multicast addresses. Distribution of action data in this scheme is automated and requires no server intervention (F4 in Fig. 1(a)). It is important to note that the upstream bandwidth use of any given client is never influenced by the number of active clients in its AOI. At a higher level, the high-level action-based mechanism also makes it easy to adapt the system for use on less powerful devices (mainly handhelds). Details of this adaptation can be found in [Liesenborgs et al. 2002a; Liesenborgs et al. 2002b; Quax et al. 2003].
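The region-based subscription scheme can be illustrated with a minimal Python sketch. It is not the ALVIC implementation: it assumes a square 24 x 24 grid of regions, a square AOI given by a radius in regions, and a hypothetical `group_address` mapping into the administratively scoped 239.0.0.0/8 range. In ALVIC the game server supplies the address of each region; here it is computed locally only to keep the example self-contained.

```python
GRID = 24            # assumed 24 x 24 grid of regions
REGION_SIZE = 100.0  # hypothetical region edge length in world units


def region_of(x, y):
    """Map a world position to the (column, row) of the region containing it."""
    col = min(GRID - 1, max(0, int(x // REGION_SIZE)))
    row = min(GRID - 1, max(0, int(y // REGION_SIZE)))
    return col, row


def aoi_regions(region, radius=1):
    """Regions within `radius` cells of `region`, clipped at the world border."""
    col, row = region
    return {
        (c, r)
        for c in range(max(0, col - radius), min(GRID, col + radius + 1))
        for r in range(max(0, row - radius), min(GRID, row + radius + 1))
    }


def group_address(region):
    """Hypothetical mapping from a region to an IPv4 multicast address."""
    col, row = region
    return f"239.1.{col}.{row}"


def subscription_changes(old_region, new_region, radius=1):
    """Return (groups to join, groups to leave) when the client changes region."""
    old = aoi_regions(old_region, radius)
    new = aoi_regions(new_region, radius)
    join = {group_address(r) for r in new - old}
    leave = {group_address(r) for r in old - new}
    return join, leave
```

With a radius of 1 the AOI covers 9 regions and with a radius of 2 it covers 25. Moving one region sideways with a radius of 1 changes only three subscriptions in each direction, and the client still sends to a single group, which is why its upstream traffic does not depend on how many clients are nearby.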
On the server side of the setup, the master servers are responsible for authentication and/or administration and are used to dynamically forward clients to a specific game server (F1 in Fig. 1(a)). There is a communication infrastructure between the master and game servers to exchange information such as network load and availability of system resources (F2 in Fig. 1(a)). The role of the game servers in this system is to track the multicast addresses that are in use at any given time. If a client moves to a specific region in the world, it is informed by the server of that region's multicast address (F3 in Fig. 1(a)). It is feasible to distribute this server setup among a number of machines, because the only data that has to be synchronized is a list of regions and their associated multicast addresses.
For the results in this paper, we disabled the high-level event mechanism of the framework and resorted to a much simpler method: absolute state is distributed at regular intervals and relative state is distributed every time an action takes place. This helps in generalizing the results, as each specific application may require a different kind of dead-reckoning scheme or none at all. Dead reckoning involves periodically sending absolute positions and directions and having the other users calculate object and/or avatar positions from this data between absolute updates. The results in section 5 are presented without any form of optimization of network traffic. The data presented can therefore be regarded as a worst-case scenario, in which no dead reckoning is possible or very frequent deviations from calculated positions occur.
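For readers unfamiliar with dead reckoning, the idea can be sketched as follows. This is only an illustration of the general technique, which was not used in the experiments reported here. The `AbsoluteUpdate` record, the linear extrapolation and the drift threshold in `needs_update` are all assumptions chosen for the example, not part of the ALVIC framework.

```python
import math
from dataclasses import dataclass


@dataclass
class AbsoluteUpdate:
    t: float        # send time in seconds
    x: float
    y: float
    heading: float  # radians
    speed: float    # world units per second


def extrapolate(update, now):
    """Receiver side: estimate the position at time `now` from the last update."""
    dt = now - update.t
    return (
        update.x + math.cos(update.heading) * update.speed * dt,
        update.y + math.sin(update.heading) * update.speed * dt,
    )


def needs_update(update, now, true_x, true_y, threshold=1.0):
    """Sender side: True if the receivers' estimate has drifted too far."""
    est_x, est_y = extrapolate(update, now)
    return math.hypot(true_x - est_x, true_y - est_y) > threshold
```

Between absolute updates, receivers call `extrapolate`; a sender using a threshold like this would only transmit when its true position deviates from what the receivers are computing. Without such a scheme, every movement produces a packet, which is the worst case measured in this paper.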
3 Autonomous Avatars
As shown in the introduction, in order to obtain realistic network traffic of a massive multi-user environment, it is necessary to set up an actual running system. Because it is impractical to run the system with up to a thousand real users, these users are simulated by autonomous avatars.
Current examples of massively multiplayer NVEs are massively multiplayer online role-playing games (such as EverQuest [Sony Online Entertainment] or Dark Age of Camelot [Mythic]) and virtual communities (such as There [There Inc.] or Second Life [Linden Research Inc.]). In these environments, users explore the environment and interact with other users or computer-controlled characters. Some areas in the environment, such as spawning areas, merchants or easy killing grounds, are more popular than others. As a result, these areas will be more densely populated. To simulate the traffic of these kinds of applications, the behavior of our autonomous agents must fulfill several conditions:
• The users are distributed over all the areas of the environment.
• Some areas are more densely populated than other areas.
• Most of the time, the users are moving around in the environment.
• Sometimes users are inactive for a period of time.
• Not all users move at the same speed.
Due to the need to simulate large numbers of agents on a limited number of computers, there is one additional constraint:
• The behavior of the simulated users must be simple enough to limit the required processing power.
The implementation of the behaviors of the autonomous avatars is inspired by the work of Reynolds [Reynolds 1987]. In his work, Reynolds implements flocking behavior of groups of virtual creatures such as birds or fish. Flocking behavior is implemented through the combination of three simple behaviors that are executed by every individual creature:
• Separation: Move to keep some distance from local flockmates.
• Alignment: Move to the average heading of local flockmates.
• Cohesion: Move towards the average position of local flockmates.
The autonomous avatars are mostly identical to user-controlled avatars. Both have an AOI from which they receive positional information of other avatars. In both cases, the selection of AOIs is based on the current position of the avatar, by subscribing to the AOI of the current position and all adjacent AOIs. Also, autonomous avatars must use the same movement commands as user-controlled avatars.
All the autonomous avatars in the environment are controlled by the same set of reactive behaviors. These behaviors control the avatar based on both internal and external events. As a result, every avatar will react differently in the environment. External events, such as the positions of other avatars, are received through the sensors of the avatar. Internal events are triggered by a random number generator.
In our implementation, we use a separation behavior to avoid collisions between the avatars in the environment. This behavior uses the positions of nearby avatars and calculates a vector that points away from these obstacles (see Fig. 2). To ensure that the autonomous avatars move around and spread over the entire environment, a MoveTo behavior is added. This behavior generates a vector that points towards a specified position in the environment. This vector is added to the vector generated by the separation behavior to obtain the movement vector of the avatar.
When the autonomous avatar enters the world, and later at random times, the target position of the MoveTo behavior is set to a randomly selected point in the environment. As a result, the agent will move around most of the time until the target is reached. At that time, the avatar remains inactive until a new target location is selected, or until other avatars trigger the separation behavior. Because the random positions are chosen uniformly over the entire environment, the areas in the center of the world will be traversed more often than the outer regions. This ensures that these areas will be more crowded, thereby satisfying the conditions specified above.
Network traffic is generated by transmitting positional information. This information includes both the position and the orientation of the avatar. The orientation of the avatar is set to match its current movement direction.
Figure 2: Separation Behavior.
4 Experiment Setup
In this section, the setup of the different hardware components used in the experiment is discussed. A total of 8 systems were used:
• 5 nodes of a cluster setup. Each of these nodes contains a dual Intel Xeon processor running at 2.4 GHz with 2 GB RAM. These nodes were used to run 180 agents each.
• 1 single-processor PC with a Xeon processor running at 2.4 GHz. This PC was used to run another 100 agents.
• 2 single-processor PCs running at 1.7 GHz with 512 MB RAM. One of these PCs was used to run the master server and game server, and the other was used to run either a non-autonomous client to observe the world, or a single autonomous avatar. This PC was also used to capture the traffic of a single human or autonomous client.
All PCs were connected using a dedicated gigabit network. Network traffic was captured live using Ethereal [Ethereal.com n.d.]. When an autonomous avatar is added to the environment, it spawns at a specific location in the environment. If all the avatars were added to the environment simultaneously, they would all be added to the same multicast group, which would nullify the measures taken to make the environment scalable. Therefore, agents are added to the environment gradually, allowing them time to move away from the spawning point. In this test setup, 1000 agents were added to the environment over a period of 20 minutes.
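The agents that populate this setup are driven by the separation and MoveTo behaviors of section 3, whose vector sum gives the movement direction. The Python sketch below shows one way to combine them. It is an illustration under assumptions, not the code used in the experiment: the separation radius, the linear distance weighting, the arrival distance and all function names are hypothetical.

```python
import math
import random


def separation(pos, neighbours, radius=5.0):
    """Vector pointing away from neighbours closer than `radius`."""
    sx = sy = 0.0
    for nx, ny in neighbours:
        dx, dy = pos[0] - nx, pos[1] - ny
        dist = math.hypot(dx, dy)
        if 0.0 < dist < radius:
            # Closer neighbours push harder (weight grows as distance shrinks).
            weight = (radius - dist) / radius
            sx += dx / dist * weight
            sy += dy / dist * weight
    return sx, sy


def move_to(pos, target, arrive_dist=0.5):
    """Unit vector towards `target`, or zero once the target is reached."""
    dx, dy = target[0] - pos[0], target[1] - pos[1]
    dist = math.hypot(dx, dy)
    if dist <= arrive_dist:
        return 0.0, 0.0
    return dx / dist, dy / dist


def step(pos, target, neighbours, speed, dt):
    """Advance one tick; returns (new position, orientation in radians or None)."""
    sx, sy = separation(pos, neighbours)
    mx, my = move_to(pos, target)
    vx, vy = sx + mx, sy + my
    length = math.hypot(vx, vy)
    if length == 0.0:
        return pos, None  # inactive: no movement this tick
    vx, vy = vx / length, vy / length
    new_pos = (pos[0] + vx * speed * dt, pos[1] + vy * speed * dt)
    return new_pos, math.atan2(vy, vx)  # orientation follows movement direction


def random_target(world_size, rng=random):
    """Uniformly random target point inside a square world."""
    return rng.uniform(0.0, world_size), rng.uniform(0.0, world_size)
```

Giving each agent its own `speed` covers the condition that not all users move at the same speed, and an agent whose `step` returns `None` for the orientation is idle until a new target is drawn or a neighbour comes within the separation radius. Both behaviors need only a handful of arithmetic operations per neighbour, which is what makes it plausible to run well over a hundred agents on one machine.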
5 Test Results
This section presents the traffic captured at the server, at an autonomous avatar and at a user-controlled avatar. This traffic includes all protocol overhead of the transmission, such as Ethernet/IP/UDP/TCP headers. Traffic is summed per second and displayed in the charts.
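Summing traffic per second amounts to binning the captured frames by timestamp. The following sketch shows one way to do this in Python. It assumes the capture has already been exported as (timestamp, frame length) pairs; the function name and input format are hypothetical and do not describe the tooling actually used.

```python
from collections import defaultdict


def bytes_per_second(packets):
    """Sum frame sizes into one-second bins.

    `packets` is an iterable of (timestamp_seconds, frame_length_bytes) pairs,
    where the length is the full on-wire frame including all headers.
    Returns a list of totals, index 0 being the first second of the capture.
    """
    packets = list(packets)
    if not packets:
        return []
    start = min(t for t, _ in packets)
    bins = defaultdict(int)
    for t, length in packets:
        bins[int(t - start)] += length
    # Seconds without any packet appear as 0 in the result.
    return [bins[s] for s in range(max(bins) + 1)]
```

For example, `bytes_per_second([(10.0, 100), (10.5, 50), (12.2, 70)])` yields `[150, 0, 70]`: two frames in the first second, none in the second, one in the third.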
5.1 Server traffic
Fig. 3 shows the traffic that was transmitted (3(a)) and received (3(b)) by the server. This communication is used to request and distribute the addresses of multicast groups to clients when they move to different regions. The amounts of sent and received traffic are almost equal. Capturing starts when the first client is added to the system. The amount of traffic gradually rises as more avatars join the environment and levels off when all 1000 avatars have joined. There is a slightly higher amount of traffic when an avatar initially joins the environment, because it requests the addresses of all the multicast groups in its AOI. When the avatar moves around in the environment, only the addresses of the new multicast groups are requested. Note that the total amount of server traffic is low considering the number of users present in the world, particularly when compared to pure client-server based systems.
Figure 3: TCP traffic sent and received by the game server. (a) Sent TCP Traffic. (b) Received TCP Traffic.
Figure 4: UDP, TCP and IGMP traffic sent and received by an autonomous avatar with 9 regions in AOI. (a) Sent UDP Traffic. (b) Received UDP Traffic. (c) Sent and Received TCP Traffic. (d) Sent and Received IGMP Traffic.
5.2 Autonomous avatar traffic
Fig. 4 shows the traffic of an autonomous avatar while it moves around in the environment. It is spawned in a world already containing 1000 avatars and consisting of 576 distinct regions. The regions are defined in such a way that a typical autonomous avatar can traverse each of them within approximately 15 seconds. As explained before, most of the avatars tend to stay relatively close to the center of the world. The autonomous avatar here is configured to always have 9 regions in its Area-Of-Interest, and is therefore always subscribed to 9 multicast groups (its own region and the surrounding ones).
Fig. 4(a) and Fig. 4(b) show the sent and received UDP traffic, which is used to transmit positional data about the avatars. Again, it should be noted that optimizations such as dead reckoning are not used in this experiment. As a result, positional updates are transmitted every time the agent moves. Even when an agent does not move, it sends a positional update every few seconds. Fig. 4(b) shows the received positional updates of other avatars in the client's AOI. Initially, the client is located in a relatively busy part of the world near the spawning point, which results in a large amount of traffic. As the agent moves around, the traffic decreases and increases as the avatar joins and leaves multicast groups. Fig. 4(a) shows the transmitted positional updates of the autonomous avatar. This traffic is relatively constant most of the time, but occasionally a spike occurs. This happens when the autonomous avatar comes near another avatar and must maneuver to avoid it (because of the separation behavior described in section 3). After about three minutes, when the avatar has reached its random target, the traffic drops to near zero. At this time, only the periodic positional updates are transmitted. After some time, a new random target is selected and the avatar starts to move again.
Fig. 4(c) shows the sent and received TCP traffic, which is used to request the addresses of new multicast groups. When the new multicast addresses are received, the avatar joins these multicast groups using an IGMP message. This traffic is shown in Fig. 4(d). It can be observed that the amount of this traffic is relatively low compared to the traffic of the positional updates. This benefits the scalability of the servers, as these only handle the TCP traffic.
Figure 5: UDP traffic sent and received by an autonomous avatar with 25 regions in AOI.
Figure 6: UDP, TCP and IGMP traffic sent and received by a user-controlled avatar. (a) Sent UDP Traffic. (b) Received UDP Traffic. (c) Sent and Received TCP Traffic. (d) Sent and Received IGMP Traffic.
Figure 5 shows the UDP traffic that is sent and received by an autonomous avatar that is spawned in the same world but has 25 regions in its AOI. Note that the sent traffic is the same as in the case with 9 regions in the AOI, but the received traffic is higher. It is therefore clear that the downlink traffic can be throttled by adjusting the extent of the AOI. The difference between the two AOIs is comparable to Fig. 1(b).
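How a client might choose its AOI extent to stay within a downstream budget can be sketched as follows. This is a hypothetical policy, not one described in the paper: it assumes the client knows an approximate avatar count per region and a rough per-avatar update rate in bytes per second, and all names and numbers are illustrative.

```python
def estimated_downstream(avatar_counts, center, radius, bytes_per_avatar_per_s):
    """Estimate received bytes/s from all regions within `radius` of `center`.

    `avatar_counts` maps (col, row) -> number of avatars currently in that region.
    """
    col, row = center
    total = 0
    for c in range(col - radius, col + radius + 1):
        for r in range(row - radius, row + radius + 1):
            total += avatar_counts.get((c, r), 0)
    return total * bytes_per_avatar_per_s


def largest_affordable_radius(avatar_counts, center, budget,
                              bytes_per_avatar_per_s, max_radius=2):
    """Largest AOI radius whose estimated downstream stays within `budget`.

    Radius 0 (the client's own region) is always kept, even if over budget.
    """
    best = 0
    for radius in range(max_radius + 1):
        if estimated_downstream(avatar_counts, center, radius,
                                bytes_per_avatar_per_s) <= budget:
            best = radius
    return best
```

A radius of 1 corresponds to the 9-region AOI of Fig. 4 and a radius of 2 to the 25-region AOI of Fig. 5. Because shrinking the radius only changes which multicast groups are joined, the sent traffic is unaffected, in line with the measurements.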
5.3 User client traffic
Fig. 6 shows the traffic captured for a user-controlled avatar. This avatar is spawned at approximately the same time as the autonomous avatar described above, and under the same world conditions. The traffic is therefore expected to be comparable to the traffic of the autonomous avatar. The user-controlled avatar has 25 regions in its AOI and is therefore subscribed to 25 multicast groups, comparable to the setup of the autonomous avatar with 25 regions in its AOI (see Fig. 5). This results in an almost equal mean amount of received positional updates (see Fig. 6(b)). The sent data (Fig. 6(a)) is higher in volume because of the speed of the user client: this avatar can move through the world at considerably higher speeds than the autonomous avatars. This results in more frequent region changes (more TCP/IGMP traffic) and more positional updates (more sent UDP traffic).
Initially, the avatar remains stationary near the spawning point in the environment. After approximately 200 seconds, the avatar starts moving towards one of the outer regions of the environment, which is sparsely populated by autonomous avatars. As a result, the amount of received positional updates, shown in Fig. 6(b), drops drastically. Then, the avatar moves towards the opposite outer end of the environment (passing through the densely populated center), which shows as an increase followed by another drop in the amount of received positional data. The amount of TCP and IGMP traffic, shown in Fig. 6(c) and Fig. 6(d), is comparable to the traffic of the autonomous agent. A screenshot of the application with around 70 agents and 25 regions in a user's AOI can be seen in Fig. 7.
6 Future Work
We are currently extending the framework to support a Video Area Of Interest, which is analogous to the AOI presented before. In this setup, clients can interact with each other using webcams while retaining the ability to adapt bandwidth easily at client side. Once complete, this will allow us to add the video traffic to our test setup, and thereby to test and demonstrate scalability with a large number of clients using video. The ALVIC framework is also currently being extended with server-based streaming of different representations of avatars, such as detailed geometry, simplified geometry and image-based representations. This streaming is necessary because of the dynamic nature of the environment, as new avatars are detected when they enter a client's AOI. These representations can easily be fitted into the framework, given that they too can be assigned to a specific multicast address per region.
7 Conclusion
We have shown that our methodology of using autonomous avatars to simulate large numbers of users can be used to demonstrate scalability of networked virtual environment applications, and of our ALVIC framework in particular. Using a minimal number of computers, we can generate network traffic that is representative of human clients using the same application. We have also demonstrated that it is feasible to adapt downstream traffic at client side by adjusting the Area-Of-Interest to include fewer or more multicast groups. This switching can be done according to a number of rules, such as the line of sight or the number of avatars in each region. Lastly, the traffic between clients and servers is very low compared to an equivalent client-server based system, in which game event data is not exchanged directly between users.
8 Acknowledgements
Part of this research is funded by IWT project number 020339 and the Flemish Government.
Figure 7: Screenshot.
References
ANDERSON, D. E.A. 1996. Diamond park and spline: A social virtual reality system with 3d animation, spoken interaction, and runtime modifiability. TR 2, MERL.
CAPPS, M. V., MCGREGOR, D., BRUTZMAN, D. P., AND ZYDA, M. 2000. NPSNET-V: A new beginning for dynamically extensible virtual environments. IEEE Computer Graphics and Applications 20, 5, 12–15.
ETHEREAL.COM. Ethereal. World Wide Web, http://www.ethereal.com.
FRECON, A., J-ARO, H., AND STENIUS, S. 1998. Dive: A scaleable network architecture for distributed virtual environments. Distributed Systems Engineering Journal (special issue on Distributed Virtual Environments) 5, 3, 91–100.
FUNKHOUSER, T. A. 1995. RING: A client-server system for multi-user virtual environments. In Symposium on Interactive 3D Graphics, 85–92, 209.
GREENHALGH, C., AND BENFORD, S. 1995. Massive: A collaborative virtual environment for teleconferencing. ACM Transactions on Computer-Human Interaction 2, 3, 239–261.
LIESENBORGS, J., QUAX, P., ET AL. 2002. Designing a virtual environment for large audiences. Lecture Notes in Computer Science Information Networking, Wireless Communication Technologies and Network Applications, 585–595.
LIESENBORGS, J., QUAX, P., ET AL. 2002. Designing a virtual environment for large audiences. In 16th International Conference on Information Networking, 3A–2.1–3A–2.10.
LINDEN RESEARCH INC. SecondLife. World Wide Web, http://www.secondlife.com.
MACEDONIA, M. R., ZYDA, M. J., PRATT, D. R., BARHAM, P. T., AND ZESWITZ, S. 1994. Npsnet: A network software architecture for large-scale virtual environment. Presence 3, 4, 265–287.
MYTHIC. Dark Age Of Camelot. World Wide Web, http://www.everquest.com.
QUAX, P., JEHAES, T., JORISSEN, P., AND LAMOTTE, W. 2003. A multiuser framework supporting video-based avatars. In Second Workshop on Network and System Support for Games, 3A–2.1–3A–2.10.
REYNOLDS, C. W. 1987. Flocks, herds, and schools: A distributed behavioral model. In Computer Graphics, 21(4) (SIGGRAPH '87 Conference Proceedings), 25–34.
SONY ONLINE ENTERTAINMENT. EverQuest. World Wide Web, http://www.everquest.com.
THERE INC. There.com. World Wide Web, http://www.there.com.
TVEIT, A. 2002. Scalability analysis of the zereal massively multiplayer game simulator. Tech. Rep. 12/02, Department of Computer and Information Science, NTNU, Sem Sælands vei 7-9, NO-7491 Trondheim, Norway, November.