A Grid Computing for Online Games

Rafael García Leiva, Ismael Herrero Rodríguez, David Muriel Álvaro
Andago Ingeniería SL, C/ Alcalde Ángel Arroyo 10, 2-4, 28901 Getafe (Madrid), Spain, +34 91 601 13 73
[email protected]

Paul Warren, Ivan Djordjevic, Theo Dimitrakos
British Telecom, Adastral Park, Martlesham Heath, Ipswich, Suffolk IP5 3RE, United Kingdom, +44 (0)1473 609591
[email protected]

Melanie Biette
Atos Origin SA, Av. Diagonal 210-128, 08011 Barcelona, Spain, +34 93 504 35 68
[email protected]

Matteo Gaeta, Nadia Romano
CRMPA, Università degli Studi di Salerno, via Ponte Don Melillo, 84084 Fisciano (SA), Italy, +39 089 964230
[email protected]

ABSTRACT

Andago Ingeniería SL has developed the Andago Games Platform [1], an open source platform which provides the necessary technological base for provisioning online game services based on service strategies like user loyalty, or on business models like subscriptions or micropayments. By using common accounts, names, user interfaces, and policies for all the games (or from each publisher), a disjointed and confusing experience is avoided. However, the platform requires important investments by operators and portals, limiting the number of possible customers. Grid computing will dramatically reduce these investments by sharing resources among different operators and portals. Grid computing also offers the possibility to create virtual organizations, where operators and portals could share games and contents, and even their user base.


Categories and Subject Descriptors

J.m [Computer Applications]: Miscellaneous.

General Terms

Algorithms, Design, Standardization.

Keywords

Online games. CD-ROM based games. Grid computing.

1. INTRODUCTION

With the fast growth of the video games and entertainment industry - thanks to the appearance of new games, new technologies and innovative hardware devices - the capacity to react quickly has become critical for competing in the market of services and entertainment. It is therefore necessary to be able to count on advanced middleware solutions and technological platforms that allow the rapid deployment of custom-made entertainment services.

1.1 Grid Computing

Grid computing is a distributed computing paradigm for solving large-scale computation problems. Grids take advantage of many interconnected computers, modeling a virtual computer where the execution of processes can be distributed over a parallel infrastructure. Grid technology uses resources from multiple individual computers, interconnected through a network, to solve large-scale problems in a global environment. Grid computing provides the capacity to manage large datasets, by dividing them into smaller subsets, and the ability to perform multiple calculations simultaneously, by executing jobs in parallel on different processors.

Grid computing allows resources to be shared in a coordinated, secure and flexible way among groups of individuals and institutions that change dynamically. This is normally known as a Virtual Organization: a non-fixed group of individuals and institutions that share resources in order to achieve a common goal. The Open Grid Forum (OGF) is the community of users, developers, and vendors leading the global standardization effort for Grid computing. The main standards proposed by OGF are:

• OGSA (Open Grid Services Architecture) [2]: integrates key Grid technologies with web services to create a framework based on distributed services. The OGSA vision is to describe and build a new set of standard interfaces and well-defined procedures that constitute a common framework for all applications and Grid systems.

• OGSI (Open Grid Services Infrastructure) [2]: defines mechanisms to create, manage, and interchange information among entities called Grid services. OGSI also introduces new methods and standard interfaces to create and distribute Grid services. Recently, OGSI has been updated to follow a more web services oriented approach, becoming WSRF (Web Service Resource Framework) [3].

2. ANDAGO GAMES PLATFORM

Andago has developed the open source online games platform Andago Games, which provides the technological base necessary for the creation of online games services around which the main entertainment sites will be able to establish solid business models.

Figure 1. Andago Games Platform

2.1 Services for the User

The Andago Games platform allows online multiplayer games channels to be created quickly, with the following services for the final user:

• Pay per play / pay per subscription: the system allows the access and connection times of users to be managed based on various pricing policies. Quality of service policies for different user profiles can also be defined.

• Reserving of gaming rooms or servers and advanced management of games: users can access game servers and configure the specific characteristics of each match.

• Advanced statistics: the platform collects advanced statistics on the games played by the users, generating rankings and statistics.

• Automatic game launch: users have live information about the matches in existence and can connect to them without having to configure the connection. The platform launches the game in the user's terminal and connects it to the selected server with the correct configuration.

• Clans: users can select enemy users, ignored users, or form clans of friends.

• Championships, downloads, chat, etc.

Figure 2. Andago Games Modules

2.2 Functionalities

Besides the user services, the platform provides management tools that allow operators and system technicians to easily manage the service. Using these tools, administrators are able to perform:

• Domain management: using domains, virtual platforms can be created within a physical platform, which allows resources to be assigned to services and the creation of house-branding or ASP games channels.

• Management of users: the administrator has access to the list of users and their specific data. Through the administrator, restrictions on access, invoicing or platform use can be defined.

• Management of servers and games: the channel manager can decide at any moment which resources should be available and to which players. The administrator can start up or shut down servers and matches of the different games supported.

• Management of events: events such as championships, tournaments or leagues can be created according to the specifications of the service contract, usage or game statistics.

2.3 Architecture

The Andago Games platform is composed of the following elements:

• User access site: the main interface between the user and the platform is the web channel. By means of this channel the user can access the different services of the platform available according to his profile (personal information, rankings, clans, championships, reservations, chat, and available servers / games).

• Games Launcher: the Launcher, which can be offered as a plugin or as a standalone application, takes care of initiating, configuring, and connecting the player's game on the client machine. This module obtains the IP addresses, ports and types of the different matches available on the platform. Likewise, it collects data on the games installed by the user in order to start the correct game or to download the necessary patches.

• Lobby Server: the Lobby Server centralizes the logic of the platform and interacts with the rest of the modules: user channel, Launcher, administrators, Watchers and game servers. Communication with the Lobby Server over the Internet is done by means of a communications module, the Console, which gathers commands in a communication protocol (based on SSL and TCP/IP) and converts them into CORBA platform commands (see the sketch after this list). The Lobby Server can share its data model to integrate the platform with provisioning, invoicing, CRM and other systems.

• Administration Web: the administration of the platform is done by means of a web-based administration module, with support for different profiles of administrators and operators. This module makes it possible to manage the service by defining domains, characteristics of servers, games, matches, reservations, user profiles, tournaments, conditions, billing, statistics, etc.

• Watchers: the Watchers are installed on the game servers. Their function is the advanced management of games and the gathering of advanced statistics. The Watchers can start up or shut down games and matches, control the access of users according to profiles and conditions, and gather data on matches, players, and clans for the production of advanced statistics.
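The ACAP console protocol is not detailed in this paper, so the following Java fragment is only a rough sketch of the pattern described for the Lobby Server's Console: accept SSL connections, read line-oriented text commands, and hand them to a backend interface that stands in for the CORBA platform commands. The class names, the command set and the line-based format are hypothetical.

```java
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.io.PrintWriter;
import javax.net.ssl.SSLServerSocket;
import javax.net.ssl.SSLServerSocketFactory;
import javax.net.ssl.SSLSocket;

/** Rough sketch of a Console-style command listener in front of a lobby backend. */
public class ConsoleListener {

    /** Stand-in for the CORBA platform commands the Console would translate to. */
    interface LobbyBackend {
        String listMatches(String gameId);
        String joinMatch(String userId, String matchId);
    }

    private final LobbyBackend backend;

    public ConsoleListener(LobbyBackend backend) { this.backend = backend; }

    public void serve(int port) throws Exception {
        SSLServerSocketFactory factory =
                (SSLServerSocketFactory) SSLServerSocketFactory.getDefault();
        try (SSLServerSocket server = (SSLServerSocket) factory.createServerSocket(port)) {
            while (true) {
                try (SSLSocket socket = (SSLSocket) server.accept();
                     BufferedReader in = new BufferedReader(
                             new InputStreamReader(socket.getInputStream()));
                     PrintWriter out = new PrintWriter(socket.getOutputStream(), true)) {
                    // One text command per line, e.g. "LIST game42" or "JOIN user7 match13".
                    String line;
                    while ((line = in.readLine()) != null) {
                        String[] parts = line.trim().split("\\s+");
                        if ("LIST".equals(parts[0]) && parts.length > 1) {
                            out.println(backend.listMatches(parts[1]));
                        } else if ("JOIN".equals(parts[0]) && parts.length > 2) {
                            out.println(backend.joinMatch(parts[1], parts[2]));
                        } else {
                            out.println("ERROR unknown command");
                        }
                    }
                }
            }
        }
    }
}
```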

Figure 3. Andago Games Architecture

2.4 Technology

The Andago Games platform is built on a distributed development environment based on J2EE, C, C++ and CORBA. Communication over the Internet uses the ACAP protocol (the Andago Games protocol), and internal communication uses CORBA. The default version of the platform uses Red Hat or Debian Linux as the operating system, the JOnAS application server, and MySQL as the database. The design of the platform allows each of the modules to be distributed, providing scalability and load management.

3. GRID COMPUTING FOR ONLINE GAMES

The main disadvantage of the Andago Games platform is that it requires important investments by operators and portals, limiting the number of possible customers. Grid computing will dramatically reduce these investments by sharing resources among different operators and portals. Grid computing also offers the possibility to create virtual organizations, where operators and portals could share games and contents, and even their user base. Technically, the goal is to be able to share expensive resources between providers and to allow billing based on usage. From a business perspective, our goal is to open new commercial opportunities in the domain of online games.

A common problem with online games is that operators, portals and games providers would like to share resources and costs to optimize their businesses, yet business entities are generally required to play all business roles. The European market is still too fragmented, and it is hard to reach the critical mass of users needed to make online games businesses profitable and to ensure resource liquidity. A Grid infrastructure makes it possible to divide tasks among different actors, so that each actor can concentrate on the business it knows best: application developers provide the applications, portal providers create the portals to attract users, and Telcos/ISPs provide the required infrastructure. Such virtual organizations allow for profitable alliances and resource integration. The outcome of a Grid-enabled online games platform will be to provide the middleware to make this collaboration happen. The Grid not only decreases costs for businesses, but also allows a global European market to be created, as applications, infrastructure and users can be shared independently of political and social borders, smoothly integrated and better exploited.

There are also big advantages for users. For example, they will have a larger games offer, better quality of service and certainly cheaper services. Grid-centralized portals would provide thousands of games and entertainment contents from different providers. Today, if one buys a new game and wants to play it online, the user has to connect to a server (possibly) in the USA, unless a local server was set up. Having a Grid infrastructure would largely ease that process: users simply connect to the Grid, play, and join the international community of users.


4. BEINGRID

BEinGRID, Business Experiments in GRID, is the European Union's largest integrated project funded by the Information Society Technologies (IST) research programme, part of the EU's Sixth Framework Programme (FP6). The BEinGRID consortium is composed of 75 partners who are running eighteen Business Experiments designed to implement and deploy Grid solutions in key industrial sectors.

The mission of BEinGRID is to establish effective routes to foster the adoption of Grid technologies across the EU and to stimulate research into innovative business models. To meet these objectives, BEinGRID will undertake a series of targeted business experiments (BEs) designed to implement and deploy Grid solutions across a broad spectrum of European business sectors (entertainment, financial, industrial, chemistry, gaming, retail, textile, etc.). Eighteen business experiments are planned at the outset of the project, with a second competitive call for proposals in the latter stages.

The overall result of the project will be a collection of key middleware components, Grid solutions and successful case studies resulting from the real-world pilots, together with the best-practice guidelines derived from the Grid pilot experiences. The creation of a toolset repository that gathers high-level services, new tools and innovative Grid application solutions will result in a Grid marketplace enabling individuals and organizations to create, provide and use Grid technologies to meet their business challenges.

4.1 Business Experiment 09

Focusing on the leisure and entertainment sector, the BEinGRID Business Experiment 09 will build and assess a distributed application hosting environment that enables network-centric Application Service Providers (ASPs) to deploy and manage their services in a secure and accountable way. In particular, Internet-based multi-player gaming has been chosen as the targeted application area.

BE09 aims to provide a distributed Virtual Hosting Environment (VHE) that allows applications to be securely exposed as managed services and quality of service to be enforced in dynamic federations of users, application service providers and application hosts.

• From a converged IT and communication services provider's perspective, this innovation has the potential to create a new market of network-hosted infrastructure services that are offered as common capabilities over the network and that enable VHE operation and life-cycle management.

• From an application service provider's perspective, this innovation allows businesses to provide, manage and control their own application services, while outsourcing service deployment and management, security, and contract enforcement to the infrastructure services hosted by the network. In addition, they can take advantage of the distributed VHE in order to dynamically scale and load balance by distributing the hosting of their resource-intensive or response-time-constrained applications.

• From an application host provider's perspective, this innovation allows businesses to optimize the use of their resources by accommodating diverse application services from potentially different ASPs that can be provided in distinct contexts and distinct federations that meet different security and quality of service requirements, as required.

Internet-based gaming offers challenging features such as interactivity among multiple players, low latency requirements, and high-performance servers for game execution. Thus, a gaming platform is a suitable driver for the technological and commercial impact evaluation of the business experiment.

The experiment will selectively reuse SOA (Service Oriented Architecture) middleware components and R&D experience from FP5 and FP6 European projects including GRASP [4], TrustCoM [5] and EleGI [6]. Andago acts as an end-user and online games expert, and aims to significantly reduce infrastructure costs and to analyze the viability of a common European base of games users. BT participates in the project as a web services and security design expert, and following the experiment it aims to offer elements of the VHE as common capabilities over a converged network. Atos Origin participates as a technological IT integrator and aims to analyze the applicability of the VHE solution to other economic sectors. CRMPA acts as technology provider and Grid expert developer in .NET environments.

The main technical challenges, to be addressed through the use of standards-based Grid and Web Services technologies, are:

• To architect a scalable solution for efficient and reliable game execution over distributed Virtual Hosting Environments.

• To implement a Virtual Hosting Environment supporting the operation and life-cycle management of federations of Common Capabilities, Game Servers, and Game Instances, and to ensure the applicability of the result across vertical markets.

• To provide a suitable model for securely aggregating: groups of gaming servers hosting game instances in virtual hosts; collections of (converged network) services supporting the execution of a game instance; and communities of gamers sharing the same game instance; while maintaining the separation of administrative authority and concerns between gamer communities, application service providers, service host providers and the enabling infrastructure.

• To provision Internet games to the consumer via non-intrusive, secure, pervasive and scalable technology.

The business experiment combines several technologies: .NET, Java, CORBA, Web Services, and Grid.

5. CURRENT STATE AND EXPECTED RESULTS

The open source Andago Games Platform is production-quality middleware that can currently be downloaded from the Andago web page [1]. The Andago Games Platform has been used in production environments such as the Terra online games web portal.

The next generation of the Andago Games Platform, which will include Grid computing capabilities, is being developed in two phases. Phase I (expected in the second quarter of 2007) will allow game servers to be distributed in a Grid-based environment, allowing different game providers to share resources. Phase II (expected in the first quarter of 2008) will take full advantage of the concept of virtual organizations. The proposed architecture (still in development) for a Grid-based online games platform is depicted in Figure 4.

Figure 4. Grid Based Architecture

6. ACKNOWLEDGMENTS

BEinGRID, Business Experiments in GRID, is a European Union integrated project funded by the Information Society Technologies (IST) research programme, part of the EU's Sixth Framework Programme (FP6).

7. REFERENCES

[1] AGP: http://games.andago.com
[2] I. Foster, C. Kesselman, J. Nick, S. Tuecke. The Physiology of the Grid: An Open Grid Services Architecture for Distributed Systems Integration. Open Grid Service Infrastructure WG, Global Grid Forum, June 22, 2002.
[3] WSRF: http://www.globus.org/wsrf
[4] GRASP: http://eu-grasp.net/english/default.htm
[5] TrustCoM: http://www.eu-trustcom.com
[6] EleGI: http://www.elegi.org

Scaling Online Games on the Grid

Jens Müller and Sergei Gorlatch
University of Münster, Germany
{jmueller|gorlatch}@math.uni-muenster.de

ABSTRACT

Massively multiplayer online games (MMOG) require large amounts of computational resources to provide responsive and scalable gameplay for thousands of concurrently participating players. In current MMOG hosting, large datacenters have to be dedicated to a particular game title. Such static hosting requires a huge upfront investment and carries the risk of false estimation of user demand. The concept of Grid computing allows resources to be used on demand in a dynamic way, and is therefore a promising approach for MMOG service hosting to overcome the limitations of static game provisioning. In this paper, we discuss different parallelization mechanisms for massively multiplayer gaming and Grid computing architecture concepts suitable for on-demand game hosting. The work presented here provides both a state-of-the-art analysis and a conceptual use case discussion for the new European project edutain@grid. This project targets scaling real-time interactive online applications and MMOG, including First Person Shooter (FPS) and Real-Time Strategy (RTS) games, in an on-demand manner using a distributed Grid architecture.

1. INTRODUCTION

Online gaming has become a major worldwide trend and has experienced massive growth during the past years. According to the game search service gamespy [4], at least 250,000 users are online playing First Person Shooter (FPS) games on more than 70,000 servers at any time of the day worldwide. The Steam platform reports 140,000 servers with more than 2.8 million unique users a month for the games hosted on that platform [5]. In the area of Massively Multiplayer Online Role-Playing Games (MMORPG), the number of players has doubled over the last three years and more than 12 million users are currently subscribed to the different games [15].

While the number of players has drastically increased, the basic concepts and technologies for hosting games on the Internet have not changed since the beginning of online gaming. Most game servers have to be manually set up, started and administrated in a static way, which does not allow for automatic service adjustments with regard to dynamic user demand.

We suggest transferring the concept of Grid computing from the academic and business area to the realm of distributed interactive applications in order to make the hosting of online games dynamic. The term Grid [7] originates from the conceptual analogy to the power grid, where computational power can be obtained as easily and transparently as electricity by inserting a plug into a power socket. Although some commercial game-related Grid systems like Butterfly [2] or the BigWorld system [1] are already available, these systems target the MMORPG genre and are barely suitable for running FPS or RTS game sessions. An overall consistent Grid approach not only for hosting the various real-time game genres, but ideally for all real-time interactive applications including e-learning, interactive simulation and training applications, is still missing. Following the idea of Grid computing, the recently started EU-funded edutain@grid project [3] targets developing a practical Grid infrastructure for all real-time interactive applications including online games. The project's topics include improved scalability and dynamic hosting of applications, responsiveness of online games, as well as business models for making the Grid environment economically viable.

This paper summarises our recent work on scalable network architectures for real-time games and discusses scalability dimensions of different online game genres. We present our scalable concept of multi-server game world replication as a feasible approach to scale FPS and RTS games, which so far have only been played in small-scale game sessions. The proxy-server architecture, which we designed as an operational network architecture for our replication approach, provides the basis for our two demonstrator applications: (1) Rokkatan, a scalable RTS-style game, and (2) the QFusion Proxy-Architecture, a port of the FPS Quake 2 using our multi-server replication approach. This replication concept will constitute important parts of the real-time computation and communication framework inside the edutain@grid architecture for scaling a variety of interactive online application classes.

The work described in this paper is supported in part by the European Union through the IST 034601 project "edutain@grid".


2. PARALLELISATION APPROACHES TO SCALING ONLINE GAMES

Small-scale sessions of online games usually run on a single game server. This server runs a game-update loop in a periodic manner, in which it has to receive and process all user inputs, process the user-independent parts of the game (compute the artificial intelligence of NPCs, respawn items, etc.) and send the resulting new state to all game clients. The frequency of the game state update depends on the particular responsiveness requirements of an actual game and ranges from about 5 updates per second for RTS and RPG up to 35 updates per second in fast-paced FPS action games. The update frequency leaves the server a particular maximum time for processing a single loop (less than 30 ms in the case of the responsive 35 updates per second): if the server is not able to finish the calculations in time and send the new state back to clients, then the users will immediately be disrupted in their game immersion due to this computational lag.

Because the server has to maintain the update rate of the periodic real-time state processing, there is a maximum amount of data which can be processed in time. The computation power of a single server is constant, which makes the single-server architecture unable to support MMOGs.
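To make the timing constraint concrete, here is a minimal, self-contained sketch (ours, not from the paper) of such a fixed-rate server loop in Java; the 35 ticks per second and the placeholder World, PlayerInput and Client types are illustrative assumptions.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ConcurrentLinkedQueue;

/** Minimal sketch of a fixed-rate, single-server game-update loop (hypothetical types). */
public class GameServerLoop {

    interface PlayerInput { void applyTo(World world); }   // a queued user command
    interface Client { void send(String stateSnapshot); }  // a connected game client

    /** Placeholder world state: entities, NPCs, items. */
    static class World {
        void update(long deltaMillis) { /* NPC AI, respawns, physics */ }
        String snapshot() { return "state"; }               // serialized game state
    }

    private static final int TICKS_PER_SECOND = 35;         // fast-paced FPS rate
    private static final long TICK_MILLIS = 1000L / TICKS_PER_SECOND;

    private final ConcurrentLinkedQueue<PlayerInput> inputs = new ConcurrentLinkedQueue<>();
    private final List<Client> clients = new ArrayList<>();
    private final World world = new World();
    private volatile boolean running = true;

    public void run() {
        while (running) {
            long tickStart = System.currentTimeMillis();

            // 1. Apply all user inputs received since the last tick.
            PlayerInput in;
            while ((in = inputs.poll()) != null) {
                in.applyTo(world);
            }
            // 2. Process the user-independent parts of the game.
            world.update(TICK_MILLIS);

            // 3. Send the resulting new state to all game clients.
            String snapshot = world.snapshot();
            for (Client c : clients) {
                c.send(snapshot);
            }
            // 4. Sleep for the rest of the tick; overrunning it is the
            //    "computational lag" described in the text.
            long remaining = TICK_MILLIS - (System.currentTimeMillis() - tickStart);
            if (remaining > 0) {
                try { Thread.sleep(remaining); } catch (InterruptedException e) { running = false; }
            }
        }
    }
}
```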

2.1 Scalability Dimensions

In order to scale a game application, i.e., to increase particular characteristics like the number of players without violating the real-time constraints of the game update loop, the processing has to be parallelised. Before discussing different approaches to parallelisation, we summarize three main scalability dimensions identified in our previous work for different MMOG genres:

1. The overall number of participating users needs to be scalable in every MMO game. All these users are connected to a single game session and are generally able to interact with each other.

2. The game world size needs to be scalable in particular in MMORPGs, where the world usually is very large. Scaling the game world size requires increasing (1) the processing power for computing the actions of more actively computer-controlled entities filling the world and (2) the main memory for storing an increasing amount of static terrain geometry and dynamic entities.

3. The maximum player density has to be scalable, especially in action-oriented Player-versus-Player (PvP) games like FPS. In contrast to the huge game world of MMORPGs, these games are played in much smaller environments; users move their avatars to where some action is going on, and thus dynamically create local player clusters with a high density. For MMO versions of such PvP games, the player density has to be scalable in order to provide responsive gameplay for situations with a lot of action.

There have been different parallelisation approaches discussed in academia as well as implemented in commercial games to scale some of these dimensions for different types of genres. In the following, we briefly discuss the well-known zoning concept and our own replication approach.

2.2 Game World Zoning

In the zoning parallelization approach, the game world is partitioned into independent zones. These zones are processed in parallel on several servers, such that the game client has to change the server connection if the user moves his avatar into a different zone. Figure 1 illustrates an example of a game world with four zones.

Figure 1: Game World Zoning (a game world split into zones A-D, each maintained by one of the servers A-D)

Game world zoning is usually incorporated in MMORPGs. Regarding the scalability dimensions discussed above, this approach is very suitable for scaling the total number of users and the overall game world size, as long as the users scatter themselves across the huge game world. However, the third dimension of player density is not scaled, because a particular single zone is only maintained at a single server. If, as for example in an action-oriented FPS game, a lot of players gather together in a large fight, then the corresponding zone server will be congested, similar to the single server in the conventional client-server architecture. Zoning is therefore a suitable and important approach for MMORPGs, where users are encouraged to spread out because, due to advancing avatar levels and proceeding quest lines, only a particular subset of zones is interesting for a particular user. For action-oriented PvP games, however, zoning is not feasible because users are interested in fighting other players and therefore gather together, which dynamically increases the player density and congests a single zone.
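As an illustration of the approach (not code from the paper), a zoned world can be reduced to a lookup that maps an avatar position to the server responsible for the containing zone; the grid layout and server addresses below are invented examples.

```java
/** Minimal sketch of zone-based server assignment for a square game world (hypothetical layout). */
public class ZoneMap {
    private final int zonesPerAxis;       // e.g. 2 => four zones as in Figure 1
    private final double worldSize;       // world is worldSize x worldSize units
    private final String[] zoneServers;   // server address per zone, row-major

    public ZoneMap(int zonesPerAxis, double worldSize, String[] zoneServers) {
        this.zonesPerAxis = zonesPerAxis;
        this.worldSize = worldSize;
        this.zoneServers = zoneServers;
    }

    /** Index of the zone containing position (x, y). */
    public int zoneOf(double x, double y) {
        int col = (int) Math.min(zonesPerAxis - 1, x / (worldSize / zonesPerAxis));
        int row = (int) Math.min(zonesPerAxis - 1, y / (worldSize / zonesPerAxis));
        return row * zonesPerAxis + col;
    }

    /** Server the client must connect to for its avatar position. */
    public String serverFor(double x, double y) {
        return zoneServers[zoneOf(x, y)];
    }

    public static void main(String[] args) {
        ZoneMap map = new ZoneMap(2, 1000.0,
                new String[] { "serverA:4000", "serverB:4000", "serverC:4000", "serverD:4000" });
        // Crossing the middle of the world means a client-side server handover.
        System.out.println(map.serverFor(100, 100));   // zone 0 -> serverA
        System.out.println(map.serverFor(900, 100));   // zone 1 -> serverB
    }
}
```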

2.3 Game World Replication

Our concept of game world replication [10] is an alternative parallelization approach for scaling the density of players in a real-time game session. In this approach, each server holds a complete copy of the game state, as illustrated in Fig. 2, and the processing of entities is distributed among the participating servers: each server has to process its active entities, while shadow entities are maintained at remote servers. After each entity update, the corresponding server broadcasts a corresponding update message.

Figure 2: Game World Replication (servers A, B and C each hold the full game world; an entity is active at one server and a shadow entity at the others, kept consistent through update messages)

The replication concept allows the density of players to be scaled, because the amount of processing available for a particular static region of the game world can be increased this way. If players cluster together in a big fight, then the processing of all the interactions and visibility checks is split up among all participating servers. We implemented this approach in our proxy-server architecture [11] and demonstrated its feasibility in our scalable RTS game Rokkatan [10], which can be played by several hundreds of users in a single session on a small game world.
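The following Java sketch illustrates the active/shadow idea under our own simplifying assumptions (it is not the proxy-server implementation): each server iterates only over its active entities and broadcasts their updates, while updates received from peers are applied to the local shadow copies.

```java
import java.util.ArrayList;
import java.util.List;

/** Minimal sketch of game world replication: every server holds all entities,
 *  processes only its "active" ones and broadcasts their updates so that
 *  remote "shadow" copies stay consistent (hypothetical types). */
public class ReplicationServer {

    static class Entity {
        final int id;
        double x, y;
        boolean activeHere;                 // active on this server, shadow elsewhere
        Entity(int id, boolean activeHere) { this.id = id; this.activeHere = activeHere; }
    }

    interface Peer { void send(int entityId, double x, double y); }   // link to a remote replica

    private final List<Entity> worldCopy = new ArrayList<>();   // full copy of the game state
    private final List<Peer> peers = new ArrayList<>();

    /** One tick: process active entities locally and broadcast their new state. */
    public void tick() {
        for (Entity e : worldCopy) {
            if (!e.activeHere) continue;    // shadow entities are updated by remote servers
            e.x += 1.0;                     // placeholder for real game logic
            for (Peer p : peers) {
                p.send(e.id, e.x, e.y);     // update message to every replica
            }
        }
    }

    /** Apply an update received from the server that owns this entity. */
    public void onRemoteUpdate(int entityId, double x, double y) {
        for (Entity e : worldCopy) {
            if (e.id == entityId && !e.activeHere) {
                e.x = x;
                e.y = y;
            }
        }
    }
}
```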

3. GRID COMPUTING CONCEPTS

A computational Grid allows users to access resources (processing power, storage space, network bandwidth, etc.) in an on-demand fashion. Instead of buying resources and setting them up statically and privately inside academic or business institutions, resources are shared over institutional boundaries by virtual organisations. If a user asks for a particular resource (for example, an SMP server with at least eight CPUs running at 1.2 GHz or higher), then a Grid infrastructure like the Globus toolkit [8] or Unicore [14] acts as a market broker between the user and resource providers for negotiating resource characteristics, usage time and prices. After successful negotiations, the user can start own computations on the remote server by running a binary copied over or by using pre-installed services.

We briefly summarize the main functional characteristics of Grid systems in the following:

• Dynamicity: instead of statically running services regardless of the actual user demand, a Grid allows services to be started and stopped automatically with respect to the demand, and provides resources in a just-in-time manner when they are actually needed by users.

• Scalability: in order to provide a high amount of computational power, the goal of modern Grid middleware is to create a virtual cluster of several servers for a single performance-demanding application.

• Checkpointing and Migration: several Grid infrastructures allow the state of running user applications to be stored, which can be used to periodically checkpoint the state of a long-running computation and restart it from the last state in case of a server crash or other failures. Additionally, this functionality allows a computation to be migrated from one host to another, for example for load-balancing purposes.

• Accounting and Billing: users and service providers usually have their own personal account in the Grid infrastructure, which is used for authentication and billing purposes.

Grid systems and middleware are available which provide the basis for productive Grid environments, especially in the academic area, where for example physicists, meteorologists or geologists run computer simulations in an on-demand manner.

3.1 Grid Computing in the Game Area

Regarding the area of online computer games, some academic and commercial Grid-related infrastructures have been developed and presented. Basically, existing approaches can be distinguished as following one of the two concepts below.

3.1.0.1 Grids for Single-Server FPS

In the current state of the art of FPS game hosting, users rent servers at a flat rate from hosting companies on a monthly basis. Casual users who do not have control over such a server can only play on public servers and are not able to set up an Internet-based session for a closed group of users with their own rules.

Grid systems for single-server FPS allow users to start FPS game sessions in an on-demand manner for short durations. Instead of statically renting a server at a particular hoster, users specify the game and related characteristics like the number of players, private/public game, etc., and the Grid system negotiates these requirements with several hosters participating in the infrastructure. After contracting with a particular hoster, the user can configure game-specific settings like the map being played on and the score or time limit to win. The system then schedules the start of a binary game server featuring the user-specific settings according to the booking. Such a Grid system has, for example, been discussed in [13], and we also presented a prototype of an infrastructure providing this functionality in [12].

Such an FPS Grid system does not use the general Grid concept to its full potential. Regarding the general features described in Section 3, only the dynamicity of the Grid approach and potentially its accounting is applied to the hosting of a particular subclass of online games. Such a single-server Grid using the available game server binaries can neither scale a single game session nor migrate it onto a different host for overall load balancing. However, it still provides an improvement over static server hosting and is a first partial demonstrator of what Grids can provide for online game hosting.
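As a rough illustration of the on-demand booking flow (not an API of any of the cited systems), the request a user hands to such a broker might carry the game, session and booking parameters named in the text; all field, type and method names below are assumptions.

```java
import java.time.Duration;

/** Hypothetical description of an on-demand FPS session request handed to a Grid broker. */
public class SessionRequest {
    final String game;          // e.g. "quake2"
    final int maxPlayers;
    final boolean privateGame;  // closed group with its own rules vs. public server
    final Duration duration;    // short-term booking instead of a monthly flat rate
    final String map;           // game-specific setting configured after contracting
    final int scoreLimit;

    SessionRequest(String game, int maxPlayers, boolean privateGame,
                   Duration duration, String map, int scoreLimit) {
        this.game = game;
        this.maxPlayers = maxPlayers;
        this.privateGame = privateGame;
        this.duration = duration;
        this.map = map;
        this.scoreLimit = scoreLimit;
    }
}

/** The broker negotiates the request with hosters and schedules a game-server start. */
interface SessionBroker {
    /** Returns the address of the scheduled game server, or null if no hoster accepts. */
    String book(SessionRequest request);
}
```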

3.1.0.2 Grids for Multi-Server MMORPG

The user demand for playing a particular MMORPG is dynamic in several dimensions, the most important being: (1) short-time variation of logged-in users depending on the time of day and the day of the week, and (2) a changing total playerbase. The first dimension reflects peak usage times of a constant total subscriber number, while the second dimension usually varies more slowly and reflects the game's overall lifecycle of release, growth, saturation and finally decrease, possibly restarted with the release of expansions. Following these varying user demands, the game provider has to ensure that sufficient computation resources are available.

In order to provide the required flexibility regarding the setup of an MMORPG, different Grid infrastructures have been proposed and commercially applied, for example Butterfly.net [2] or BigWorld [1]. These infrastructures provide a server-side API to define game zones and instances and map them to actual server hosts at runtime. In comparison to Grids for single-server FPS, these infrastructures apply more sophisticated functionality of the general Grid concept (as summarized in Section 3) to online gaming: they enable dynamic game services, scale a single massively multiplayer session by providing zones and instances, and incorporate accounting functionality. However, these Grids especially target MMORPGs and are barely usable for other online gaming genres for which the built-in zoning concept is not appropriate. Additionally, the servers used by a single MMORPG realm still reside at a particular hoster, and there is no option to migrate sessions between data centers for load-balancing reasons and for enabling an open market of MMOG hosting.

4. TOWARDS A COMPREHENSIVE INTERACTIVE APPLICATION GRID

Existing game-related Grid infrastructures mainly target a specific MMOG genre. For optimizing the distribution of server processing power for online gaming overall, a comprehensive approach suitable for all classes of online games is required. The recently started edutain@grid project [3] aims at providing the Grid concept with all its features not only to online gaming, but also to other online interactive multi-user applications like e-learning, training and simulation applications.

In the following, we outline the concept and use cases for a Grid infrastructure which provides dynamicity and scalability for all major types of online games. Our main idea is to scale all the different scalability dimensions introduced in Section 2.1 by combining several scalability approaches suitable for the various game genres. The resulting architecture has to be practically usable, meaning that the complexity and the dynamicity of the multi-server parallelisation has to be hidden as much as possible from the game developer inside a convenient API, without restricting optimization possibilities for a specific application implementation. Our concept, therefore, follows the familiar paradigm of game entities and game-loop-centric processing.

In particular, a comprehensive infrastructure has to support zoning, replication and instancing of particular game world regions. The overall resulting concept is illustrated in Figure 3.

Figure 3: Comprehensive Scalability Framework (zone servers, replication servers and instance servers attached to a single game world)

The particular combination of zoning and instancing is already practically used by commercial MMORPGs. However, using replication in combination with zoning is a novel concept which allows the density of players inside a particular zone to be scaled. Combining these different approaches allows all three main scalability dimensions to be scaled for a single application instance and therefore results in a parallelisation architecture generally suitable for scaling all classes of multiplayer games.
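To make the combination concrete, here is a small illustrative sketch (our assumption, not edutain@grid code) of a directory that resolves game world regions to servers: instanced areas map to dedicated instance servers, while a regular zone maps to one or more replica servers.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Illustrative directory combining zoning, instancing and replication. */
public class WorldDirectory {
    private final Map<String, List<String>> zoneReplicas = new HashMap<>();   // zone -> replica servers
    private final Map<String, String> instanceServers = new HashMap<>();      // instance id -> server

    public void addZoneReplica(String zoneId, String server) {
        zoneReplicas.computeIfAbsent(zoneId, z -> new ArrayList<>()).add(server);
    }

    public void addInstance(String instanceId, String server) {
        instanceServers.put(instanceId, server);
    }

    /** Servers a client needs for a zone; several entries mean the zone is replicated. */
    public List<String> serversForZone(String zoneId) {
        return zoneReplicas.getOrDefault(zoneId, Collections.emptyList());
    }

    /** Server hosting a private instance (dungeon, mission) of a world region. */
    public String serverForInstance(String instanceId) {
        return instanceServers.get(instanceId);
    }
}
```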

5. DYNAMIC SCALING OF GAME ENVIRONMENTS

While the overall architecture illustrated in Figure 3 combines the different scalability approaches, an enclosing Grid infrastructure is still required to provide server resources for the zones, instances and replicas in a dynamic manner. In the following, we illustrate two main use cases of dynamically mapping game world regions to servers resulting from particular user demand and behaviour.

5.1 Dynamic Clustering of Users

In Player-vs-Player scenarios using several zones, it can be expected that users dynamically gather in a particular area and fight each other. The corresponding zone should then be replicated using several servers to scale the density of users, as illustrated for the bottom right zone in Fig. 4(a).

Figure 4: Dynamic Clustering of Users. (a) Heavy fight at bottom right zone; (b) fight moving to bottom left zone.

In such a scenario of a fight with a high user density, it can be expected that users eventually move over to an adjacent zone. In Figure 4(b), the bottom right zone becomes less frequented because users move over to the bottom left zone. This zone now has to be replicated in order to scale the density of users, while the replication of the previously frequented zone can be lowered due to decreasing load. Our concept of the comprehensive scalability framework supports dynamic adding and removing of replications, and the overall Grid infrastructure has to dynamically reassign the replication servers to zones according to the user behaviour.
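Such a Grid-side reassignment could, for example, be driven by the observed player count per zone. The sketch below is a simplified illustration under assumed thresholds and a hypothetical provisioning interface; it is not part of the framework described here. Called periodically, it adds replicas while players cluster and releases them as the crowd moves on, mirroring the transition from Fig. 4(a) to 4(b).

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Minimal sketch of demand-driven replica assignment per zone. */
public class ReplicaScaler {

    /** Stand-in for the Grid infrastructure that provides and reclaims servers. */
    interface GridProvisioner {
        String acquireServer(String zoneId);
        void releaseServer(String zoneId, String serverId);
    }

    private static final int PLAYERS_PER_REPLICA = 150;   // assumed capacity per server

    private final GridProvisioner grid;
    private final Map<String, List<String>> replicasPerZone = new HashMap<>();

    public ReplicaScaler(GridProvisioner grid) { this.grid = grid; }

    /** Called periodically with the observed player count of a zone. */
    public void rebalance(String zoneId, int playerCount) {
        List<String> replicas =
                replicasPerZone.computeIfAbsent(zoneId, z -> new ArrayList<>());
        int needed = Math.max(1, (int) Math.ceil(playerCount / (double) PLAYERS_PER_REPLICA));

        while (replicas.size() < needed) {                 // players clustering: add replicas
            replicas.add(grid.acquireServer(zoneId));
        }
        while (replicas.size() > needed) {                 // crowd moved on: hand servers back
            grid.releaseServer(zoneId, replicas.remove(replicas.size() - 1));
        }
    }
}
```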

5.1.0.3 Increasing Instance Demand

As another example of how our concept of the overall scalability framework can be dynamically orchestrated by a Grid infrastructure respecting the actual user demand, let us imagine that the users are distributed across several zones of the virtual world. Besides the zones, there are particular instanced areas which are only barely frequented in the beginning, as illustrated in Fig. 5(a).

Figure 5: Dynamic Clustering of Users. (a) Low instance usage; (b) high instance usage.

Especially in MMORPGs, it is a common scenario that instance utilization increases drastically during night time, because users pre-arrange groups to adventure collaboratively in such usually difficult areas. As a result, many more instance servers are required, as illustrated in Figure 5(b), and the general zoned game world might be less frequented. A Grid infrastructure therefore has to be able to dynamically increase the number of instance servers and possibly combine zones to reassign zone servers to instances.

6. RELATED WORK AND CONCLUSION

In this paper, we summarized the main scalability dimensions of online games and gave an overview of existing scalability approaches. The zoning concept [6, 9], which is widely used by existing MMORPGs, scales the total number of users and the game world size. For scaling the density of players, however, our replication concept using the proxy-server architecture [11] is more feasible. As a general result of this discussion, we outlined our approach of a comprehensive scalability framework which combines zoning, instancing and replication, suitable to scale all classes of online games.

We participate in the recently started edutain@grid project and are currently working on the discussed comprehensive scalability framework to add the required scalability features to the Grid infrastructure. Besides scalability, we outlined the three other functional characteristics of Grid systems, dynamicity, migration and accounting, which provide an enormous improvement over the currently mostly static online game hosting. We discussed the current state of the art of game-related Grid infrastructures, which target specific game genres and do not yet provide the full benefits of Grid computing to general online game hosting. The two presented use cases of moving player density and increasing instance demand provide, among other scenarios, the challenges for advanced game Grid infrastructures and will be addressed in the edutain@grid project.

7. REFERENCES

[1] BigWorld technology.
[2] Butterfly.net.
[3] edutain@grid project.
[4] gamespy.
[5] Steam platform.
[6] W. Cai, P. Xavier, S. J. Turner, and B.-S. Lee. A scalable architecture for supporting interactive games on the internet. In Proceedings of the 16th Workshop on Parallel and Distributed Simulation, pages 60-67, Washington, D.C., May 2002. IEEE.
[7] I. Foster and C. Kesselman, editors. The Grid: Blueprint for a New Computing Infrastructure. Morgan Kaufmann, 1998.
[8] Globus Alliance. Globus toolkit.
[9] B. Knutsson, H. Lu, W. Xu, and B. Hopkins. Peer-to-peer support for massively multiplayer games. In IEEE Infocom, March 2004.
[10] J. Müller and S. Gorlatch. Rokkatan: scaling an RTS game design to the massively multiplayer realm. Computers in Entertainment, 4(3):11, 2006.
[11] J. Müller, S. Fischer, S. Gorlatch, and M. Mauve. A proxy server-network for real-time computer games. In M. Danelutto, D. Laforenza, and M. Vanneschi, editors, Euro-Par 2004 Parallel Processing, volume 3149 of Lecture Notes in Computer Science, pages 606-613, Pisa, Italy, Aug. 2004. Springer-Verlag.
[12] J. Müller, R. Schwerdt, and S. Gorlatch. Dynamic service provisioning for multiplayer online games. In J. Cao, W. Nejdl, and M. Xu, editors, 6th International Workshop on Advanced Parallel Processing Technologies APPT 2005, volume 3756 of Lecture Notes in Computer Science, pages 461-470, Hong Kong, China, October 2005. Springer-Verlag.
[13] A. Shaikh, S. Sahu, M. Rosu, M. Shea, and D. Saha. Implementation of a service platform for online games. In Proceedings of ACM Network and System Support for Games Workshop (NetGames), Portland, Oregon, USA, September 2004.
[14] Unicore Forum e.V. Unicore-Grid.
[15] B. S. Woodcock. MMORPG chart.

How to Fail: Mobile Game Design in a Research Project Involving Software Prototype Development

Elina M.I. Koivisto
Nokia Research Center, Visiokatu 1, 33720 Tampere, Finland, +358504821630
[email protected]

Mark Ollila
VITA, Linköping University, Norrköping Campus, Norrköping, Sweden, +4611363000
[email protected]

ABSTRACT

This paper presents how failure can occur in mobile game design when using software prototype development to demonstrate game concepts. We examine three case studies of mobile game research projects which can be considered failures because the game prototype was not finished in time or at all, or because the game could not be tested with end users in large-scale pilot testing. In every case the game design had an impact on this. The first case study examines a slow-update game, the second a cross-platform pervasive game, and the third a strategic persistent world game. All of the projects are relatively large scale, involving between 2 and 9 person-years of work effort. We present our findings and provide recommendations on how to avoid these mistakes in the future.

Categories and Subject Descriptors

K.8.0 [Computing Milieux]: Personal Computing – General. Games

General Terms

Management, Measurement, Design, Experimentation, Human Factors.

Keywords

Mobile, Game Design, Process, Failing

1. INTRODUCTION

It is often the case that only successes are presented in terms of research results. However, often more is learnt from projects that fail. In this paper, we examine three cases of mobile game design using software prototype development where the project can be considered a failure in one way or another. In the first project, the prototype was finished, but developing it took six times longer than originally planned. The second project did not manage to produce a fully functional software prototype that could have been tested with the end users; the result came close, but the project was shut down before the prototype could be finished. The third project created a fully functional prototype that was tested both in the laboratory with potential players of the game and in internal small-scale field testing. However, external large-scale field testing, which was the goal of the project, was not conducted. It must be noted that none of the projects were complete failures: the first one eventually got finished in the end, the second one managed to produce useful research data by using other methods than software prototyping, and the third one was developed further into a commercial game with a slightly different feature set.

Research often works with the development of software prototypes. When considering the research questions, there are usually two approaches: bottom-up or top-down. Depending on the context, the top-down and bottom-up approaches have different meanings [9]. In this context, in the top-down approach the main research questions are already set at the beginning of the project and the prototype is specifically designed to answer those research questions. The bottom-up approach means that the researchers create a prototype that could potentially evoke interesting research questions and provide data that can be used to solve them. The research can also be done in an iterative manner, where the evaluations of the prototype guide the design at each phase of the project. The research method in all of these projects was top-down. All of the example projects used an iterative game design process (see [5] for a mobile game design process that is used in some of our projects).

The setting of the research is also a consideration, with closed research existing alongside open innovation [1]. The closed approach means doing research strictly within one company. The open innovation approach acknowledges that there are good ideas that can be used both inside and outside a company. In practice, this means taking a more open approach and collaborating with external parties when doing research. Of our projects, the first and second case studies are examples of open innovation and the third case study is a closed research project.

The length of the research influences the research project. The third case study was a rather short-term research project and it was set up to explore technology that was about to be released in the market. The first and second case studies were long-term research, where completely new kinds of game types and approaches were researched. Long-term research is often more challenging since there are more unknowns and the technologies that are being used are not always available. Sometimes creating a fully functional software prototype is not even an option and alternative prototyping methods need to be used instead [6]. Also, evaluating game prototypes in long-term research projects is, in our experience, more difficult, because the test players are exposed to completely new kinds of concepts and ideas.

The scale of the project has an effect on the game design, project organization, and software development methods that are being used [14]. Games in general are hard to develop, and players' expectations towards a game are a lot higher now than they were twenty years ago [1]. This increases the pressure to create research game prototypes that have more depth and better looking graphics. Even if the game is only a research prototype, it should keep the players interested long enough to produce data that can be used for research, or be appealing enough to demonstrate the game design to customers effectively. The projects presented in this paper ranged from two to nine person-years of effort, and are all rather large scale for research projects.

The criteria for success and failure can vary, as we can see in the example projects. When a software prototype is developed to demonstrate a game concept, one very clear criterion is producing a playable prototype that can be evaluated with the end users. All of the projects presented in this paper had problems with this: the first one took six times more time than planned to finish, the second one was never completely finished, and the third project finished a playable prototype but, due to the immaturity of the technology, failed to perform a large-scale pilot. There are several reasons for these kinds of failures, and often the project management and the game development process play a big role. Even if a project fails to produce a game that is enjoyable and fun to play, it could be successful if it could be used to gather meaningful research data or to prove that something cannot be done. In our paper, we focus on the problems that are closely related to the game design and can lead to not being able to gather research data with the software prototype. Our paper is organized in the following way. Section 2 describes the research method, with Section 3 presenting three case studies. Section 4 shows our findings, followed by conclusions.

2. RESEARCH METHOD The research method was participatory research [4]. The researchers were active participants in the projects. The other methods, in addition to observation, were interviews with other participants in the projects, and analysis of project meeting notes and communication exchanges (when available).

Figure 1: Material used to produce physical prototype.

3. CASE STUDIES

The timing of the project was bad, since there was not reserved enough time for finishing the game design and the game implementation was supposed to start during the summer vacation. The game design is the most important part of the game project, but still surprisingly often underestimated [8]. The game design document ended up being too vague.

We chose three games where the authors have participated in some form in the past as examples of research game development projects that could be considered as failures. The reason for choosing these particular projects was that they all were very different kinds of projects. The projects were organized differently and they all demonstrate different ways of failing. In the following, we describe the game projects, how did they fail, and our view on why did they fail.

A summer trainee was hired to implement the game. The summer trainee was young, only eighteen, but a relatively experienced programmer. He did not have any experience on application development for the software platform that was used for implementing the game.

3.1 Case Study 1: A Slow-Update Game Our first case study is a slow update game that the players play asynchronously. The game is supposed to adapt to different kinds of social situations and play styles. In this game, the players can play the game with their own schedule, during the short moments when they do not have anything else to do, log out of the game, and continue when they have time again. The players can also choose how involved they want to be in the game by taking different kinds of roles. A quite unusual aspect of the game is the way the players join in the game: new players can only join a game instance when they are invited by other players.

The summer trainee worked on the game and in the end, he managed to get together something that barely worked and there was no graphical user interface, the game was text-based only. The purpose of the prototype was to pitch the game concept for a game publisher. The game was pitched to a publisher, however, the prototype at its current state did not help in pitching. The game was not developed further for about a month until there was a need for a similar game prototype in another project. The project decided to finish the game prototype, and at this point, the requirements for the prototype also got somewhat higher. Instead

12

There were stakeholders from five organizations in the project. Each of them had research questions they were interested in. The project had a main research question of finding out how it would be like to play a cross-platform MMOG, but in addition to that, several sub research questions were created. The extra research questions lead to a situation where it was difficult to know if the players enjoyed or disliked the features that were related with the main research question or the other ones. The huge effort that needed to be spent to develop the software platform also restricted the time that could have been used for implementing any extra functionality.

Instead of a prototype that could be tested and demonstrated with some users, the game now needed to be a fully playable prototype that could be tested with end users in large-scale field testing. Another, more experienced, trainee was hired to work on implementing the game. However, he did not finish the implementation in time, so in the end a third, experienced programmer implemented the game. The prototype was eventually implemented, but the project lasted eighteen months instead of three, which was six times longer than originally planned. In this sense, the project can be considered a failure. Since the field trial is still ongoing, we cannot yet conclude whether the prototype failed in some other sense.

In the first design meeting, all the stakeholders were allowed to voice their personal research goals for the project. This ended up in a huge list that included research topics ranging from exploring social systems in games to characterization. This was seen as important because some of the researchers were not paid by the project; they needed to get something out of the prototype for their own research as well in order to be able to contribute. This process created considerable feature creep. There were far more researchers in the project who took the role of a concept designer than researchers who would have worked on implementing or fine-tuning the game. The approach where the game design is a compromise of several participants' game ideas or requirements is often called "design by a committee" [12]. A game that is designed by a committee and built on compromises often ends up with boring or bad gameplay. After the first version of the game design document, the core design team removed many features to make the game more focused.

There were several reasons why the project ended up being a partial failure. First of all, the scale of the project was far too big to implement with the people who were allocated to the project, and this was not taken into account well enough in the project planning. The game seemed rather simple, and in the beginning it actually was simpler. However, the helpful improvement suggestions of other game designers led to an even more complex design that was in the end far too big. The game design was partly ambiguous, which led to problems in the user interface (UI) design and implementation. There was not enough time for the game design, and it was rushed because the designers needed to go on their summer holidays. There was communication to some extent between the design team and the programmer, but because he was not experienced with the development platform that was used, he could not help the design team by protesting against overly complex design ideas that could not be implemented in time.

The game concept attempted to create a large-scale virtual environment. In order for it to become a real social environment, a large amount of content would have been needed to keep the players playing long enough. The project did not have enough resources to implement the content, and the final software prototype consisted of graphics made of simple basic shapes, such as squares and balls.

Failing to finish the project in the first place led to more problems. The programmer of the game changed twice, and every programmer first had to familiarize himself with the existing system. This was time consuming, and it might actually have been faster to start programming from scratch.

The project members were geographically distributed. The core design team was quite isolated from the rest of the team. There was not that much communication between the project members. The project manager did not encourage it either, which made the core team even more isolated from the rest of the project members who did not know what was going on in the project.

3.2 Case 2: A Cross-Platform Pervasive Game The game project in our second case study attempted to implement a cross-platform Massively Multiplayer Online Game (MMOG) where the players could play the same game with a mobile phone and a stationary computer. The mobile players were expected not only to access the game with their mobile devices, but also to affect the game world with context-aware mobile gameplay [12] and to use the location of mobile players to change the game state.

One of the partners in the project was a relatively small game company. The company was going through some major changes, which badly affected the implementation of the mobile client that the company was responsible for. The core design team did not feel that they had enough power to require the other partners in the project to deliver what had been agreed on. This led to a problematic situation in which not all the stakeholders could deliver what they had promised on time.

The scale of the project was a huge problem in the first place. There were very limited resources for the implementation; however, using existing platforms for the mobile and PC clients gave some hope that the effort would not be so large. This turned out not to be true as the project proceeded, and it was noticed that not as much code from the existing platforms could be reused as expected.

One big obstacle to getting the project ready for field testing with end users was the use of a location service provided by a mobile operator. Just before the field testing was going to take place, the operator was no longer interested in providing location data for the project. Changing the location tracking so that it would use some other technology was no longer possible by the time the project management found this out. Using GPS (Global Positioning System) had been considered in the beginning of the project, but was rejected by the commercial partners since it was not commonly used by the current users.

In the beginning of the project, it was also considered whether the project should take advantage of an existing MMOG and its user base. However, this approach was not used, since the negotiations with commercial partners would have taken too long to fit the project schedule.


The main obstacle for getting the game out in the field testing was both technological and usability one. At the time when the game was finished, SIP was a very new technology. Setting up a SIP account was difficult even for an expert user and getting a SIP account required collaboration with a test laboratory.

There was a conflict between regarding the prototype as a pre-commercial product and regarding it as a concept demonstration. The problem with the location service had been anticipated earlier by one of the project partners, who had similar experiences from implementing another location-based game; however, these concerns were not listened to. The project could be considered a failure because the field testing with the prototype could not be done and the project was ended earlier than originally planned. The ending of the project was due to the problems with the mobile developer and to the other partners in the project not being interested in investing more resources in further development. Some concept testing and focus groups were done on the basis of the game design, but the software prototype was never tested with end users.

4. FINDINGS Our case studies demonstrated how to fail to design a good game prototype in research projects where software prototypes are developed. This section summarizes the reasons for failing that we found in these projects and gives guidelines for bad mobile game software prototype design.

4.1 Game design Innovate when not needed. Too innovative game concepts may confuse the players and also make it more difficult to gather research data (unless the innovative features are related to the research questions). It is very difficult to explain complex concepts to mobile game players within the limited screen real estate. When evaluating the game, the most important thing is the research questions; if other aspects of the game are very innovative, they can interfere with the results, making it difficult to get data on the research questions. This was noticed particularly in the third case study.

3.3 Case 3: A Strategic MMOG Our third case study is a strategic MMOG for mobile phones. In this game, the players gather game resources, build teams, and duel against other players. A massive number of players share the same game world, trade game resources, and communicate with each other, but a battle is a duel between two players. The project was set up to research how the Session Initiation Protocol (SIP) can be used to create interesting game features. In practice, this meant that the players of this game could invite another player to duel or trade even if the other player was not logged into the game at the moment.

Try to solve too many research questions with one prototype. This is related to creating too innovative game concepts when not needed. If one prototype is used to address several research questions, it may be difficult, when evaluating the game, to find out how the players liked different features if those features interfere with each other. The second case study demonstrates this approach.

The project was done in collaboration with a game developer company. The project itself was quite successful: it was planned well and most of the planned features were implemented in the game. The game itself was generally fun to play, which was verified in playtesting. However, the project could be considered a failure in the sense that the large-scale pilot testing that had been planned with end users was never carried out.

Set up a committee to design the game. “Design by a committee” (see e.g. [10]) will lead to a compromised game design and extra features that would not be needed in the game. This was clearly demonstrated in the second case study where all the research partners were allowed to put features related to their personal research questions in the game concept. The first case study also had some design-by-a-committee features.

The game design re-used features from popular collectable card games; however, the story of the game world and some features were rather innovative. We found in the project that it is very difficult to explain new innovative concepts or game worlds to the players on the mobile platform. This is due to the limited screen real estate and the need for relatively short play sessions.¹

Add in all the features that could potentially make the game more fun to play. A game that includes every feature one could imagine is often said to suffer from feature creep. A game that does not have a clear focus can be confusing for the players. In mobile game research projects the resources for implementing the game are often very limited, which is why particular caution is needed when adding new features, even in the design phase. Failing to limit the number of features in the game was one of the most important reasons for the delays in our first case study.

The game was difficult to play and there was a clear need for a tutorial. However, a tutorial is often left to be implemented last, and eventually there were not enough resources left in the project to develop a proper tutorial. The initial test results from the first case study also show that the players would have needed more guidance in the game. The project was funded by a third party that was not used to dealing with game developers, but rather with technical subcontractors. This led to some communication and agreement problems in the game project, which, however, were overcome in the end.


Design the game so that the implementation needs to use immature technologies. Using immature technologies is very risky both when considering finishing the implementation successfully and evaluating the game. In the implementation, the immature technologies cause a lot of trouble because of missing documentation, functionalities, development tools, and software bugs. When evaluating the game, setting up the test environment can be difficult or even impossible. Particularly in field testing it is often a problem that the players do not have the devices that are needed for playing the game. However, research projects often need to use immature technologies and the only thing that can be done is just to acknowledge this risk.

¹ Actually, in playtesting, we have found that the play sessions are not necessarily always short when playing mobile games. However, it is good design practice in mobile games to allow interruptions [6].



Choose a game concept that inherently requires creating huge amounts of content. This problem is very relevant in research game projects, since there typically are not enough resources to create large amounts of content for the game. However, there are a few solutions; for instance, utilizing user-created content can be feasible in some cases.

6. ACKNOWLEDGMENTS We would like to thank the following people for useful comments and insight: Craig Lindley, Ciaran Harris, Ari Koivisto, Jouka Mattila, Riku Suomela, Juha Arrasvuori, Jussi Holopainen, Tommy Palm, and Mirjam Eladhari.

Do not learn from the past. As seen in the second case study, valuable information can be learned from previous projects. One of the project partners knew that there might be problems with collaborating with operators to get location data. However, this risk was not taken into account and that was the main reason for failing to test the prototype at all.

7. REFERENCES [1] Blow, J. 2004. Game Development: Harder Than You Think. Queue 1, 10 (Feb. 2004), 28-37.

4.2 Other reasons

[2] Brooks, F. The Mythical Man-Month. Addison-Wesley, Reading, MA, 1975.

Even though the focus of this paper is to point out what in the game design may lead to failing to produce a playable game prototype, our case studies also demonstrate other issues, mainly related to project organization and management. We briefly list those findings here.

[3] Chesbrough, H.W. Open Innovation: The New Imperative for Creating and Profiting from Technology. Harvard Business School Publishing, Boston, MA, USA, 2003. [4] Cornwall, A. and Jewkes, R. What is Participatory Research? Soc. Sci. Med. 41, 12 (1995), 1667-1676.

Underestimating the size of the project is a typical problem in software projects (see for instance [10]). The same problem was an important reason for failure in the first and second case studies. There were problems with communication within the team and with assigning clear responsibilities to each partner. This can be a particular problem in open innovation if it is combined with distributed implementation. In the first case study, the ambiguity of the game design document hid the actual complexity of the game and made the implementation more difficult when the design document had to be interpreted during UI design and implementation.

[5] Koivisto, E.M.I. and Palm, T. Iterative Design of Mobile Games. In Proceedings of Game Design & Technology Workshop, Liverpool, UK, 2005. [6] Korhonen, H. and Koivisto, E.M.I. Gameplay Heuristics for Mobile Games. In Proceedings of MobileHCI, Helsinki, Finland, 2006. [7] Manninen, T. 2002. Contextual Virtual Interaction as Part of Ubiquitous Game Design and Development. Personal Ubiquitous Comput. 6, 5-6 (Jan. 2002), 390-406.

It makes a huge difference to have experienced developers design and implement the game. This difference can be seen between our first and second case studies and the third one, which was created together with a game development company. Also, in the first case study, an experienced programmer was able to finish coding the software prototype where the less experienced ones had failed. Switching programmers in the middle of a project leads to delays, since the new programmers have to familiarize themselves with the old code. This can be seen in the first case study, and it is also acknowledged in the software engineering literature [2].

[8] Martin, J. and Smith, C. 2002. A cross-curricular team based approach to game development. J. Comput. Small Coll. 17, 5 (Apr. 2002), 39-45. [9] McFarland, G. 1986. The benefits of bottom-up design. SIGSOFT Softw. Eng. Notes 11, 5 (Oct. 1986), 43-51. [10] Pemberton, S. Reflections: the design of notations. Interactions 8, 2 (March 2001), 126-128. [11] Pressman, R.S. Software Engineering: A Practitioner's Approach, Third International Edition. McGraw-Hill, Singapore, 1992.

5. CONCLUSION

[12] Rouse, R. III. Game Design Theory and Practice. Wordware Publishing, Plano, TX, 2001.

We gathered and analyzed data from three projects. Each of them failed in some way, and we focused on the reasons for failure that originated from the game design. Most of the game-design related reasons for failure in the case studies had to do with designing, for one reason or another, too complex a game. These reasons included innovating when it is not necessary, design by committee, and adding in all the possible features that could potentially make the game more fun to play. Other reasons were a design that requires the project to create large amounts of gameplay content and a design that requires the use of immature technologies, which often cannot be avoided but should be acknowledged as a risk in the project.

[13] Suomela, R. Constructing and Examining Location-Based Applications and Their User Interfaces by Applying Rapid Software Development and Structural Analysis. Ph.D. Thesis, Tampere University of Technology, Finland, 2006. [14] Vazquez, F. Selecting a software development process. In Proceedings of the Conference on Tri-Ada '94 (Baltimore, Maryland, United States, November 06 - 11, 1994). C. B. Engle, Ed. TRI-Ada '94. ACM Press, New York, NY, 1994 209-218.

Future work will involve more detailed analysis of past projects and greater collaboration with commercially developed mobile game design projects. Additionally, we will examine the role of the target device and device features in trying to understand how the complexity of the device influences the desired complexity and feature creep in the game design.


Alice: Using 3D Gaming Technology to Draw Students into Computer Science Caitlin Kelleher School of Computer Science Carnegie Mellon University Pittsburgh, PA 15213 USA +1 412 268 4074 [email protected] Although many factors are contributing to students’ loss of interest in computer science, one contributing factor is that students often find their first experience with computer science uninspiring. Typical beginning programming assignments such as sorting a list of numbers or performing a simple calculation fail to engage many students. The Alice programming environments are examples of applying 3D gaming technology to improve students’ experience with learning to program computers. In this paper, I will discuss two existing versions of Alice: 1) Alice 2 which targets college introductory programming students and 2) Storytelling Alice which is designed for middle school girls. In both versions of Alice, students learn to program through animating the motions of 3D objects in a virtual world.

ABSTRACT

The goal of the Alice project is to leverage 3D gaming technology to develop programming environments that will provide beginning programmers with a positive first experience with programming. At the college level, usage of Alice 2 helps at-risk computer science majors to succeed in introductory programming and almost doubles the number of students who choose to continue in the computer science major. Storytelling Alice, a version of Alice designed to support the creation of animated movies, makes learning to program attractive to middle school girls; more than half of the girls who used Storytelling Alice snuck extra time to continue working on their programs.

Keywords

Programming environments, novice programmers, evaluation, storytelling, gender

INTRODUCTION

Despite wide-spread usage of computers and computer technologies, only a small, unrepresentative sample of the population is involved in creating new computer technologies. Broadening and diversifying the group of people who create new computer technologies has two potential benefits: 1) a larger, more diverse group will help ensure that computer science attracts the talent that the discipline needs and 2) a more diverse group of people involved in the design of new technologies will help to ensure that new technologies meet the needs of our diverse society.

This problem is of particular importance at the current time because student interest in computer science has dropped dramatically in recent years. In the United States, the number of incoming college freshmen who intend to major in computer science dropped by 70% between 2000 and 2005 and the number of students enrolled in computer science programs at research universities has dropped by 50% [9]. Despite decreasing student interest in computer science, there is still a strong need for computer scientists. The United States Bureau of Labor Statistics predicts that there will be nearly 1.4 million computer science related job openings between 2004 and 2014 [5].

ALICE 2

Alice 2 [2] is a programming environment for novice programmers that allows users to construct programs that control the motions of 3D objects in a virtual world. To create a program in Alice 2, users begin by adding 3D objects to their virtual world. Alice 2 comes with a gallery of more than 700 3D objects ranging from an amusement park to characters and scenery from ancient Egypt. The object tree contains the list of 3D objects in the virtual world (see Figure 1). To create a simple program, users can click on an object to animate, drag one of its methods into the method editor (e.g. iceSkater move), and select parameter values (e.g. forward and 1 meter).

Alice 2 makes programming easier for novices in two ways. First, Alice 2's method of constructing programs through drag and drop prevents users from making syntax errors. Second, programs in Alice 2 are animations of visible 3D objects, so users can watch their programs execute and see mistakes when they occur. Alice 2 allows students to gain experience with the programming concepts and constructs typically taught in a first computer science course. These include looping, conditional statements, methods, parameters, variables, arrays, and recursion. Typical Alice 2 projects include short animations and simple games.



Figure 1: A screenshot of the Alice interface. 1) The world window provides a view of the virtual world that a student's program will control. 2) The object tree contains a list of the 3D objects in the virtual world. 3) The details area shows the properties, methods, and functions for the object selected in the object tree. 4) The methods editor shows the code that defines the method a student is working on. To call the IceSkater's move method in "my first animation", the user drags the tile "IceSkater move" from the details area into the method editor, drops it, and selects parameters from the pop-up menus. 5) The events area allows students to call methods based on events in the world, such as mouse clicks or changes in the value of a variable.

ALICE 2 FOR INTRODUCTORY PROGRAMMING

The mechanical and motivational supports Alice 2 provides can help broaden the pool of CS majors. National Science Foundation-sponsored studies have shown that Alice 2 increases the academic success and retention of at-risk college students (freshmen intending to major in CS who enter college with no programming experience and/or who are not prepared to enroll in Calculus as freshmen). At-risk students who enrolled directly into a Java-based CS1 class earned an average grade of C and only 47% of them continued on to the second course. After a half-semester Alice course, at-risk students performed as well as well-prepared students in the Java-based CS1 course: they earned an average grade of B and 88% of them continued on to the second course [6].

Table 1: Academic performance and retention of students in Java course without and with exposure to Alice

                                 CS1 Grade   Take CS2?
No Alice class prior to CS1      C           47%
Alice class prior to CS1         B           88%

ALICE FOR MIDDLE SCHOOL GIRLS

While it is important to retain the students who choose to major in computer science, we may be able to attract a larger and more diverse group of students if we reach them at a younger age. Research has found that many girls decide against pursuing math and science based disciplines, including computer science, during middle school [1].

As of fall 2006, Alice 2 is being used in CS1 courses at more than 200 colleges and universities.

Rather than presenting programming as an end in and of itself, we have created a version of Alice that presents programming as a means to the end of creating 3D animated stories: Storytelling Alice.


We chose to focus on the activity of storytelling for the following reasons:

Table 2: A comparison of the animations in Storytelling Alice and a version of Alice without storytelling support (Generic Alice).

1. Given a little bit of time, most girls can come up with a story they would like to tell.
2. Stories are naturally sequential and are unlikely to require advanced programming concepts immediately, making them a good match for beginning students. However, most stories provide the motivation for more advanced programming concepts such as methods, parameters, and loops.
3. Stories are a form of self-expression and provide girls an opportunity to experiment with different roles, a central activity during adolescence.
4. Non-programming friends can readily understand and appreciate an animated story, which provides an opportunity for girls to get positive feedback from their friends.

More than 250 girls between the ages of 10 and 16 participated in formative user testing that guided the design and development of Storytelling Alice. To enable middle school girls to create animated stories, it was necessary to make three broad changes to Alice 2:

Storytelling Alice: Say, think; Play sound; Walk to, walk offscreen, walk; Move; Sit on, lie on; Kneel; Fall down; Stand up; Straighten; Look at, look; Turn to face, turn away from; Turn; Touch, keep touching.

Generic Alice: Move; Turn; Roll; Resize; Play sound; Move to; Move toward; Move away from; Orient to; Point at; Set point of view to; Set pose; Move at speed, turn at speed, roll at speed.

2. Make the gallery a rich source of story ideas.

One of the factors that impacts girls’ motivation to learn to program in Storytelling Alice is whether or not they can find a story that they want to tell. The gallery of 3D characters and scenery can be a source of inspiration for girls’ stories. In particular, highly caricatured characters with clear roles and giving characters animations that require some explanation within the story (e.g. what made a robot character go crazy) can help spark story ideas. 3. Make the system communicate that it can be used for storytelling.

Initially, the Alice tutorial was designed around examples chosen to demonstrate concepts as simply as possible. However, through user testing we found that it was necessary for the Alice tutorial to introduce the skills and concepts users needed within the context of creating stories similar to the ones they imagined. To enable users to successfully complete story-based tutorials which tend to be more complex, we created an interaction technique called “Stencils” where the online tutorial is spatially overlaid on top of the running Alice application (see Figure 3). Stencils [4] moderates the additional complexity of stories by presenting instructions one at a time in the context of the application and preventing users from accessing user interface components that are not necessary for the current step.

Figure 2: A scene created using Storytelling Alice.

1. Provide high-level animation primitives based on storytelling needs. The animations in Alice 2.0 allow users to perform simple graphical transformations like moving and rotating a character or one of its parts. Using simple transformations, it is often a tedious and frustrating process to create the kinds of animations that girls needed to create their stories such as walking or having two characters hug one another. By analyzing the storyboards girls created for their movies, we identified a high-level set of animations that enable girls to make more rapid progress on their stories without removing the motivation to learn programming constructs.

Evaluation of Storytelling Alice

In a study comparing the behavior of girls introduced to programming using Storytelling Alice with the behavior of girls introduced to programming using a version of Alice without storytelling support (or Generic Alice), we found that storytelling is a promising approach for getting middle school girls interested in learning to program. Girls who used Storytelling Alice spent 42% more of their time programming and 45% less time on non-programming activities like selecting 3D objects and positioning those objects within the virtual world (p < .001). Further, users of Storytelling Alice were almost three times as likely to sneak extra time to continue working on their programs during breaks. Where only 16% of Generic Alice users snuck extra time to work on their programs, 51% of Storytelling Alice users snuck extra time (p < .001).


Figure 4: Users of Storytelling Alice spent 42% more time programming and 45% less time on scene layout than users of Generic Alice.

In addition to motivating girls to learn the basics of computer programming, Storytelling Alice provides motivation for girls to sharpen their communication skills. As girls develop their stories, they are often eager to share their work with friends. When friends watch the movies, the author observes their reactions carefully to see if they have understood the actions in the story and whether or not they laugh at the jokes. We observed that when the author's friends do not understand the sequence of events or do not laugh at the jokes, it provides a strong motivation for the author to revise the story, a process that is often difficult for teachers to motivate.

ALICE 3

Both Alice 2 and Storytelling Alice help to make learning to program more approachable and more appealing for college and middle school students. In the next version of Alice, we will continue in the theme of applying gaming technology to improve the experience of learning to program for students ranging from middle school to college and continue to focus on the activity of storytelling.

Why not gaming?

In response to decreasing student interest, many computer science departments are considering the use of computer gaming as a context for teaching computer science that may be more attractive for students. Over the past 5 years, the number of video game related majors has risen from less than a dozen to more than one hundred in the United States alone, with more in Europe and Asia [8]. While the focus on computer gaming may help to increase the numbers of students who pursue computer science, it may further decrease the percentage of female students in computer science. Video game based curricula will almost inevitably draw inspiration from commercially successful games, which do not appeal equally to both genders.

Figure 3: Views of the Alice interface without (above) and with (below) Stencils.

Qualitatively, storytelling seems to be a good approach. Most girls in the storytelling condition readily envision a story they would like to tell. Using the high-level methods in Storytelling Alice, girls are quickly able to make progress on their stories, which helps to build their confidence and enthusiasm. Girls' stories have addressed a wide variety of topics including how to deal with being unpopular, what to do when your parents force you to move to a new place, relationships with boys, and how to find your dog when it has been kidnapped.


Further, computer gaming curricula are likely to be most appealing to students with a strong interest in video games (e.g. hardcore gamers). While one survey by the Entertainment Software Association states that the gaming population is 43% female, this number is potentially misleading because it includes casual play (e.g. games like solitaire and Tetris) [3]. Netshelter, a company that provides marketing information, estimates that the hardcore gaming audience (those who devote significant capital to gaming-related purchases) is 97.5% male [7].


Leveraging The Sims 2

Based in part on our success in getting middle school girls interested in learning to program through storytelling, Electronic Arts has given us permission to use the 3D characters and animations from The Sims 2 in the next version of Alice. The Sims is the best-selling PC game in history and is one of the few video games to appeal more strongly to female than to male players. The addition of The Sims 2 characters enables us to examine the impact of a larger and richer set of high-level animations on students' motivation and their development of programming skills. Characters in The Sims 2 can perform hundreds of different animations, ranging from general-purpose actions like walking to very specific actions like washing a plate or jumping on a bed.

REFERENCES

1. AAUW. Girls in the Middle: Working to Succeed in School. American Association of University Women Educational Foundation, Washington, DC, 1996.
2. Alice. http://www.alice.org
3. Entertainment Software Association. Essential Facts about the Computer and Video Game Industry: 2005 Sales, Demographics, and Usage Data. Retrieved May 1, 2006 from http://www.theesa.com/files/2005EssentialFacts.pdf
4. Kelleher, C. and Pausch, R. Stencils-based tutorials: design and evaluation. In Proc. CHI 2005, ACM Press (2005), 541-550.
5. Hecker, D. Occupational employment projections to 2014. Monthly Labor Review, November 2005.
6. Moskal, B., Lurie, D., and Cooper, S. Evaluating the effectiveness of a new instructional approach. In Proc. SIGCSE 2004, ACM Press (2004), 75-79.
7. Netshelter. Netshelter reaches: Enterprise Decision Makers and IS/IT Professionals, Technology Enthusiasts, Consumers, and Hardcore Gamers. Retrieved May 1, 2006 from http://www.netshelter.net/media_kit/audience/.
8. Schiesel, S. Video Games Are Their Major, So Don't Call Them Slackers. NY Times, 22 Nov. 2005.
9. Vegso, J. CS Bachelor's Degree Production Grows in 2004; Poised for Decline. Computing Research News 17, 2 (2005).

Figure 4: A screenshot from The Sims 2

Easing the Transition from Alice to Java

Particularly at the college level, students often need to become proficient in writing programs in a general purpose language like Java, C++, or C# by the end of their introductory computer science class. While Alice 2 helps students to gain an understanding of programming concepts, it does little to help students learn the mechanics of writing syntactically correct programs. To help ease the transition from Alice to a general purpose language, Alice 3 will include both drag-and-drop and Java-based textual editors. Students will be able to begin by generating programs through drag and drop and gradually move toward editing their program in the Java-based textual representation. In this way, students will retain the benefits of animated programs, a consistent API, and the motivation associated with creating animated stories as they learn how to write syntactically correct Java programs.


A 2D Car Physics Model based on Ackermann Steering Helmut Hlavacs Institute of Distributed and Multimedia Systems University of Vienna Lenaug. 2/8, 1080 Vienna, Austria

[email protected]

ABSTRACT Car physics models are often complicated and require a large effort for understanding them. Additionally, some seem to be incomplete and many open questions remain. In this paper a novel 2D car physics model is presented. The model is based on Ackermann steering and presents closed formulas for forces causing rotation and acceleration of the car and tyres. As a simplification, the model assumes only two tyres instead of four. The paper describes how the equations are derived and how the model can be solved in a game loop.

Categories and Subject Descriptors I.6 [Simulation and Modeling]: Miscellaneous

General Terms Theory

Keywords Car physics, Ackermann steering

1. INTRODUCTION Car racing games rely on realistic car physics models for simulating the movement and traction of cars and tyres. Such a model may represent a car on a flat surface, as done in this work, or it may include the possibility of uneven ground and car jumps. Professional game studios put a large effort into developing proprietary car physics models for their products, which are of course not accessible for the public. People wishing to develop their own games thus have to rely on publicly available sources, which often are either over-sophisticated or leave open questions.

Alternatively, it is possible to use existing physics engines, for example general physics engines that allow simulating rigid bodies and spring-mass systems, like ODE (http://www.ode.org/), Tokamak (http://www.tokamakphysics.com/), Newton Game Dynamics (http://www.physicsengine.com/), Havok (http://www.havok.com/), OPAL (http://ox.slug.louisville.edu/~o0lozi01/opal wiki), or Bullet (http://www.continuousphysics.com/Bullet/). On the other hand, an engine specialized for the simulation of cars like Racer [12] can be used. In such a case, one is limited to the properties of the respective engine. Furthermore, using the engine might either be very costly, or the result might be restricted to non-commercial use due to legal considerations. Moreover, the engine of course acts like a black box without insight into the model solution.

2. RELATED WORK Basically, there are two types of car physics models found in the literature. The first one is a very sophisticated type of model which takes into account as many parts of a car as possible, like springs, suspensions, 3D landscape, tyre models and elaborate tyre slip models, etc. Implementing such a model requires a large amount of effort; its use is for instance in the car industry or for professional games [6, 11, 9, 21, 2].

The second type of papers describes very basic 2D models, here mainly taking into consideration engine and centripetal forces [7, 8, 10, 3, 4]. These models are easily understood, yet one has the impression that important parts of the models are missing. Examples for such gaps include braking forces due to steering, adding rotational forces to the tyre traction budget, or the fact that the engine/brake force not only accelerates the car mass (which includes the tyre masses), but parts of it also must cause the tyres to rotate. These issues will be treated later in the paper. Also, their integration and solution is often not obvious.

The aim of this paper is to focus on the second type of publication, i.e., an easily understood 2D model, which however tries to catch all forces of a car driving through a 2D plane. The presented model is based on so-called Ackermann steering, and consists of closed formulas, thus allowing to gain insight into the model behavior and solution. Furthermore, the integration and solution of the presented formulas in a game loop is presented.

3. RIGID BODY DYNAMICS In this section those principles of rigid body dynamics which are necessary for understanding the proposed model will be

described briefly. For more detailed introductions in this field refer to [7, 8, 4, 5] or various articles at Wikipedia [17]. In the following, the discussion is restricted to the dynamics of solid cuboids, which are used in this paper to represent car masses. As a further simplification, we assume that the cuboid lies flat on a 2D plane, and its movements are restricted to movements inside this 2D plane, just like a car may roll on the flat surface of a street. Note that in the following, scalars and scalar operations instead of vectors and vector operations will be used wherever possible, since the forces observed are often at right angles to the main model axes. If used, vectors are explicitly denoted by using bold fonts. Also note that throughout the model, forces and accelerations are kept constant over a short amount of time dt > 0. In fact, dt is assumed to be so small that sin dt ≈ dt. This can always be achieved by letting dt → 0, since it can easily be shown that [18]

lim (x → 0) (sin x) / x = 1,

for instance by using the Taylor series of sin x [20].

3.1 Translation Assume a cuboid of length L, width W and height H meters, and having a mass of M kilograms. The height however will not be used in the remainder of this paper; its significance is only given in some cases when the mass of the cuboid has to be calculated. Assume also that we observe this cuboid from above, thus not seeing its height dimension. Additionally, the car at time t is at position (x(t), y(t)) in the plane and is moving upwards in positive y direction with constant velocity v. This kind of movement is called translation. If no force is applied to such an object, then the velocity will not change, and after a time dt has passed, the car's change of position dy in the y direction will be

dy = y(t + dt) − y(t) = v dt.   (1)

No matter what velocity v(t) the cuboid is travelling at a time t, when we apply a constant force F to the center of its mass C (see Fig. 1), v(t) is no more constant, and after time dt the total change of velocity is given by

dv = v(t + dt) − v(t) = (F / M) dt = a dt.   (2)

Figure 1: A cuboid is accelerated by a force F.

The factor a = F/M = dv/dt is called the acceleration and represents the change of velocity per time unit, i.e., the first derivative of the velocity. Note that generally v and a are either both two-dimensional vectors, or scalars, if the direction is fixed, and only the magnitude is of importance. Also note that in the above scenario, the change in y-position dy of the car after dt is

dy = v̄(t, dt) dt = v(t) dt + (a / 2) dt²,   (3)

due to the fact that the mean velocity v̄(t, dt) of the cuboid in the time interval [t, t + dt] is given by

v̄(t, dt) = (v(t) + v(t + dt)) / 2 = (v(t) + v(t) + a dt) / 2,   (4)

i.e., the mean of v(t) and v(t + dt).

3.2 Rotation The second important movement a cuboid may carry out in the plane is rotating about an axis. In this work we always assume that the rotation axis is perpendicular to the plane. Fig. 2 (a) shows a cuboid rotating about an axis going through its center C. Like in the case for translation, we can define a current state of the object at time t, which is given by the orientation angle θ(t). Additionally, the change of this angle, i.e., the angular velocity, is given by ω = dθ/dt, and angular acceleration is defined to be α = dω/dt. Due to the rotational movement, a sample point A being at a distance of R away from the axis moves along a circle with radius R. Its velocity vR on this circle is given by

vR = ωR   (5)

and likewise its acceleration by

aR = αR.   (6)

Figure 2: A cuboid rotating about an axis through its center (a). A force F is applied to a point A (b).

Acceleration is related to a force F being applied to a point of the cuboid. Fig. 2 (b) shows the case where a force F is applied to a point A. Here, the force vector F = (Frot, Ftran)^T must be split into two components. The first component Frot is perpendicular to the vector CA and results in a rotational force changing ω. The remaining component Ftran is simply a translational force and must be treated as described in (1) to (4). Frot then results in a torque [15]

T = Frot R.   (7)
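To make the preceding definitions concrete, the following is a minimal C++ sketch, not taken from the paper, of how a cuboid's planar translation and rotation state might be integrated over one step dt using equations (1) to (10); the struct and function names are illustrative only.

    // Illustrative planar rigid-body state for a cuboid.
    struct Body2D {
        double M;      // mass (kg)
        double I;      // moment of inertia about the rotation axis
        double x, y;   // position
        double vx, vy; // translational velocity
        double theta;  // orientation angle
        double omega;  // angular velocity
    };

    // Moment of inertia of a cuboid about a vertical axis through its center,
    // eq. (9), and the parallel axes theorem, eq. (10).
    double CuboidInertia(double M, double W, double L) { return M * (W * W + L * L) / 12.0; }
    double ParallelAxis(double Ic, double M, double R)  { return Ic + M * R * R; }

    // Advance the body by dt under a constant translational force (Fx, Fy)
    // applied at the center of mass and a torque T = Frot * R, cf. (1)-(8).
    void Step(Body2D& b, double Fx, double Fy, double T, double dt) {
        double ax = Fx / b.M, ay = Fy / b.M;     // a = F / M, eq. (2)
        b.x += b.vx * dt + 0.5 * ax * dt * dt;   // position update, eq. (3)
        b.y += b.vy * dt + 0.5 * ay * dt * dt;
        b.vx += ax * dt;                         // dv = a dt, eq. (2)
        b.vy += ay * dt;
        double alpha = T / b.I;                  // angular acceleration, eq. (8)
        b.theta += b.omega * dt + 0.5 * alpha * dt * dt;
        b.omega += alpha * dt;
    }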

Similar to (2), this torque is then translated into angular acceleration by

α = T / I,   (8)

where I is the rotational equivalent to mass called moment of inertia. For general shapes and axes, I must be described by means of so-called tensors. However, for a cuboid and an axis going through the center of mass C and being perpendicular to the upper cuboid face, IC is defined to be [14]

IC = M (W² + L²) / 12.   (9)

An important result is given by the parallel axes theorem due to J. Steiner, which describes the moment of inertia I if the axis is parallel to the one through C, and having a distance of R from it [19]:

I = IC + M R².   (10)

4. A 2D CAR PHYSICS MODEL In the proposed model the car is modelled by a cuboid as described in the previous section, with length L, width W and mass M. As convention we always assume that the car is heading up, i.e., its orientation is along the positive y-axis. Furthermore, as a simplification, the car uses only two tyres instead of four, the tyres being at the middle of the front and rear sides, depicted in Fig. 3 (a) as points A and B. At these points, the major forces are exchanged between the car body, the tyres, the engine and the ground. The figure also shows the traction force

Ftrac,f = (Ftrac,f,x, Ftrac,f,y)^T = [[cos β, −sin β], [sin β, cos β]] (Ftrac,f,la, Ftrac,f,lo)^T   (11)

at the front tyre, and

Ftrac,r = (Ftrac,r,x, Ftrac,r,y)^T   (12)

at the rear tyre. Both forces can be split into two components. The first is a longitudinal component (Ftrac,f,lo and Ftrac,r,y), and points into the direction of the respective tyre. This component is mainly responsible for accelerating the car. The second component (Ftrac,f,la and Ftrac,r,x) is the lateral force, which is responsible for keeping the car on the current track.

Figure 3: The base car model with traction forces (a) and the car movement model (b).

Steering is modelled by the steering angle β, which defines the current angle between the front tyre and the main car direction (see Fig. 3 (b)). The current velocity of the car is represented by v. Furthermore, due to steering, the car may rotate by an angular velocity ω, the rotational axis here going through point B. However, as later shown in (23), ω can be computed from β, the current longitudinal velocity v, and L.

The major source for driving a car is of course its engine. However, once moving, different forces appear and either move the car to the side or reduce its speed. In the following, three major types of force are investigated and described by equations. The integration of these equations into one closed model is then presented in Section 6.

4.1 Limitations of the Model The described model is kept simple on purpose and does not implement a number of features. The omitted features, however, are mostly orthogonal to the model and can be added to the equations if so desired.

First, the model does not implement weight transfer, which happens when the car is accelerated [10, 2]. Second, the model assumes an engine or brake torque to be fed from the engine into the tyres. Elaborated models for computing the engine torque as a function of gear and rpm exist, and can easily be used in addition to this model [21, 10]. Third, the model assumes only two tyres instead of four, thus blurring the possibility of different forces on the two front or rear tyres. Fourth, there is no slip between tyres and the ground, i.e., the slip ratio is set to 1 (see Section 4.6). As a consequence, there are also no applications of Pacejka's Magic Formula, which models properties of the tyre and the resulting slip [11]. Finally, the presented model does not contain aerodynamic drag and rolling resistance, which impose a braking force onto a moving car [10].

Figure 4: Ackermann steering and centripetal force.

4.2 Ackermann Steering For cars there exists a movement model called Ackermann steering [6] (Fig. 4). The car center C moves along a circle,

23

vector vB = (0, v)T , whereas point A moves with velocity vA = (ωL, v)T , and point C with velocity vC = (ωL/2, v)T , which is consistent with (13). If the car moves along a circle because of steering, this means that the points move at different circles being concentric, but having different radii. A travels along the largest circle, while C travels along a smaller circle, and B again on a smaller circle, exactly as defined in the Ackermann steering model. Because of (14), (15), and (16) the relations between the different velocities are given by r |c| tan2 β |vC | |vC | = = = 1+ ≥ 1, (17) |b| |vB | v 4

Figure 5: Movement model. the rear tyre moves along a circle being concentric to the first one, but with smaller radius, and the front tyre moves along another concentric circle with larger radius. The center D of is the − intersection of the vectors a = −→ −−→the three circles AD = (ax , ay )T and b = BD = (bx , 0)T , which originate at the tyre centers and which are perpendicular to the tyre orientations. If β = 0 then the two vectors are parallel and meet at infinity. Note that for β = 0 the center of mass C circumvents a larger circle than the one B circumvents in the same time, its velocity vC thus must be larger than the velocity |vB | of B, the same is true for points A and C. For β = 0 it follows that |vB | < |vC | < |vA |.

and

|b| / |a| = |vB| / |vA| = v / |vA| = cos β ≤ 1.   (18)
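The velocity relations above translate directly into code. The following C++ snippet, not from the paper and with illustrative names, computes the speeds of points A, B and C from the car speed v and the steering angle β, cf. (13), (17) and (18).

    #include <cmath>

    void PointSpeeds(double v, double beta, double& vA, double& vB, double& vC) {
        double t = std::tan(beta);
        vB = v;                                 // rear tyre
        vC = v * std::sqrt(1.0 + t * t / 4.0);  // center of mass, eq. (17)
        vA = v / std::cos(beta);                // front tyre, eq. (18)
    }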

4.3 Centripetal Force If the steering angle β is different to zero, then a lateral force is put onto the front tyre, pushing the front of the car to the respective side. For a fixed β, the car then travels along a circle. If an object with mass M moves with velocity v along a circle with radius R, then there must be a centripetal force Fcp pushing the object center C to the circle center. The length of Fcp is known to be [13] (see Fig. 4)

(13)

In the following, these properties are the only premises for the car physics model described in this paper.

|Fcp| = M |vC|² / R.   (19)

Simple calculation shows that the angle in the left corner of the triangle BDA is also the steering angle β. For given L, it follows that for β ≠ 0 and |β| < π/2

bx = −L cot β   (14)

and

a = (bx, −L)^T = (−L cot β, −L)^T.   (15)

Similarly, the vector c = CD connecting the center of mass C with the circle center D is given by

c = (bx, −L/2)^T = (−L cot β, −L/2)^T.   (16)

From (16), (17) and (19), and noting that in this case R = |c|, we get

Fcp = (M (1 + tan²β/4) v² / |c|) (c / |c|) = (−M (1 + tan²β/4) v² / (L (cot²β + 1/4))) (cot β, 1/2)^T = (−M v² tan²β / L) (cot β, 1/2)^T.   (20)

This is the centripetal force pulling the car towards the center of the circle it runs on. Also, it must be produced by the lateral (cornering) forces of the front and rear tyres, which are parts of the overall traction forces. Thus the next step is to split Fcp = Fcp,f + Fcp,r into two cornering forces originating at the front (Fcp,f) and rear tyres (Fcp,r), and pointing along the vectors a and b towards D. Since the forces responsible for the car rotation are modelled in Sections 4.4 and 4.5, the forces Fcp,f and Fcp,r treated here do not change the car's rotation. From the description of rotational forces in Section 3 it follows that the force components which are perpendicular to the vectors CA and CB must be equal. When the car points up, these components are the x-coordinates of Fcp,f and Fcp,r, which therefore must be equal:

Fcp,f = (−M v² tan²β / L) (cot β / 2, 1/2)^T,   (21)

In the presented model, the car is moved in the following way (see Fig. 5). In each time period dt, the car first moves a length of ds = v dt into the direction of the car, then the car is rotated by an angle of dθ = ω dt, using B as the rotational axis. At this point it must be noted that it is thinkable that the rotational axis might go through the center of mass C, an assumption found in many other 2D models. However, the following argument does not support this assumption. Suppose the rotational axis goes through C and not through B. The velocity of point C is then given by vC; point B would also have this velocity component, but additionally, due to the rotation around C, a second velocity component perpendicular to vC. For β ≠ 0 it follows that |vB| > |vC|. This, however, is in contradiction to (13).

and

Due to the rotation, A additionally moves along a circle with speed ωL. As a consequence, point B moves with velocity

Fcp,r = (−M v² tan²β / L) (cot β / 2, 0)^T.   (22)

Figure 6: Forces causing the car to rotate.

Figure 7: A force causing a cuboid to rotate (a) and application of two forces additionally causing an acceleration a (b).

It is worth noting that the centripetal forces Fcp,r and Fcp,f depend on the current steering angle β and the current velocity v, but not on the steering change dβ/dt. Also note that both forces approximate the zero vector for β → 0.

to the rotational axis, the only point where such a force might be created is the front tyre. From there, it must point straight down to the rear tyre, i.e., its x-coordinate must be zero. Similar to (19), and by using (23), the y-coordinate of this force then must be

4.4 Car Rotation A car travelling along a circle will also rotate as described in Section 3.2. Assume that the car is travelling with velocity v around a circle with radius R and circumference l = 2Rπ. Since for a complete run around the circle it needs the time dt = l/v and rotates around an angle dθ = 2π, the rotation is carried out with angular speed ω = dθ/dt. Since v is the velocity of point B we set R = −bx and derive because of (14) ω=

2π v v v dθ = = = tan β. = dt l/v −bx L cot β L

Fcp,B,y = −

which is the same as the y-coordinate Fcp,f,y of the centripetal force at the front tyre Fcp,f as given by (21). In fact, this argument proves that the same force Fcp,f that is responsible for keeping the car on its track at the front tyre also is responsible for letting the car rotate around its rear tyre, and not its center of mass.

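As an illustration of the steering-dependent quantities derived so far, the following C++ helpers, not from the paper and with illustrative names, compute the yaw rate (23) and the centripetal force (20); both vanish smoothly as β approaches zero, as noted above.

    #include <cmath>

    double YawRate(double v, double beta, double L) {
        return v * std::tan(beta) / L;              // omega, eq. (23)
    }

    // Centripetal force in car coordinates (car heading +y), eq. (20).
    void CentripetalForce(double M, double v, double beta, double L,
                          double& Fx, double& Fy) {
        double t = std::tan(beta);
        Fx = -M * v * v * t / L;                    // -M v^2 tan(beta) / L
        Fy = -M * v * v * t * t / (2.0 * L);        // -M v^2 tan^2(beta) / (2L)
    }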

Consider the car in Fig. 6. The force responsible for the rotation is Frot,f = (Frot,f,x , Frot,f,y )T caused by the front tyre. To be exact, only the x-coordinate Frot,f,x causes the rotation, the other component Frot,f,y acts as a breaking force.

4.5 A Mysterious Force At this point it must be noted that there is also a lateral force at the rear tyre responsible for the rotation, though this force may not be visible at first sight. Think of a force causing a cuboid to spin. If only one force is applied, then the cuboid must rotate around its center of mass C. However, in the presented model, the rotational axis is the rear tyre, not the center of mass. It follows that there must be a second force, this time a lateral force Frot,r = (Frot,r,x , 0)T at the rear tyre, which shifts the rotational axis to the rear.

Since we assume that the car’s rotation axis goes through its rear tyre, from (10) we know that the car’s moment of inertia is IB = IC + M L2 /4. Thus, if the velocity v or the steering angle β change within a time dt by the amount of dv or dβ, then from (23) we get [16] α=

α = dω/dt = (tan β / L) (dv/dt) + (v / (L cos²β)) (dβ/dt).   (24)

Consider the scenario depicted in Fig. 7 (a). A force Frot,f,x induces an angular acceleration


Following (7) and (8), the force component Frot,f,x of the front tyre must be « „ IB v dβ dv α IB = − 2 tan β + , (25) Frot,f,x = − L L dt cos2 β dt

α=

Frot,f,x L/2 IC

which in turn will cause the cuboid to rotate about the axis going through its center of mass C. The angular acceleration also causes an acceleration l = αL/2 of point B.

and furthermore Frot,f,y

M (ωL/2)2 M v 2 tan2 β =− , L/2 2L

Now consider Fig. 7 (b). Here, B is fixed to an axis and Frot,f,x causes some Frot,r,x to press against this axis, causing an equal force into the opposite direction. The result is that the force causing rotation now is only Frot,f,x −Frot,r,x:

= Frot,f,x tan β « „ IB tan β v dβ dv = − + . (26) tan β L2 dt cos2 β dt

α=

Another aspect of rotation is the fact that the car rotates around an axis going through point B. This rotation also demands a centripetal force Fcp,B , pushing the car against B, since without this force the car would rotate around its center of mass C. As the centripetal force always points

(Frot,f,x − Frot,r,x )L/2 IC

and the acceleration of B is given by l =

25

(Frot,f,x − Frot,r,x ) L2 /4 . IC

force at the front tyre actually is the sum Ftrac,f

= Fcp,f + Frot,f + (30) „ «„ « cos β − sin β 0 + . sin β cos β Facc,f

Here the force Facc,f is used similarly to Ftrac,r,y in (29): Ttot,f

=

Te,f − Facc,f Rw .

(31)

By using (21), (26) and (30), the y-coordinate of Ftrac,f is given by Figure 8: Acceleration forces at the tyres and the car body.

Ftrac,f,y

Additionally, the center of mass C is accelerated by a=

Fcp,f,y =

Now in order that B remains on its position, it follows that

IB − IC Frot,f,x IB

fr := IB tan β / L².   (33)

For the car body, the longitudinal traction forces sum up and result in the total longitudinal acceleration force Ftot on the car body:

(28)

Ftot = Ftrac,r,y + Ftrac,f,y .

which denotes the lateral force Frot,r,x at the rear tyre in case there is a lateral force Frot,f,x causing rotation at the front tyre.

(34)

Note that in (34), the term Ftrac,f,y can immediately be replaced by the right hand side of (32). Equ. (34) is also a good place for adding additional terms for various drag forces, which is not done here. What remains is the coupling between the rotational acceleration of the tyres and the car acceleration, which due to (2), (6), (7), (8), and (18) result in dv Ftot Rw Ttot,r = = , (35) dt M Iw

4.6 Acceleration The last movement component of the presented model is given by the longitudinal acceleration dv/dt of the car body. Recall the basic car force model as depicted in Fig. 3 (a). Also recall the definitions of the traction forces Ftrac,f by (11) and Ftrac,r by (12).

and Ftot |vB | Rw Ttot,f dv Rw Ttot,f = = = cos β . dt M |vA | Iw Iw

Longitudinal acceleration mainly is created by the car engine and the tyre breaks. An engine puts a torque Te,r to the rear axle, which itself forwards this torque to the rear tyre (in this model there is only one central tyre). This torque then results in an angular acceleration of the rear tyre, and a traction force Ftrac,r,y between the rear tyre and the ground. This traction force then accelerates the car body (and the car body then accelerates the front tyre). Alternatively, the tyre breaks would cause negative torques Te,r at the rear and Te,f at the front tyre. In the presented model, there is no slip ratio between the tyres and the ground. If a slip ratio is desired, then the model must be augmented with a corresponding slip factor.

(36)

The meaning of the system of linear equations (29) to (36) is this: if a torque is put onto the rear (front) tyre, this torque is split into the parts accelerating the rear (front) tyre, the front (rear) tyre and the car body. The velocity of the car mass is equal to the rotational velocity of each tyre. However, the velocity of the front tyre must be multiplied by a term which corrects the different movement radii. Additionally, if a slip ratio different to 1 is desired, it can be added to (35) and (36). The above system of linear equations (29) to (36) can be solved analytically. The solution for Ftot is given by “ ” fr v dβ 2 M Rw Fcp,f,y − cos 2 β dt + (37) Ftot = 2 2 Iw + (M + fr tan β)Rw

In order to establish the corresponding equations, we define the radius of a tyre by Rw , and the tyre inertia by Iw . Note that the engine torque is split into two parts, a Ttrac,r,y creating the traction force Ftrac,r,y , and the second part Ttot,r actually accelerating the rear tyre (Fig. 8). By noting (7) we get: Ttot,r = Te,r − Ftrac,r,y Rw .

−M v 2 tan2 β 2L

and

(27)

Simple calculation shows that (27) leads to Frot,r,x =

(32)

here setting

Frot,r,x . M

l = a.

= Fcp,f,y − « „ v dβ dv + + −fr tan β dt cos2 β dt +Facc,f cos β,

+

(29)

M Rw (cos β Te,f + Te,r ) 2 2 Iw + (M + fr tan β)Rw

The first summand of (37) denotes the rotational and centripetal forces, while the second part denotes the engine and

At the front tyre things get more complicated. The traction

26

Parameter W L M Rw Iw

Value 2 4 1500 0.33 2 × 4.1

Unit m m kg m kg m2

Explanation Car width Car length Car mass Radius of tyre Inertia of tyre

InitializeModel(); double lasttime = now(); while( true ) { Input input; GetUserInput( input ); double time = now(); ComputeModel( input, time - lasttime ); RenderTheScene(); lasttime = time; }

Table 1: General model parameters.

break torques. Furthermore, from (37), the other unknowns Ttot,r , Ttot,f , Ftrac,r,y , and Facc,f can easily be derived using equations (29) to (36). Although (37) looks quite complicated, for β ≈ 0 it changes into a much simpler form, because in this case cos β ≈ 1 and tan β ≈ fr ≈ Fcp,f,y ≈ 0: Ftot

Table 2: The game loop.

and the engine/break torques Te,r and Te,f as defined in Section 4.6.

M Rw (Te,r + Te,f ) . ≈ 2 2Iw + M Rw

The game loop, which is executed continuously throughout the game, is shown in Tab. 2. At first the user input is queried from either keyboard or joystick. Then the function responsible for computing the model is called. Its input parameters are the user input, here a structure holding β, Te,r and Te,f , and the time since its last call.

The M in the numerator is of course the car mass, while the denominator represents the inertia of both tyres plus the car mass. Additionally, the numerator just sums up the engine and break torques of the rear and front tyres.
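As a rough plausibility check of this approximation (the numbers below are chosen here for illustration and do not correspond to one of the experiments in Section 7): taking the parameter values of Tab. 1 with the tyre inertia entry read as Iw = 4.1 kg m² per tyre, and an assumed engine torque Te,r = 1000 N m with Te,f = 0, the approximation gives

    Ftot ≈ 1500 · 0.33 · 1000 / (2 · 4.1 + 1500 · 0.33²) ≈ 495000 / 171.6 ≈ 2.9 kN,

i.e. a longitudinal acceleration of about Ftot/M ≈ 1.9 m/s².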

5. TRACTION

The traction forces depicted in Fig. 3 (a) are exchanged between the tyres and the ground. However, there is a limit Fmax to the amount of force that can be exchanged. If the traction force on one of the tyres exceeds this bound, then the tyre loses grip and starts sliding. It follows that the model must continuously check whether the traction forces exceed the maximum. In such a case, the model must switch to an appropriate sliding model, which is not treated in this work, but for instance in [10]. Following [6], Fmax is determined by the static friction factor µs in the following way. If F denotes the force that presses the tyre down onto the ground, then

    Fmax = µs F.

Here, F is the amount of weight force on one tyre, i.e., in the presented model F = 9.81 M / 2. Values for µs are, for instance, µs = 1.0 for rubber on dry concrete ground, or µs = 0.3 for rubber on wet concrete ground. If the car starts sliding, then the force exchanged between tyre and ground is assumed to be constant: Ftrac = µk F. Here µk denotes the kinetic friction factor, which is, for instance, µk = 0.8 for rubber on dry concrete ground, or µk = 0.25 for rubber on wet concrete ground. Other values for µs and µk can be found, for instance, in [6, 1].
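The grip check described above is cheap to implement. The following C++ fragment is a minimal sketch of it (function and variable names are chosen here for illustration and are not prescribed by the paper); it assumes the traction-force vectors have already been computed for the current time step:

    #include <cmath>

    struct Vec2 { double x, y; };

    static double norm(const Vec2& f) { return std::sqrt(f.x * f.x + f.y * f.y); }

    // Returns true if both tyres keep grip, i.e. no switch to a sliding model is needed.
    // M is the car mass; muS is the static friction factor (e.g. 1.0 dry, 0.3 wet concrete).
    bool tyresKeepGrip(const Vec2& Ftrac_r, const Vec2& Ftrac_f, double M, double muS)
    {
        const double weightPerTyre = 9.81 * M / 2.0;  // F: weight force on one tyre
        const double Fmax = muS * weightPerTyre;      // maximum exchangeable force
        return norm(Ftrac_r) <= Fmax && norm(Ftrac_f) <= Fmax;
    }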

6. MODEL SOLUTION

The previous sections have presented numerous equations; their integration into a closed model is described in this section. For solving the model, we now define a set of parameters and input variables which define the state of the model and the user input. The set of basic model parameters is shown in Tab. 1. The current model state is given by the velocity vB = (0, v)^T of point B. User input is given by the steering angle β and the engine/brake torques Te,r and Te,f as defined in Section 4.6.

Table 1: General model parameters.

    Parameter   Value     Unit     Explanation
    W           2         m        Car width
    L           4         m        Car length
    M           1500      kg       Car mass
    Rw          0.33      m        Radius of tyre
    Iw          2 × 4.1   kg m²    Inertia of tyre
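For an implementation, the parameters of Tab. 1 and the per-frame user input can be grouped into small structures. The following C++ sketch does this (the names are illustrative, not prescribed by the paper; the tyre inertia entry of Tab. 1 is read here as a per-tyre value):

    struct CarParameters {
        double W  = 2.0;     // car width (m)
        double L  = 4.0;     // car length (m)
        double M  = 1500.0;  // car mass (kg)
        double Rw = 0.33;    // tyre radius (m)
        double Iw = 4.1;     // tyre inertia (kg m^2), assumed per tyre
    };

    struct Input {
        double beta = 0.0;   // steering angle β (rad)
        double Te_r = 0.0;   // engine/brake torque at the rear tyre (N m)
        double Te_f = 0.0;   // brake torque at the front tyre (N m)
    };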

The game loop, which is executed continuously throughout the game, is shown in Tab. 2. At first the user input is queried from either keyboard or joystick. Then the function responsible for computing the model is called. Its input parameters are the user input, here a structure holding β, Te,r and Te,f, and the time since its last call.

Table 2: The game loop.

    InitializeModel();
    double lasttime = now();
    while( true ) {
        Input input;
        GetUserInput( input );
        double time = now();
        ComputeModel( input, time - lasttime );
        RenderTheScene();
        lasttime = time;
    }

The code for computing the model itself is shown in Tab. 3. The function is called after some time dt has gone by. The function then computes the model behaviour for the last dt seconds, i.e., for the time interval that has passed. The structure oldinput stores the user input that was recorded at the start of this interval, while the structure input holds the user input at the end of the interval. The engine and braking forces can either be taken from oldinput and kept constant throughout the interval, or the average of the values in oldinput and input can be used. In Tab. 3 the first alternative is chosen.

Table 3: Computing the model.

    Input oldinput;
    double v = 0;

    ComputeModel( Input input, double dt ) {
        if( dt > 0 ) {
            β = oldinput.β
            dβ = input.β − oldinput.β
            Te,f = oldinput.Te,f
            Te,r = oldinput.Te,r
            Compute ω from (23)
            if( β ≠ 0 ) {
                Compute Fcp,f from (21)
                Compute Fcp,r from (22)
            } else {
                Fcp,f = (0, 0)^T and Fcp,r = (0, 0)^T
            }
            Compute fr from (33)
            Compute Ftot from (37)
            a = Ftot / M
            Compute α from (24)
            Compute Ftrac,r,y, Facc,f, Ttot,r, and Ttot,f from (29) to (36)
            Compute Frot,f from (25) and (26)
            Compute Frot,r from (28)
            Ftrac,r = (0, Ftrac,r,y)^T + Fcp,r + Frot,r
            Compute Ftrac,f from (30)
            Fmax = 9.81 M / 2
            if( |Ftrac,r| ≤ Fmax && |Ftrac,f| ≤ Fmax ) {
                Advance the car by ds = v dt + a dt²/2
                Rotate the car by dθ = ω dt + α dt²/2
                v = v + a dt
            } else {
                Switch to sliding model
            }
        }
        oldinput = input;
    }
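A minimal C++ version of the longitudinal part of this update step might look as follows. It is only a sketch under several assumptions: it reuses the CarParameters and Input structures sketched after Tab. 1; it tracks only v, the heading θ and the travelled distance s; it uses just the y-coordinate of the front centripetal force, Fcp,f,y = −M v² tan²β / (2L), from Section 4.4, and omits the full force vectors of (21), (22) and (30), so the grip check of Section 5 is left out; and it takes IC to be the inertia of a homogeneous cuboid, M(L² + W²)/12, which the paper does not prescribe. All names are illustrative.

    #include <cmath>

    // One integration step of the closed model, restricted to the scalar quantities
    // needed for equations (23), (24), (33) and (37).
    void ComputeModel(const CarParameters& p, const Input& oldinput, const Input& input,
                      double dt, double& v, double& theta, double& s)
    {
        if (dt <= 0.0) return;

        const double beta  = oldinput.beta;                      // steering angle at interval start
        const double dbeta = (input.beta - oldinput.beta) / dt;  // dβ/dt over the interval
        const double cosB  = std::cos(beta), tanB = std::tan(beta);

        const double IC = p.M * (p.L * p.L + p.W * p.W) / 12.0;  // assumed cuboid inertia
        const double IB = IC + p.M * p.L * p.L / 4.0;            // (10)
        const double fr = IB * tanB / (p.L * p.L);               // (33)

        const double omega   = v / p.L * tanB;                               // (23)
        const double Fcp_f_y = -p.M * v * v * tanB * tanB / (2.0 * p.L);     // y-coordinate of (21)

        const double Rw2   = p.Rw * p.Rw;
        const double denom = p.Iw + (p.M + fr * tanB) * Rw2;
        const double Ftot  = p.M * Rw2 * (Fcp_f_y - fr * v / (cosB * cosB) * dbeta) / denom
                           + p.M * p.Rw * (cosB * oldinput.Te_f + oldinput.Te_r) / denom;  // (37)

        const double a     = Ftot / p.M;
        const double alpha = tanB / p.L * a + v / (p.L * cosB * cosB) * dbeta;  // (24)

        s     += v * dt + 0.5 * a * dt * dt;          // advance the car
        theta += omega * dt + 0.5 * alpha * dt * dt;  // rotate the car
        v     += a * dt;
    }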

7. EXPERIMENTAL RESULTS

The pseudocode from Tab. 3 has been implemented using Mathematica 5.2 (http://www.wolfram.com/). The implementation has been used for running a number of experiments. The purpose of the experiments was to test whether the presented model can be implemented as described in Tab. 3 and yields numerical results that make sense. The experiments should also show the magnitude of the observed traction forces, including the centripetal forces (which are mainly used in other models) as well as the additional forces described in this paper, and furthermore investigate the usefulness of the given explicit formulas for explaining numerical results.

In the first set of experiments, the car is only accelerated without steering, i.e., β = 0 and Te,r > 0. Furthermore, it is assumed that torque is applied to the front tyre only via the brakes, which are not used in the presented experiments; thus, throughout all experiments, Te,f = 0. Also, the time difference was set to dt = 10 ms. The traction forces and the force on the main car body for acceleration only can be seen in Fig. 9. The figure additionally shows the maximum allowed forces for dry (upper horizontal thin line) and wet (lower horizontal thin line) ground, as defined in Section 5. If the traction force at any tyre exceeds this bound, the car starts sliding. It can be seen that almost all of the resulting body acceleration force Ftot is due to the rear traction force Ftrac,r,y. The front traction force Ftrac,f,y here is only responsible for accelerating the front tyre.

Figure 9: Forces caused by acceleration only (β = 0, dβ/dt = 0, Te,f = 0).

In the next experiments it is assumed that the car is travelling with some velocity v > 0 and a nonzero steering angle β > 0. However, there is no engine or brake torque on the tyres. At first, the steering angle β was set to 1 degree, and the steering angle change dβ/dt was set to dβ/dt = 0, 1, 5, and 10 deg/s. The resulting front traction force |Ftrac,f| is shown in Fig. 10. Here it must be noted that if β ≠ 0 and dβ/dt = Te,r = Te,f = 0, then the car body acceleration force Ftot is quite small, but due to the y-coordinate of (21) not zero. It follows that dv/dt is also quite small, resulting due to (24) in small rotational forces Frot,f and Frot,r. Thus, the observed traction forces are mainly caused by the centripetal forces, i.e., Ftrac,r ≈ Fcp,r and Ftrac,f ≈ Fcp,f. If, however, the steering angle changes, the traction forces quickly rise. This additional growth is caused by the rotational forces Frot,r and Frot,f, which are additionally shown in Fig. 11.

Figure 10: Front traction force |Ftrac,f| caused by steering only (β = 1 deg, Te,r = 0, Te,f = 0). For dβ/dt = 0, |Ftrac,f| is almost equal to |Fcp,f|.

Figure 11: Rotational forces |Frot,f| (thick lines) and |Frot,r| (thin lines) caused by steering only (β = 1 deg, Te,r = 0, Te,f = 0).

The same experiments have been repeated, this time setting β = 3 deg. The results are shown in Figs. 12 and 13. Especially Fig. 12 shows that when the steering angle is that large, the traction forces soon become too large and steering is impossible.

Figure 12: Front traction force |Ftrac,f| caused by steering only (β = 3 deg, Te,r = 0, Te,f = 0). For dβ/dt = 0, |Ftrac,f| is almost equal to |Fcp,f|.

Figure 13: Rotational forces |Frot,f| (thick lines) and |Frot,r| (thin lines) caused by steering only (β = 3 deg, Te,r = 0, Te,f = 0).

The final experiments investigate which values for β and dβ/dt do not cause sliding when driving with a certain velocity v. In Fig. 14 the maximal allowed steering angle β is shown for dry and wet ground. The thin horizontal lines are drawn at β = 1 deg and β = 3 deg. The results may, for instance, be used to implement a virtual steering assistance system which prohibits sliding by automatically reducing β in case the traction forces grow too large.

Figure 14: Maximum allowed β (deg) for front traction force |Ftrac,f| caused by steering only (dβ/dt = 0, Te,r = 0, Te,f = 0).
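Such a steering assistance could be realised as a simple clamp around the model update: before a requested steering angle is accepted, the front traction force it would produce is evaluated, and β is reduced until that force stays below Fmax. The C++ sketch below is generic and makes no claim about the paper's implementation; frontTractionMagnitude is a placeholder for whatever routine evaluates |Ftrac,f| from the current model state, and the back-off scheme is an arbitrary choice.

    #include <cmath>
    #include <functional>

    // Reduce the requested steering angle until the predicted front traction force
    // stays below the grip limit Fmax = µs * 9.81 * M / 2 (see Section 5).
    double limitSteeringAngle(double requestedBeta, double Fmax,
                              const std::function<double(double)>& frontTractionMagnitude)
    {
        double beta = requestedBeta;
        const double step = requestedBeta / 32.0;   // simple linear back-off towards straight ahead
        while (std::fabs(beta) > 1e-6 && frontTractionMagnitude(beta) > Fmax)
            beta -= step;
        return beta;
    }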

For β = 1 deg and β = 3 deg it has then been investigated how fast the driver may change the steering angle without causing the car to slide. The results are shown in Fig. 15, again for dry and wet ground. The lines end at the points where the traction force is already too large because of β ≠ 0 alone (compare to the intersections of the thick lines and the thin horizontal lines in Fig. 14).

Figure 15: Maximum allowed dβ/dt (deg/s) for front traction force |Ftrac,f| caused by steering only (Te,r = 0, Te,f = 0).

8. CONCLUSIONS

In this paper a detailed car physics model for Ackermann steering is presented. The model consists of closed formulas which derive all necessary forces, and thus accelerations, for simulating Ackermann steering. In particular, centripetal forces, rotational forces and acceleration forces are described. An interesting result is given for the rotational forces at the rear tyre. Furthermore, the identity of a component of the centripetal force at the front tyre and the force responsible for rotating the car about its rear tyre is shown. It is then demonstrated how to use the model in a game loop. Finally, an implementation of this loop is used for carrying out a number of numerical experiments.

The model does not include effects such as weight transfer, tyre slip or engine power. These have been investigated thoroughly in previous papers and can be included in the model easily.

Future work will focus on sliding models, which will cover the cases in which only the front tyres, only the rear tyres, or all tyres lose traction.

9. REFERENCES

[1] R. Beardmore. Friction factors. WWW, March 2006. http://www.roymech.co.uk/Useful_Tables/Tribology/co_of_frict.htm.
[2] B. Beckman. The Physics of Racing Series. http://phors.locost7.info/, 2002.
[3] R. Bower. The Physics of Motorsport. WWW, 1999. Department of Physics, University of Durham.
[4] R. Chaney. Simulating Single Rigid Bodies. http://homepage.ntlworld.com/richard.chaney/index.html, 1999.
[5] C. Gerthsen. Physik. Springer, 2006.
[6] S. Gsoellpointner. Ein 3-dimensionales Modell zur Simulation von Kraftfahrzeugen in Echtzeitanwendungen. Master's thesis, Johannes Kepler University, September 2002.
[7] C. Hecker. Physics, The Next Frontier. Game Developer, pages 12–20, October/November 1996. http://www.d6.com/users/checker/dynamics.htm.
[8] C. Hecker. Physics, Part 2: Angular Effects. Game Developer, pages 14–22, January 1997. http://www.d6.com/users/checker/dynamics.htm.
[9] E. M. Lowndes. Development of an Intermediate DOF Vehicle Dynamics Model for Optimal Design Studies. PhD thesis, Department of Mechanical and Aerospace Engineering, North Carolina State University, 1998.
[10] M. Monster. Car Physics for Games. WWW, December 2001. http://home.planet.nl/~monstrous/tutcar.html (web link currently broken).
[11] H. Pacejka and E. Bakker. The magic formula tyre model. Vehicle System Dynamics, 21:1–19, 1991.
[12] R. van Gaal. Car physics – reference. WWW, April 2006. http://www.racer.nl/reference/carphys.htm.
[13] Wikipedia. Centripetal force. WWW, August 2006. http://en.wikipedia.org/wiki/Centripetal_force.
[14] Wikipedia. List of moments of inertia. WWW, July 2006. http://en.wikipedia.org/wiki/List_of_moments_of_inertia.
[15] Wikipedia. Moment of inertia. WWW, August 2006. http://en.wikipedia.org/wiki/Moment_of_inertia.
[16] Wikipedia. Product rule. WWW, August 2006. http://en.wikipedia.org/wiki/Product_rule.
[17] Wikipedia. Rigid body dynamics. WWW, July 2006. http://en.wikipedia.org/wiki/Rigid_body_dynamics.
[18] Wikipedia. Sinc function. WWW, August 2006. http://en.wikipedia.org/wiki/Sinc_function.
[19] Wikipedia. Steiner's theorem. WWW, July 2006. http://en.wikipedia.org/wiki/Parallel_axes_rule.
[20] Wikipedia. Trigonometric function. WWW, September 2006. http://en.wikipedia.org/wiki/Sine.
[21] T. Zuvich. Vehicle Dynamics for Racing Games. WWW, 2000. http://www.gamasutra.com/features/gdcarchive/2000/zuvich.doc.

Design and Implementation of an Auditory Help System for Computer Games

Aidan Kehoe (+353 87 6668609), Flaithrí Neff (+353 87 7615315), Ian Pitt (+353 21 490 2863)
University College Cork, Cork, Ireland
[email protected], [email protected], [email protected]

ABSTRACT

Traditional methods of providing help are generally applicable to gaming, but have limitations. This paper describes the design and implementation of an auditory help system that can complement traditional online help methodologies and be tightly integrated into games. We consider the cognitive constraints of the user in relation to the presentation of bimodal information in a game environment. With the majority of emphasis placed on visual aspects of game-play, we suggest the integration of an auditory help system. However, there are also significant issues which should be taken into account when utilizing such a system, including: technological constraints; perceptual interaction between visual and auditory information; perceptual interaction between speech and non-speech sounds within the help system itself; and interference between game sounds and help system sounds.

Categories and Subject Descriptors

H.5.2 [Information Interfaces and Presentation]: User Interfaces – training, help and documentation.

General Terms

Documentation, Design, Human Factors.

Keywords

Online help, Speech technology, Sonification.

1. INTRODUCTION

Game designers are aware of the risks of games becoming overly complex and difficult to play [35]. Gee [25] identified a number of approaches used in game design that help players to progress through long, complex and difficult games. Also, most game interfaces have direct manipulation style interfaces [51]. This type of interface encourages user exploration and learning.

However, in spite of the best efforts of developers, many players will still require help at various stages of a game. Traditional techniques for providing help to users present a number of well known challenges, and these challenges are also relevant when presenting game-related help material. Speech technology can be used to present online help to players in games. It can complement the traditional help delivery techniques and mitigate some of the problems associated with these techniques.

This paper describes the architecture and design of a help system for games that was developed to support traditional online help functionality and also to provide an optional auditory interface incorporating speech and non-speech sounds. Speech technology is being used in a growing number of fields as a result of the increasing capabilities of computing platforms and on-going improvements in speech technology performance. However, speech technology still has performance limitations, and the speech-enabled help system has been designed within these current technological limitations. The incorporation of the auditory help system into a visual game environment is not just based on technological factors. It is also based on an attempt to find the most efficient interaction between the machine and a user's cognitive processes. The paper presents a model that explains how the speech-enabled help system operates in the context of the game. It describes the types of help information that are suitable for presentation aurally, and guidelines for authoring effective auditory help information are proposed.

The focus of the initial implementation was on the development of a working system capable of supporting context-specific help topics. The speech-enabled help system was integrated in a number of fairly basic open source games that run on the Microsoft Windows platform.

2. ONLINE HELP IN COMPUTER GAMES

Help system technologies have evolved over the past forty years. The latest online help systems make use of enhanced computer capabilities and new technologies. Collapsible tables of contents, full text search, popup text and task-oriented help topics are some of the developments. Wizards and tutorials sometimes incorporate multimedia elements for presentation of information to the user. The same techniques used to provide help in general computing applications are also widely used in games.

2.1 Challenges for Traditional Help in Games

Empirical studies show that well designed help material can be effective [26]. However, traditional techniques for providing help to users present a number of significant challenges that are very apparent in the context of games.

Simple detection of stimulus in a multi-modal environment requires little cognitive processing and is dealt with efficiently by the respective peripheral sensory mechanisms. Therefore the use of aural, visual and haptic stimuli for detection purposes is not an issue of concern and many earcons1 and auditory icons2 plainly do this to draw attention to such things as system warnings or application actions. However, when more detailed, contextual information or source monitoring of simultaneous information streams is required, issues of cognitive overload and divided attention become relevant [46]. It is important to point out that this not only becomes an issue for multi- or bimodal interfaces but also for unimodal interfaces providing several streams of data concurrently to the user. Once past the peripheral sensory mechanisms, a more desegregated central cognitive system takes over.

• Hands/eyes busy: In many gaming scenarios the player’s hands and eyes are busy. To access help information the player may need to pause the game, or perhaps open an additional help window. Pausing the gaming, or the introduction of an overlay help window, can result in a considerable disruption to the player’s gaming experience. • Context switching: Games often use a help window to display information to users. Studies on traditional computing applications show that switching between the application and the help window can be problematic, especially for novice users [33]. • Limited display real estate: Many games make use of popup text help. Hovering with the cursor over an element on the screen causes additional explanatory text to be displayed. However, the amount of space for popup text is very limited. The use of portable gaming devices with small display sizes and limited resolution, including cell phones and PDAs, makes display of text-based help material even more challenging [27].

3.1 Multimodal Streams The Auditory Help User Interface Model proposed by Neff and Pitt in this paper (Figure 1) attempts to take a cognitive snapshot of an instance of game play while using an auditorybased help system. The model incorporates an adaptation of the Baddeley & Hitch Working Memory Model [2, 3, 4] with the integration of some of the theories of Jones et al. [31, 32]. It is the Working Memory component along with the Subtask Attention & Inhibitor Manager that is perhaps the most relevant for this paper.

• Paradox of the active user: Carroll and Rosson [11] explain why users generally start using software immediately, without consulting manuals or other available help material. Also, when users do encounter a problem they are reluctant to break away from their current task, which is their primary focus, and consult the available help material. Use of speech technology to provide help to players within the context of the game can mitigate some of these problems, and complement traditional online help systems. Speech technology can be used to provide help information to players in situations where their hands and eyes are busy. Help information can be presented aurally in situations where effective visual display of help material is not possible.

3.1.1 Visual and Aural Streams The aim of the model is to identify and avoid possible bottlenecks in an environment such as that described in this paper and to accentuate areas that will facilitate the transfer of information to the user efficiently. We hypothesize that the use of an auditory help system relaying information to the user whilst he/she is interacting with a visual display will balance the cognitive load more effectively.

3. AUDITORY HELP SYSTEM MODEL The successful integration of an auditory help system into a largely visual game environment will require efficient interaction between the machine and the player's cognitive processes. This section of the paper describes how that interaction works. An understanding of the cognitive factors involved will help us to avoid design flaws by allowing us to anticipate the data flow of visual and auditory information. This allows us to identify areas of segregation and areas of consolidation of visual and auditory streams, possibly revealing points of perceptual interaction and interference. Unlike visual information, there is an inherent lack of external memory (display) for auditory information. Therefore, we need to maximally utilize the human memory components relating to auditory information and avoid perceptual interaction and interference with the visual content. Although visual and aural data utilize similar and closely related short-term storage components, the user interface model presented here (Figure 1) recognizes the separation of aural and visual information at critical stages within working memory. There is also the issue of segregation between lexical and acoustic content that is conveyed in the model and that has significant implications for our design and implementation of the auditory help system itself. All of these factors (vision, speech and non-speech sounds), though potentially discrete, are highly reactive with each other in most HCI applications. It is the potential discreteness of these factors that we hope to exploit through recognizing the cognitive constraints.


1 An earcon is defined as a "…non-speech audio messages consisting of abstract, musical tones that are used in structured combination. As there is no intuitive link between an earcon and what it represents, earcons have to be learnt." [56]

2 An auditory icon is defined as "…natural, everyday sounds that are recorded in the environment and mapped to system events by analogy. The advantage of auditory icons is that only a minimal learning effort is required to understand the connection between sound and the to-be-represented object." [56]

Figure 1: Auditory Help User Interface Model.

We hope to achieve this by examining aspects of human working memory, particularly with respect to maximizing its storage capacity by separating auditory and visual information. We also need to look at attention and inhibition aspects of dualtask, bimodal situations.

This attention and inhibition function is a top-down process and cortical regions have been demonstrated to correlate with this functioning [29]. During bimodal auditory and visual tasks, cross-modal interaction occurs between the respective cortices. Johnson and Zatorre [29] showed how during bimodal tasks where attention is focused on one of the tasks, cortical activity associated with the attended sense was increased whereas activity of cortical areas associated with the unattended sense was decreased. Passive attention during bimodal presentation shows balanced activity between auditory and visual cortices and even unimodal tasks display significant inhibition of the cortical areas associated with the absent sense. These observations indicate active cross-modal interaction between cortices depending on the user's attention on tasks. Attention mechanisms therefore force suppression of unattended modalities and this subsequently affects memory encoding.

Unimodal visual dual-tasks exhibit a sharing cost of 15% between both visual streams when the subject is instructed to give 100% attention to the primary task and 0% to the secondary task [8, 54]. This sharing cost does not occur in similar circumstances between auditory and visual streams, therefore it is reasonable to conclude that it is easier to monitor two streams in different sensory modalities than in the same modality [8, 44, 55], or indeed it is easier to concentrate more efficiently on one sensory stream whilst ignoring the other. In cases where auditory and visual information is complementary or where attention is not significantly distracted from either task, simultaneous presentation of auditory and visual information produces little or no degradation in short-term item-specific or contextual data retention [46]. Therefore, with careful consideration to issues concerning divided-attention and inhibition, it may be possible to present auditory help information simultaneously with visual game-play while retaining in some measure discrete pathways of auditory and visual information retention, thus reducing cognitive load or increasing cognitive processing efficiency. Baddeley's Working Memory Model accounts for such a separation between speech based and visual-spatial based information, although visual retention may have some slight degradation in bimodal scenarios if the dual-coding theory of vision is legitimate. Therefore, the bottleneck of concern is not so much memory capacity or organization in this case, but rather the cognitive interface between incoming data and the memory components. This cognitive interface is essentially the process that allocates resources between current tasks, indicated by Baddeley as involving the Central Executive, but here specified as a managerial element controlling attention and inhibition (the Sub-task Attention & Inhibition Manager, labeled "StAIM" in Figure 1).

Cross-modal errors are a consequence of this attention and inhibition action during dual-task situations in both bimodal and unimodal interfaces. This is particularly so if information is incompatible for both tasks, with complementary information having a positive effect by enhancing the perceptual experience and retention capacity. Cross-modal errors have been demonstrated using haptic and visual tasks [13, 46], and in unimodal dual-task environments involving vision alone [30, 46]. This is primarily a result of divided-attention at time of encoding and not affiliated to storage and retrieval inadequacies. Therefore, the degree of attention allocated to each task is a significant consideration in auditory-visual interfaces, where cross-modal errors are also an anticipated



possibility. It is important to note that the degree of attention forced on one task over another has a direct bearing on memory retention and that these errors also occur in unimodal environments where dual-tasks are involved. Therefore, in order to take advantage of working memory's separation of auditory and visual information, aspects of attention need to be considered so as to limit cross-modal errors in particular situations.

The notion of a single store for various acoustic attributes, whether for speech or non-speech, seems consistent with the issues raised by Jones and Macken [31] in relation to the Irrelevant Sound Effect. This effect was previously known as the Irrelevant Speech Effect due to the notion that speech significantly interfered with subvocal rehearsal of visually presented text/digits, but non-speech sound did not. However, when matching the acoustic traits of non-speech sound with that of speech, disruption to visually presented text/digits is found to be comparable. Jones et al. [31, 32] determined that it is the changes in composition of the sonic stream that cause the disruption and not the lexical or semantic content of speech the changing state hypothesis. This correlates with the fact that a speech stream typically changes very significantly in terms of its physical characteristics over time.

3.1.2 Speech and Non-speech Sounds Non-speech sounds have several advantages over speech [42]. They can present information quickly. They can be attention grabbing. When cognitive considerations are taken into account, they do not interfere with speech processing. They’re already widely used in computer applications for signifying the success or failure of an operation. Because these sounds are typically shorter in duration than the equivalent spoken language message and easily attract the user’s attention, this feature may also be exploited in parallel with synthesized speech output to provide additional information to the user. For example, they can be used to cue speech, so that the listener is prepared to hear speech and does not miss leading words. They can be used to signify help system structural information such as topic titles, list items, the availability of more detailed help information, hyperlinks and bullets.

The use of very short, unvaried sonic events such as earcons and auditory icons and indeed speech keywords in the relay of information is therefore an important design principle if the changing state theory is valid. Slightly longer but repetitive sonic events also concur positively with the changing state hypothesis. It is our opinion that designers need to restructure help files from the ground up when presentation is implemented using speech rather than a direct text-to-speech implementation of visually designed help topics.

Non-speech sounds, if acoustically diverse from speech, can be used to work around the problem associated with the auditory suffix effect. This problem is encountered when presenting the trailing “related topics” and “notes” type of information that is commonly used in help topics. Studies [e.g. 17] show that presentation of non-speech sounds after speech does not significantly impair recall of the previously spoken words. A non-speech sound could be used to indicate availability of additional information. The user could access this additional information if required.

3.2 Game & Help Flow through the Model The model (Figure 1) represents the interaction between the relevant human cognitive processes and visual game-play cum auditory help system.

Non-speech sounds could also be used to assist in presentation of information that is not very suitable for rendering through speech. This includes items such as an indication of the presence of graphic items. However, care should be taken with the use of non-speech sound to ensure it is not distracting, and that it is correctly synchronized with speech output.

The user encounters a problem during game-play and creates a cognitive representation of that problem. The user’s experience will determine the resolution of such an internalization of the problem and this will have repercussions on various outcomes throughout the model. User-experience is a significant constraint throughout most of the model’s stages.

This brings us onto the perceptual issues associated with the implementation of an auditory help system that integrates speech and non-speech sounds. The original aural component of Baddeley’s Working Memory Model, the phonological loop, does not sufficiently explain the integration of non-speech sounds. Many studies have indicated that non-speech sounds are processed and stored separately in working memory [7, 19, 21] and indeed Baddeley and Salamé [48] indicate that this is a possibility if their theory of a speech detector within working memory is valid. However, many recent studies suggest that speech and non-speech sounds are processed and stored in a single auditory component in working memory [49, 50, 56]. This component may be subsequently subcategorized into various acoustic stores with a stripping of lexical material outside of these acoustic stores.

Having made a cognitive representation of the problem, the user now needs to find advice on the issue. Perhaps the first question raised in this instance is how to find such advice. The user’s long-term memory is now called upon to submit scenarios saved from similar previous experiences, referred to as schemas [14]. Schema-theory promotes the notion of interactive processing between top-down (conceptually-driven & generalizations based on personal experiences) and bottomup (data driven & based on sensory data analysis). Various schemas are tested until the most appropriate one is found for the current problem. As part of the chosen schema, the user has isolated where to find advice for the problem (in this case, locating the help system for the game). Based on the accuracy of the chosen schema (determined by user experience or lack thereof), the problem itself and how to engage with the help system needs to be broken down into smaller units called subtasks. Schema rules apply and therefore schema expectations of particular incoming information act like a sensory filter. In effect, a schema aids in the interpretation of incoming sensory data. This schema filter applies to both incoming visual game-play and auditory help system data.

Such non-lexical acoustic subcategories may follow the psychoacoustic approach of Bregman's auditory scene analysis [9], but encompassing both speech and non-speech acoustical characteristics. For example, there is evidence suggesting that pitch memory pulls pitch information from both vocal and nonvocal input, but is deaf to other sonic dimensions [50]. Complex, flat tones consistent in spectral content and range with vocalized speech show significant interference in pitch memory, suggesting that there is a single storage system for pitch content of both speech and non-speech input [50]. This is contrary to the belief that, once identified as speech, a signal's acoustic attributes (including its pitch) are separately stored in working memory and that non-speech content has no access to this store.


Although for the most part only the expected data is readily noticed and passes through the filtering process, anomalies that particularly stand out are also noticed and allowed past the filter. Unlike the auditory help system, visual game-play has the advantage of external memory, or in other words most of the visual information remains displayed onscreen allowing for detailed reference by the user at a later stage. More important in this case, however, is the allocation of resources for information passing the schema filter. The incoming sensory information is combined with the subtasks and awaits admittance to working memory. Highly sensitive to top-down influences and the overall task scenario, the Subtask Attention & Inhibitor Manager allocates information to working memory. As already mentioned, the degree of attention required for any one task over another at any one time determines the success of data retention in working memory. There is also the issue of inhibition. De Fockert et al. [24] suggest that when working memory load increases, our ability to inhibit irrelevant information decreases. Therefore, an overload of either visual or aural content may lead to effects such as the Irrelevant Sound Effect, again proving the need to strictly evaluate and balance presented material – in our case auditory help files.

Figure 2: Audio Configuration.

The standard game audio remains directed to the gaming system speakers. The player hears the auditory help system output on the headset earphone. The player can optionally use the headset microphone to issue voice commands to control access to the help system, i.e., the player hears normal game audio through the speakers and interacts with the help system via the headset. Given the location differences between game audio and help system audio (speakers and mono-headset respectively), this reduces the possible interaction and interference between these two sources. That said, it is an important element to consider for future implementations of the auditory help system, especially in relation to games that implement complex 3-D environmental sound effects and orchestrated music scores. The games we have currently chosen are simpler in relation to game sonic events, for the purpose of more immediate implementation and testing. We are aware of the more complex issues in more elaborate games and we hope to look at these issues in our future work.

However, with the assumption that there is an ideal balance between visual game-play tasks and auditory help system tasks, the user stores critical auditory information separately from visual information. Although there are limits to each individual working memory component, presenting auditory information and visual information as discretely as possible may maximize retention capacity, which may have a knock-on effect on processing efficiency. Since the help topics are auditory only, issues such as subvocal rehearsal, which links the visual and auditory stores, do not arise. Other sensory data relating to smell, taste and touch are recognized but not relevant to our study. The acoustic attributes of the sonic streams (such as pitch) are saved in their individual sonic dimension stores. Sonic events related to each other, such as music and environmental sonic events, are combined in the Episodic Acoustic Store. Separately, and after it has gone through its acoustic stripping, the lexical content of speech is stored in the Speech Lexical Store.

4. DESIGNING AUDITORY HELP As discussed in the previous section of the paper, an effective auditory help system requires an efficient interaction between the machine and a user's cognitive processes. The design of the actual help topics themselves is an equally important consideration. While it is technically possible to present most help information aurally, it may not always be an effective method of presenting the information. Only certain categories of help information can be effectively presented using audio. Also, each help topic should be designed such that it can be presented effectively using speech.

The Universal Episodic Buffer then appropriately contextualizes relevant auditory and visual information, packaging it for the data retrieval routines. Information may also be accessed from external memory depending on the state of working memory and other cognitive processes. Based on the information attained from the help files and the visual data, the user attempts the most appropriate next move in the game. Before this end-goal is acquired however, all necessary subtasks need to have been completed. If not, then the next subtask is attended to. If all subtasks have been completed, then the user makes his/her next move.

4.1 Categories of Help

One popular categorization of help information is procedural, conceptual, reference and instructional [26]. These various categories of help differ in terms of suitability for presentation to users through speech and sonification. Procedural help information details a list of steps required to complete a specific task. Conceptual information is used to provide overview, supporting theory and background detail. Procedural help and short conceptual topics with a simple sentence structure can be presented effectively using speech. Instructional help information supports the requirements of users who want to learn and typically includes elements from both procedural and conceptual help. For example, there may be a step-by-step procedure listing, with some supporting conceptual information to explain the details of the procedure. The use of speech output may be especially appropriate for use with instructional help topics. Several studies [38] demonstrate that students learn better with speech and visuals, rather than text and visuals. Application of the minimalist instruction approach outlined by Carroll [10] typically results in shorter task-oriented topics suitable for presentation aurally. Conceptual help topics targeted at advanced users often contain complex sentence structures and diagrams. Reference information typically includes elements such as lists, tables, etc. This information is difficult to present effectively using speech alone.

3.3 Delivery Mechanism The experimental auditory help system uses a mono headset. Figure 2 shows the audio channel configuration. The system has been designed so that it can be integrated in games with minimum disruption to the existing game audio implementation.




Using the markup language to specify prosody can decrease listener fatigue and enhance listener comprehension [43]. Use of the lexicon feature can ensure correct pronunciation of items such as proper names, acronyms, and game-specific terms, etc. Markup language elements can be used to resolve ambiguity with respect to the correct pronunciation of written words and ensure appropriate text normalization. For example, 3/4 may be pronounced as three quarters, 3rd of April or 4th of March depending on the context.

4.2 Authoring Effective Auditory Help Topics To date, most online help material has been developed with the assumption that the material will be read. We are developing a set of guidelines to assist in the creation and testing of help material that is to be presented aurally [34]. These guidelines are derived from a variety of sources including experience in development of speech based devices [43], guidelines used for development of prompts in speech based telephony applications [15] and practical experience in the development of an experimental auditory help system. The guidelines under development incorporate the following elements:

4.3.2 Voice and Persona Persona is defined as the role that one assumes or displays in public or society; one's public image or personality. When people listen to a voice they infer items such as gender, age, ethnicity and much more. Research suggests that while interacting through speech people will perceive a persona, whether this has been designed or not [39]. Care should be taken in selecting a voice so that the speech persona is consistent with the game.

• Cognitive Load: Topics should be kept short and relevant so as to minimize the time required to listen to the topic and reduce the listener cognitive load. Multimodal designs that use small amounts of display space have potential to mitigate some of these problems [41] and may be a useful option for games.

In a gaming context it would also be possible to present help using “conversational agents” or “talking heads”. A viseme is the facial position associated with articulating a certain sound. As a speech synthesis engine produces output it can supply another application with viseme information which can be used to drive graphical facial animations. The use of conversational agents has been shown to improve understanding of speech in certain situations such as noisy environments [42].

• Conversational Language: Information presented using speech should use elements of conversational language. The use of such elements from conversational language can impart naturalness to synthesized speech and facilitate comprehension. Using common words and a simple sentence structure will help the speech synthesis engine produce good quality speech output.

5. HELP TOPIC STRUCTURE There is a growing awareness of the advantages of using a single source for product documentation. The material in this single source repository can be transformed and presented to users in a number of formats, or through different media [47]. Help topics are designed so that the same information can be presented effectively using either traditional help text display methods or aurally.

• Position of Information: The placement of new and focal information in spoken language is important. Application of the End-Focus Principle [45] should result in the placement of such information at the end of a sentence for the English language. • The auditory suffix effect [17] refers to a decrease in accuracy of recall of the last few items in a spoken list if the spoken list is followed by additional speech stimulus. In online help documentation today it is common for help topics to have trailing “related topics” and “notes” type of material at the end of a topic. To avoid auditory suffix effects this trailing material should be omitted from the aural topic presentation.

5.1 Help Topic Elements Streamlined-step [23] help topics are very prevalent in computer help systems and documentation. This style of help topic is in line with much of the best practice relating to design of help information that has evolved over the past 20 years.

• Chunking: Chunking can help users remember information [37]. Certain command, number, letter and keystroke sequences are frequently grouped for easier understanding e.g. numbers, command and hotkey combinations.

4.3 Using Text-To-Speech Output In the past studies have shown that people find synthesized speech more difficult to understand [12]. However, for many online help systems the option of using human speech recordings is not a realistic one since help material can evolve and change over time. The cost of creating and updating human speech recordings can be very significant [18].

4.3.1 Using Speech Synthesis Markup Language The role of Speech Synthesis Markup Language is to give authors of synthesizable content the capability to control and customize a numbers of attributes of the speech output such as

Figure 3: Help Topic Elements



Such help topics are typically short, have a well defined format and a simple sentence structure. This type of topic can be effectively presented aurally. Figure 3 shows a sample topic. Each speech help topic is constructed from a number of components, some of which are optional.

6. IMPLEMENTATION IN GAMES The auditory help system was integrated in a number of fairly basic open source games that run on the Microsoft Windows platform. The source code was required so that the games could be modified, and re-built, to incorporate the speech help system functionality.

6.1 Speech Technology Selection Standards such as XHTML+Voice and Speech Application Language Tags (SALT) enable multimodal access to web based information and services. These technologies could be used in the future for auditory help systems. However, the Microsoft Speech Application Interface (SAPI) was used for the initial implementation, since it allows for better integration with small standalone games.

Figure 4: Help Topic Structure.

Each topic has a title. This is followed by some optional introductory text that may be used to supply additional conceptual information relating to the help topic. There may be several subheadings within a single topic. The core detail of a topic is often a list of steps associated with a task-oriented procedure. Topics can optionally end with a trailing notes section that typically outlines some special case information.

6.1.1 Supporting Sonification with SAPI The Speech Synthesis Markup Language (SSML) “mark” element [52] can be used to trigger playback of non-speech sounds in synchronization with the synthesized speech output.

5.2 Unsuitability of HTML-based Formats

At present many of the most widely used help system storage formats are HTML-based. Topics are often authored and viewed using HTML editing tools. The resulting topics contain elements that relate to both structure and presentation. For example, the HTML <h3> element (header level 3) should refer to the logical structure of the information, but may have been used to obtain a particular display style. The use of tags to make text appear visually bold or italic does not give information as to why it should appear that way. For example, some help files used bold text to highlight a topic header, and also used bold text to refer to elements of the user interface.

Figure 5. Use of SSML "Mark" Element

The availability of accurate structural information relating to help topics is important for optimal rendering through speech. For example, the structural information associated with various header levels could be used for inserting appropriate pauses in the speech output. The blurring between structure and presentation means that it is difficult to present some preexisting HTML based help material optimally using speech. The use of a presentation-independent help storage format will facilitate much more effective presentation of the help information using speech.

The User Interface Model (Figure 1) was important in guiding an effective sonification implementation i.e. which sounds were used, and how they were used. The initial system implementation uses a small number of non-speech sounds. These are used to present status information, and information relating to help topic structure, e.g. title, bullet points, trailing notes, etc. Help topics are stored in XML files that are independent of the presentation mechanism. Thus, the same base help topics can be transformed for presentation aurally to the user, or displayed graphically as text.

5.3 Selection of DITA

There are a number of architectures, tools and technologies that could be used to support a single source help system repository. Two of the most widely used XML based ones are DocBook and Darwin Information Typing Architecture (DITA). DocBook is well suited to support generation of a continuous technical narrative, usually in a book or article format. Using transforms it is possible to chunk this technical narrative into a number of distinct topics that could be used in a help system. However, DITA is a more appropriate choice for use with an auditory help system. DITA was designed for discrete technical topics. Each DITA topic is a chunk of information on a single subject. Structurally, it is a title followed by text and images, optionally organized into sections. It also supports distinct topic types (concept, task, reference) that map well to the types of help topics that are commonly used in online help systems for software applications.
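As an illustration, a streamlined DITA task topic for a game could look like the fragment below. The topic id, title and step texts are invented here for illustration; only the element names are standard DITA task markup.

    <task id="save_game">
      <title>Saving your game</title>
      <shortdesc>Save your progress so you can continue the game later.</shortdesc>
      <taskbody>
        <context>You can save at any point outside of combat.</context>
        <steps>
          <step><cmd>Press the Menu key to open the game menu.</cmd></step>
          <step><cmd>Select Save Game.</cmd></step>
          <step><cmd>Choose an empty slot and confirm.</cmd></step>
        </steps>
        <result>A confirmation sound is played when the game has been saved.</result>
      </taskbody>
    </task>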

Help topic XML files are transformed to SSML and submitted to the TTS engine. The SSML “mark” element is used to identify a particular point in a document. The SSML input to the TTS engine includes mark elements relating to help topic structure. The TTS engine notifies the help system when a mark element is encountered and the appropriate non-speech sounds are generated in a synchronized manner with the synthesized speech.
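For example, a transformed topic could contain SSML of roughly the following shape, where each mark name acts as a cue for the help system to play the corresponding non-speech sound. The mark names and sentence text here are illustrative, not the system's actual output.

    <speak version="1.0" xmlns="http://www.w3.org/2001/10/synthesis" xml:lang="en-US">
      <mark name="topic-title"/>
      <p>Saving your game.</p>
      <mark name="step"/>
      <s>Press the Menu key to open the game menu.</s>
      <mark name="step"/>
      <s>Select Save Game.</s>
      <mark name="trailing-notes"/>
      <s>Saved games can also be loaded from the title screen.</s>
    </speak>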

6.2 User Controls When presenting a help topic aurally it is important that the listener has the ability to control the speech help system output [28]. For example, while listening to a topic the player may want to temporarily pause the output, or perhaps listen to a


• Help Topics with Multiple SubHeadings: Some help topics consist of several related subtopics, each with a subheading. Typically the player will only be interested in one of the subtopics. While visually examining such a help topic the player can scan through the topic to find the relevant subsection. However, presenting all the subtopics sequentially is inefficient aurally because the player must listen to each subtopic in turn and determine if that is the relevant one. When subtopics are used we plan to explore the effectiveness of players being allowed to listen to and select from subtopic headings.

topic again. As a result, standard multimedia controls for auditory help output such as play, pause, stop, repeat and mute must be supported. These controls can be activated via the mouse/keyboard, or using spoken commands that are processed by a speech recognition engine. The European Telecommunications Standards Institute Technical Committee Human Factors (ETSI-HF) has created a standard basic set of spoken commands that supports media controls [22]. These commands have been designed to be easy to learn, memorable, natural and unambiguous. The commands are suitable for use in speaker independent applications, so training of the speech recognition engine on a per-player basis is not required. These commands are used for the auditory help system.
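In code, the recognized command string can be reduced to a small set of player-facing actions. The C++ sketch below is one way to do this; the command strings are placeholders rather than the normative ETSI vocabulary, and handleHelpCommand is an invented name.

    #include <map>
    #include <string>

    enum class HelpAction { Play, Pause, Stop, Repeat, Mute, None };

    // Map a recognized spoken command to a help-system action.
    HelpAction handleHelpCommand(const std::string& recognizedText)
    {
        static const std::map<std::string, HelpAction> commands = {
            {"play", HelpAction::Play}, {"pause", HelpAction::Pause},
            {"stop", HelpAction::Stop}, {"repeat", HelpAction::Repeat},
            {"mute", HelpAction::Mute},
        };
        const auto it = commands.find(recognizedText);
        return it != commands.end() ? it->second : HelpAction::None;
    }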

• Importance of Context: Shorter task-oriented context specific help topics are especially suitable for presentation aurally. For this to work effectively it requires a good deal of coordination between the design of the help material and integration of support for context-specific help i.e. the game code must be capable of selecting and presenting an appropriate help topic at a given stage in the game play.

Studies on speech applications [57] have shown that it is desirable to provide the user with some feedback regarding the status of speech recognition component. Our system implements the commonly used techniques of displaying a microphone input signal level meter and the current speech recognizer state.
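As an illustration (not the authors' implementation), the sketch below shows how a small set of speaker-independent media-control commands could be dispatched to the help player. The command words here are illustrative rather than the normative ETSI ES 202 076 vocabulary, and HelpPlayer is a hypothetical interface:

    # Hypothetical dispatch of recognized spoken commands to the help player.
    class HelpPlayer:
        def play(self):   print("resuming help topic")
        def pause(self):  print("pausing help topic")
        def stop(self):   print("stopping help topic")
        def repeat(self): print("repeating current section")
        def mute(self):   print("muting help output")

    COMMANDS = {
        "play": HelpPlayer.play,
        "pause": HelpPlayer.pause,
        "stop": HelpPlayer.stop,
        "repeat": HelpPlayer.repeat,
        "mute": HelpPlayer.mute,
    }

    def on_recognition_result(player: HelpPlayer, utterance: str) -> None:
        """Called by the speech recognition engine with the recognized text."""
        action = COMMANDS.get(utterance.strip().lower())
        if action is not None:
            action(player)
        # Unrecognized input is ignored; the on-screen recognizer state and
        # level meter give the player feedback that the system is listening.

    on_recognition_result(HelpPlayer(), "Pause")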

6.3 Games

The games used for the trial implementation were open source versions of Sokoban (http://www.jabbah.nl/sokobanpro) and Reversi (http://www.brainjar.com/dotNet/Reversi/).

The initial focus of the work has been to support presentation of context-specific help topics. The games were modified to add support for context-specific help (neither game had this capability). Thus the game can let the player know that there is a relevant help topic available using audio and visual alerts. The player can then access that context-specific help topic if required. A range of context-specific help topics was designed.

6.4 Initial Findings

During the process of creating the first working prototypes, we conducted usability evaluations in the style of Nielsen's formative evaluation philosophy [40]. This involved changing and retesting the auditory help system as soon as usability problems were uncovered. This highlighted a number of significant challenges in the creation and presentation of auditory help topics.

• Listening to Auditory Help Topics: The significant differences between written and spoken versions of a language mean that speech help topics must be listened to (as opposed to being just read) to ensure correctness and intelligibility. A speech synthesis engine will automatically apply its own rules for synthesizing speech. However, the synthesized speech output may not always be correct, or optimal. Use of the markup language capabilities allows the author to override and change the synthesized speech output where required. Also, there are significant differences in the quality of the speech output between different speech synthesis engines [53]. It is important to test with the actual engine that is to be used in the deployed application.

• Attempts to use Existing Help Material: The Reversi game already had some HTML help material. However, the existing material was not very suitable for speech presentation. The reasons for this included the length of the topics and the extensive use of elements such as graphics and screenshots. In general, significant restructuring of pre-existing help material is required for effective use in an auditory help context.

• Help Topics with Multiple Subheadings: Some help topics consist of several related subtopics, each with a subheading. Typically the player will only be interested in one of the subtopics. While visually examining such a help topic the player can scan through it to find the relevant subsection. However, presenting all the subtopics sequentially is inefficient aurally because the player must listen to each subtopic in turn and determine whether it is the relevant one. When subtopics are used we plan to explore the effectiveness of allowing players to listen to and select from subtopic headings.

• Importance of Context: Shorter, task-oriented, context-specific help topics are especially suitable for aural presentation. For this to work effectively, a good deal of coordination is required between the design of the help material and the integration of support for context-specific help, i.e. the game code must be capable of selecting and presenting an appropriate help topic at a given stage in the game play.

6.5 Future Work

We are planning to perform controlled studies to quantify the effectiveness of the auditory help system when used in a game context. These include:

• Speech Dialog Elements, Search and Navigation: Users will often consult other expert users when they encounter a problem. These experts may be co-workers or family members located in the same room, or perhaps a technical support representative at the other end of a phone connection. In these situations speech is the primary method of communication used to provide assistance. We are exploring using elements of interactive spoken dialog systems [36] to optimize speech output and to support search and navigation within the help system.

• Customized Help: The help needs and preferences of users differ with expertise level [20, 26]. Use of multidimensional categorization [1, 6] of users can enable the creation of more targeted and relevant help topics. The topic of dynamic generation of help material is especially important when presenting help material through speech.

• Improving Integration of Game and Help System Audio: In increasingly complex games, the importance of audio during game play is becoming more widely recognized, both from a functional and an aesthetic point of view. This has led to increased efforts by the industry to implement more immersive and complex sonic attributes in games. It is not unusual for games to now include all three audio categories: music, sound effects, and dialogue [5]. This means that, from an entertainment standpoint, the audio environment is complex and multi-streamed. However, these sonic elements are also increasingly interactive and tied closely to game play, contributing to informing the user about game states, alerts, warnings and moods. Therefore, with careful sonic consideration, sound effects, dialogue and music may also be used to contribute to the auditory help system. We plan to implement 3-D sound in later projects, as we envisage that this will reduce auditory interference between streams by separating them more obviously and will add another dimension to aural data presentation.

7. SUMMARY

Speech technology is being used in a growing number of fields as a result of increasing capabilities of computing platforms and on-going improvements in speech technology performance. An auditory help system can be used to tackle some of the well-known challenges of providing help in a game play scenario, and offers opportunities to enhance the gaming experience. This paper describes how an auditory help system can be effectively incorporated in a game, and discusses the design of help information suitable for use in such a scenario.

8. REFERENCES

[1] Albers, M. J. Multidimensional analysis for custom content for multiple audiences. SIGDOC 2003, 1-5.
[2] Baddeley, A. D. Your Memory - A User's Guide. Prion, London, 1996.
[3] Baddeley, A. D. The episodic buffer: a new component of working memory? Trends in Cognitive Science, 4, 2000.
[4] Baddeley, A. D., and Hitch, G. Working memory. In: G. H. Bower (Ed.), The psychology of learning and motivation: Advances in research and theory, Vol. 8. Academic Press, New York, 1974, 47-89.
[5] Barry, I. Game Design. In: S. Rabin (Ed.), Introduction to Game Development. ISBN: 1584503777, 2005.
[6] Bernsen, N. O., Dybkjær, H., and Dybkjær, L. Designing Interactive Speech Systems. ISBN: 3540760482, 1999.
[7] Berz, W. L. Working Memory in Music: A Theoretical Model. Music Perception, Vol. 12, No. 3, 1995, 353-364.
[8] Bonnel, A. M., and Hafter, E. R. Divided attention between simultaneous auditory and visual signals. Perception & Psychophysics, 60 (2), 1998, 179-190.
[9] Bregman, A. S. Auditory Scene Analysis: The Perceptual Organization of Sound. MIT Press, 1990.
[10] Carroll, J. M. The Nurnberg Funnel: designing minimalist instruction for practical computer skill. MIT Press, 1990.
[11] Carroll, J. M., and Rosson, M. B. The paradox of the active user. In: Interfacing Thought: Cognitive Aspects of Human-Computer Interaction. MIT Press, 1987.
[12] CCIR-5. User attitudes towards real and synthetic speech. Centre for Communication Interface Research, University of Edinburgh, 1999.
[13] Cinel, C., Humphreys, G., and Poli, R. Cross-modal illusory conjunctions between vision and touch. Journal of Experimental Psychology: Human Perception and Performance, 28 (5), 2002, 1243-1266.
[14] Cohen, G., Kiss, G., and Le Voi, M. Memory - Current Issues, second edition. In: J. Greene (Series Editor), Open Guides to Psychology. Open University Press, 1993.
[15] Cohen, M., Giangola, J., and Balogh, J. Voice User Interface Design. ISBN: 0321185765, 2004.
[16] Conrad, R. Very Brief Delay of Immediate Recall. Quarterly Journal of Experimental Psychology, 12, 1960, 45-47.
[17] Crowder, R. G., and Morton, J. Precategorical acoustic storage. Perception & Psychophysics, 5, 1969, 365-373.
[18] Davison, G., Murphy, S., and Wong, R. The use of eBooks and interactive multimedia as alternative forms of technical documentation. SIGDOC 2005, 108-115.
[19] Deutsch, D. Tones and Numbers: Specificity of Interference in Immediate Memory. Science, 168, 1970, 1604-1605.
[20] Dreyfus, H. L., and Dreyfus, S. E. Mind over Machine: The Power of Human Intuition and Expertise in the Era of the Computer. The Free Press, New York, NY, 1986.
[21] Edwards, A. Using sounds to convey complex information. In: A. Schick and M. Klatte (Eds.), Contributions to Psychological Acoustics: Results of the Seventh Oldenburg Symposium on Psychological Acoustics. Bibliotheks- und Informationssystem der Universität Oldenburg, 1997.
[22] ETSI. Generic spoken command vocabulary for ICT devices and services. ETSI ES 202 076, V1.1.2, 2002-11.
[23] Farkas, D. K. The Logical and Rhetorical Construction of Procedural Discourse. Technical Communication, Vol. 46, No. 1, February 1999, 42-54.
[24] de Fockert, J. W., Rees, G., Frith, C. D., and Lavie, N. The role of working memory in visual selective attention. Science, Vol. 291, No. 5509, 2001, 1803-1806.
[25] Gee, J. P. What Video Games Have to Teach Us About Learning and Literacy. ISBN: 1403965382, 2003.
[26] Hackos, J. T., and Stevens, D. Standards for Online Communication. ISBN: 0471156957, 1997.
[27] Hayhoe, G. F. Creating usable online documents for wireless and handheld devices. IEEE IPC Conference, 2001.
[28] Holzman, T. G. Speech-audio interface for medical information management in field environments. International Journal of Speech Technology, 4 (3-4), 2001.
[29] Johnson, J. A., and Zatorre, R. J. Attention to Simultaneous Unrelated Auditory and Visual Events: Behavioral and Neural Correlates. Advance Access, 2005.
[30] Jones, T. C., Jacoby, L. L., and Gellis, L. Cross-modal feature and conjunction errors in recognition memory. Journal of Memory and Language, 44, 2001, 131-152.
[31] Jones, D. M., and Macken, W. J. Irrelevant Tones Produce an Irrelevant Speech Effect: Implications for Phonological Coding in Working Memory. Journal of Experimental Psychology: Learning, Memory, and Cognition, Vol. 19, No. 2, 1993, 369-381.
[32] Jones, D. M., Madden, C., and Miles, C. Privileged access by irrelevant speech to short-term memory: The role of changing state. Quarterly Journal of Experimental Psychology, 44A, 1992, 645-669.
[33] Kearsley, G. Online Help Systems: Design and Implementation. Ablex Publishing Corp., 1988.
[34] Kehoe, A., and Pitt, I. Designing Help Topics for use with Text-To-Speech. SIGDOC 2006.
[35] Koster, R. A Theory of Fun for Game Design. ISBN: 1932111972, 2004.
[36] McTear, M. F. Spoken Dialogue Technology: Enabling the Conversational User Interface. ACM Computing Surveys (CSUR), ACM Press, NY, USA, 2002, 1-85.
[37] Miller, G. The magical number seven, plus or minus two: Some limits on our capacity for processing information. Psychological Review, 63, 1956, 81-97.
[38] Moreno, R., and Mayer, R. E. A learner-centered approach to multimedia explanations: Deriving instructional design principles from cognitive theory. Interactive Multimedia Electronic Journal of Computer Enhanced Learning, http://imej.wfu.edu, 2000.
[39] Nass, C., and Lee, K. M. Does computer-generated speech manifest personality? SIGCHI Conference on Human Factors in Computing Systems, 2000, 329-336.
[40] Nielsen, J. Usability Engineering. ISBN: 0125184069, 1994.
[41] Oviatt, S. User-centered modeling for spoken language and multimodal interfaces. IEEE Multimedia, 3(4), 1996, 26-35.
[42] Pandzic, I. S., Ostermann, J., and Millen, D. User Evaluation: Synthetic Talking Faces for Interactive Services. The Visual Computer Journal, Vol. 15, No. 7-8, Springer-Verlag, Berlin, 1999, 330-340.
[43] Pitt, I., and Edwards, A. Design of Speech Based Devices. ISBN: 1852334363, 2002.
[44] Proctor, R. W., and Proctor, J. D. Secondary tasks modality, expectancy, and the measurement of attentional capacity. Journal of Experimental Psychology: Human Perception & Performance, 5, 1979, 610-624.
[45] Quirk, R., and Greenbaum, S. A Concise Grammar of Contemporary English. Harcourt Brace Jovanovich Inc., New York, 1973.
[46] Richard, J. A. Remember the Source: Effects of Divided Attention on Source Memory for Modality with Visual and Auditory Stimuli. Bryn Mawr College.
[47] Rockley, A. The impact of single sourcing and technology. Technical Communication, 48:2, 2001, 189-193.
[48] Salamé, P., and Baddeley, A. D. Effects of background music on phonological short-term memory. Quarterly Journal of Experimental Psychology, Vol. 41A, 1989.
[49] Schendel, Z. A., and Palmer, C. Suppression effects on musical and verbal memory. Memory and Cognition, 2006.
[50] Semal, C., Demany, L., Ueda, K., and Hallé, P. A. Speech versus nonspeech in pitch memory. The Journal of the Acoustical Society of America, Vol. 100, No. 2, 1996.
[51] Shneiderman, B., and Plaisant, C. Designing the User Interface. ISBN: 0321197860, 2004.
[52] SSML, Speech Synthesis Markup Language. www.w3.org/TR/speech-synthesis.
[53] Stevens, C., Lees, N., Vonwiller, J., and Burnham, D. Online experimental methods to evaluate text-to-speech. Computer Speech and Language, 19, 2005, 129-146.
[54] Taylor, M. M., Lindsay, P. H., and Forbes, S. M. Quantification of shared capacity processing in auditory and visual discrimination. Acta Psychologica, 27, 1967, 223-229.
[55] Treisman, A. M., and Davies, A. Dividing attention to ear and eye. In: S. Kornblum (Ed.), Attention and Performance IV. Academic Press, New York, 1973, 101-117.
[56] Vilimek, R., and Hempel, T. (Siemens AG Corporate Technology). Effects of Speech and Non-Speech Sounds on Short-Term Memory and Possible Implications for In-Vehicle Use. International Conference on Auditory Display, 2005.
[57] Yankelovich, N., Levow, G. A., and Marx, M. Designing SpeechActs: issues in speech user interfaces. SIGCHI, 1995.


FPS Game Performance in Wi-Fi Networks

Yanni Ellen Liu (1,3), Jing Wang (1,3), Michael Kwok (2), Jeff Diamond (3), Michel Toulouse (1,3)

(1) Department of Computer Science, University of Manitoba, Winnipeg, Canada. Email: {yliu,jingwang,toulouse}@cs.umanitoba.ca
(2) David R. Cheriton School of Computer Science, University of Waterloo, Waterloo, Canada. Email: [email protected]
(3) TRLabs, Winnipeg, Canada. Email: [email protected]

ABSTRACT

Multi-player online games (MOGs) have become increasingly popular on today's Internet. Meanwhile, IEEE 802.11 (Wi-Fi) wireless networks have been widely deployed. We study how well an underlying 802.11g network supports a first-person-shooter (FPS) game, often considered the most demanding MOG in terms of network performance. We measure the latency and loss ratio performance experienced by the game traffic; these network-layer metrics were shown to have a large impact on the gaming quality experienced at the application layer. The effect on performance of the following factors was examined: the number of game clients, the distance between game clients and the wireless access point, the enabling of data encryption, and the inclusion of FTP and video streaming background traffic. Our experimental results show that FTP and video streaming traffic significantly affect the game traffic performance, whereas the distance and the use of data encryption have rather minimal impact. We also observe that when the amount of background traffic is moderate to high, the performance degrades as the number of game clients increases. Based on our observations, we suggest QoS strategies that may be used to better support games in a Wi-Fi environment.

Keywords

Multi-player games, IEEE 802.11g, wireless, Wi-Fi, QoS support, experimentation, performance, measurement

1. INTRODUCTION

Multi-player on-line games (MOGs) are computer games in which multiple game players simultaneously participate in the same game session over a computer network. Such games are increasingly popular on today's Internet due to the availability of high-speed networks and affordable high-performance personal computers. One popular system architecture for MOGs is client-server, where each game client is connected to a game server via a computer network. State update messages are transmitted between the game clients and the game server. Different types of games may have different quality-of-service (QoS) requirements on the underlying network. Such requirements may be in terms of the delay and loss ratio performance in delivering the game traffic. Among types of MOGs, first-person-shooter (FPS) games [1, 2, 3, 4, 5] often have the most stringent requirements on network performance because of the highly interactive nature of such games.

Most game clients today are connected to their respective game servers via wired networks. Over the past few years, IEEE 802.11 (Wi-Fi) [11] wireless networks have gained wide deployment. Wi-Fi access points are commonly seen in coffee shops, office buildings, university campuses, airports, and many residential homes. The capacity of Wi-Fi networks has also kept increasing, and a Wi-Fi network interface has become a standard built-in on many of today's laptop computers. In view of these advances, it is anticipated that participating in a MOG from a Wi-Fi environment may become more and more common. To better support MOGs, the capability of underlying networks needs to be evaluated with respect to the QoS requirements of these games. Such an evaluation for wired networks has been carried out in the literature (see [7, 19] for example). In comparison, less attention has been paid to the investigation of MOGs in wireless networks. In a Wi-Fi wireless network, many factors may affect the game performance. These include the number of wireless game clients, non-game traffic sharing the same Wi-Fi network with the game traffic, the wireless protocol, and physical environment parameters such as the distance and clearance of sight between wireless clients and the access point, humidity, and interference with other wireless devices. In what follows, we will refer to the non-game traffic sharing the same Wi-Fi network with the game traffic as the background traffic.

In this paper, we study the performance of FPS games in an IEEE 802.11g wireless network. We use emulated game traffic as well as other types of background traffic, and measure the latency and loss ratio performance perceived by the game traffic at the network layer, which are referred to as game traffic performance in this paper. These two metrics were shown to have a large impact on the game performance perceived at the application layer [7, 19]. Different from many existing studies in which simulation modeling was used, we take an experimental approach and evaluate the performance by measurement. The effect on performance of the following factors was examined: the number of game clients, the distance between game clients and the wireless access point, the enabling of data encryption, and the inclusion of FTP and video streaming background traffic. Our experimental results show that FTP and video streaming traffic significantly affect the game traffic performance, whereas the distance and the use of data encryption have rather minimal impact. The large impact of background video streaming traffic is as expected because, like the game traffic, it is delivered using the UDP protocol, which does not possess congestion control; when the video load is high, congestion occurs and performance deteriorates. Somewhat unexpectedly, though often considered elastic, the TCP-based FTP traffic also degrades game performance greatly. Thus both TCP and UDP traffic may need to be regulated in order to adequately support games. We provide a detailed analysis explaining these observations. In addition to the above results, we also observe that when the amount of background traffic is moderate or high, the performance degrades as the number of game clients increases. We provide insight into why the number of game clients also affects the game traffic performance. Based on our observations, we suggest QoS strategies that may be used to better support games in a Wi-Fi environment.


In our investigation, we assume that the game server is located on the same local area network (LAN) as the Wi-Fi access point to which the game clients are connected; scenarios in which a Wi-Fi network acts as an access network to a wide area network (WAN) and the game server is remotely located from the game clients are not considered. It should be noted, however, that our results would provide useful insight into QoS support for games in a wide-area wireless/wired environment when combined with results from WAN performance studies. For example, assuming an end-to-end latency requirement for interactivity is known, given the average delay performance on the wireless segment of an end-to-end game traffic path, one may estimate what range of average latency is required in the wired segment of the same path.

The rest of the paper is organized as follows. In Section 2, we provide background information for this study and review the related literature. In Section 3, we describe the test bed that we set up to evaluate the performance of FPS games in an 802.11g network. In Section 4, we present and analyze our experimental results. In Section 5, we suggest some QoS strategies to support MOGs in a Wi-Fi environment. Finally, we conclude our study in Section 6.

2. BACKGROUND AND RELATED WORK

First-person-shooter (FPS) is "a genre of computer and video games that is characterized by an on-screen view that simulates the in-game character's point of view, and is centered around the act of aiming and shooting handheld weapons, usually with limited ammunition" [2]. In an FPS game, when a player makes a move, e.g., firing a bullet at an enemy player, the game client on the player's machine includes this move in the next state update message sent to the game server. After processing this message, the server distributes the update to those players who are affected by the move in its next state update messages to clients. The shorter the latency in delivering the state update from the client to the server and then to the affected clients, the more realistic the game play. Many works have studied the traffic characteristics and the performance of FPS games. In general, traffic generated by FPS games has been identified to have small packet sizes, regular inter-arrival intervals, and relatively low bit rate [13, 18].

For the performance of FPS games, researchers have studied both the network layer and the application layer. In particular, the relationship between the network-layer performance and the user-perceived game performance has been examined, and requirements on the network-layer performance for a good gaming experience at the application layer have been established. It was recommended in [7] that players should avoid servers with ping times over 150 ms or packet loss ratios over 3%. Through real user studies, the same authors also found that shooting is greatly affected by latency: with even modest latency (75-100 ms), accuracy and number of kills can be reduced by up to 50%. In addition, they found that users rarely notice performance degradation with packet loss under 5% during a typical network game [7]; however, the effect of loss ratios higher than 5% was not commented on. Note that high loss ratios often occur in a wireless environment. Some of our experimental results showed that the loss ratio could be significantly higher than 5%; we thus include the loss ratio results in our study. The importance of latency is also found in another work, in which an average round-trip time (RTT) of 100 ms between a game client and a game server was suggested [19].

The IEEE 802.11 standard suite includes multiple modulation techniques, all of which use the CSMA/CA media access control (MAC) protocol. Although a new 802.11n standard is being developed which is said to be much faster, the most widely used ones today are the 802.11b and 802.11g standards, which have a maximum raw data rate of 11 Mbps and 54 Mbps, respectively. Due to protocol overhead, the maximum throughput that an application can achieve is typically much lower than these figures [6, 8, 20]. Both 802.11b and 802.11g support the base station (or infrastructure) mode as well as the ad hoc mode. The former assumes the presence of wireless access points when forming a Wi-Fi network, and mobile nodes communicate via these access points; the latter assumes the formation of a Wi-Fi network without any access point, and mobile nodes communicate with each other directly. On a Wi-Fi network, one can set an RTS/CTS threshold value in bytes; when a frame to be sent has a size larger than this threshold, the frame sender first reserves the channel using the RTS/CTS frames before sending the frame. This reduces collisions and thus improves channel efficiency. In practice, however, most deployed Wi-Fi networks choose a large threshold value, essentially disabling the channel reservation function [12]. In our study, the base station mode of 802.11g with RTS/CTS disabled is considered because this setup is commonly seen in most deployed Wi-Fi networks. For security reasons, 802.11 also has an optional encryption standard at the MAC layer called Wired Equivalent Privacy (WEP); each frame can be encrypted before transmission.

In the literature, the support of Wi-Fi networks for MOGs has been studied. In [15], the performance of two FPS games, namely Half-life [3] and Quake 3 [4], was measured on an 802.11b network. The infrastructure mode with RTS/CTS enabled was studied by way of experimentation. They found that 20 Half-life or 10 Quake 3 game players take more than 3.5 Mbps of bandwidth, even though the actual required bandwidth is less than 1 Mbps. Different from that work, in this paper we focus on 802.11g without RTS/CTS and on the impact of various factors on games. In [14], the game traffic performance of FPS games on an 802.11g network was studied; a limited number of two factors was considered. It was found that constant-bit-rate UDP background traffic has a significant impact on game performance. In contrast to that work, we examine more factors and include more realistic background traffic, including both TCP and UDP traffic, namely FTP and video streaming traffic. In [17, 16], a wireless home entertainment scenario in which multiple types of application share a Wi-Fi network was investigated. To adequately support online games, schemes at the MAC layer and transport layer were devised. These were achieved by modifying protocol parameters at these layers. As a result, the game performance was much improved. The performance metric used was delay jitter, while in our study delay is targeted. In addition, simulation was used, while our study is based on experimentation.


3. TEST BED DESCRIPTION

In this section, we describe the test bed, traffic model, and performance metrics that we used for our evaluation.

3.1 Test Bed Description

Our test bed environment, depicted in Fig. 1, consists of 11 machines. These include:
(i) A game server (GS): Pentium III 1.7 GHz CPU, 512 MB RAM.
(ii) Eight game clients (GCs): each with a Pentium III 733 MHz CPU and 256 MB RAM.
(iii) A background traffic server (BS): Pentium III 1.7 GHz CPU, 1 GB RAM.
(iv) A background traffic client (BC): Pentium III 733 MHz CPU, 256 MB RAM.

All 11 machines are installed with the Linux operating system (Fedora 3, Kernel 2.6.9-1.667). Since we used emulated game traffic in our evaluation (see Section 3.2), graphics rendering power on the game clients is not considered in this study. In our test bed, the GCs and the BC are on a wireless network, while the GS and the BS are on a wired network. In particular, the GC machines are equipped with Linksys WMP54G Wireless-G PCI adapters, and the BC machine with a D-Link AirPlus G High-Speed 2.4 GHz DWL-G510 wireless PCI adapter. Both the GC and BC machines are associated with a wireless access point (Cisco Aironet 1200 series, model: AIR-AP1231G-A K9). Note that an advantage of using networking equipment from different vendors is that our test bed might closely reflect a real-life scenario. On the wired network, the GS and the BS are connected with the wireless access point via a U.S. Robotics 8054 router. The entire test bed was set up in an office environment in a one-floor building containing offices and cubicles.

Figure 1: Experimental test bed

The wireless access point (AP) was configured using its default settings, except that access control by MAC addresses on the AP was turned on and SSID broadcasting was disabled. It is worth noting that besides our Wi-Fi network, there are two other Wi-Fi networks in operation in the same building. We observe that their signal strength is comparatively weak (rated 3 to 5 out of 10 as seen from our test bed) compared to that of our Wi-Fi network (rated 10 out of 10). We consider such an environment adequate for our purpose because in a practical wireless gaming environment, the co-existence of multiple Wi-Fi networks is likely, and some (low) degree of interference may be present.

3.2 Traffic Model

In our experiments, there are two classes of traffic on the test bed: game traffic and background traffic.

Game traffic is sent between individual GCs and the GS using UDP. We used a traffic emulator from [13] to generate the game traffic of an FPS game, namely Half-life [3]. Specifically, from the GS to a GC, one packet is sent every 60 ms; the packet size follows a lognormal distribution with an average of 203 bytes and a standard deviation of 0.31 bytes. When there is more than one GC in a game session, at every timeout (i.e., every 60 ms) the GS sends one packet to each GC in a row, following the order in which the GCs joined the session (or initially connected to the GS). From a GC to the GS, on the other hand, one packet is sent every 41.5 ms, whose size follows a normal distribution with parameters (71.57, 6.84) in bytes.
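The sketch below reproduces only the client-to-server side of such a sender, using the packet-size and inter-departure parameters quoted above. It is a simplified illustration, not the emulator of [13]; the server address is hypothetical, and interpreting 0.31 as the sigma of the underlying normal of the lognormal distribution is an assumption:

    # Simplified sender of emulated FPS (Half-life-like) game traffic.
    import math, random, socket, time

    GS_ADDR = ("192.168.1.10", 27015)   # hypothetical game server address/port

    def client_packet_size() -> int:
        # Client->server sizes: normal(71.57, 6.84) bytes.
        return max(1, int(random.gauss(71.57, 6.84)))

    def server_packet_size() -> int:
        # Server->client sizes: lognormal with mean 203 bytes; 0.31 is taken
        # here as the sigma of the underlying normal distribution.
        mu = math.log(203.0) - 0.31 ** 2 / 2.0
        return max(1, int(random.lognormvariate(mu, 0.31)))

    def run_client(duration_s: float = 10.0) -> None:
        # The server side would analogously send server_packet_size() bytes
        # to each client every 60 ms.
        sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
        end = time.time() + duration_s
        while time.time() < end:
            sock.sendto(b"\x00" * client_packet_size(), GS_ADDR)
            time.sleep(0.0415)          # one packet every 41.5 ms

    if __name__ == "__main__":
        run_client()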


There are also two types of background traffic in our experiments: File Transfer Protocol (FTP) traffic and video streaming traffic. Both are popular on today's computer networks. The former is a type of TCP traffic that is elastic in nature and reacts to congestion conditions in the network; the latter is a type of UDP traffic that is more QoS-demanding and does not perform any congestion control during transmission. We used an FTP program provided by Fedora, called ncftpget, to download a large file from the BS to the BC. In our experiments, the FTP session is started before a game session starts and is terminated after the game session is completed, so the FTP session is provisioned to be long-lived.

The video streaming traffic is generated by an emulator that we developed. It sends video frames of various sizes according to the video trace data described in [10]. Multiple video streams are included in our experiments. For each video stream, one video frame is sent every 40 ms. Based on the average bit rate of each trace, the number of video streams is varied to produce various levels of background video traffic. A total of seven video trace files are used in our study, and their bit rates range from 850 Kbps to 1.2 Mbps. In each experiment, all video streams are started within the first 40 ms, within which the actual start times are random. Similar to the FTP traffic, video streaming traffic is sent from the BS to the BC. This is under the assumption that users in a Wi-Fi environment may be more likely to act as content consumers rather than content producers. Our preliminary experimental results also indicated that the game traffic performance is much more significantly affected by the background traffic sent from the BS to the BC than by that sent in the opposite direction. Again, in our experiments, all video sessions are started before the game traffic starts and are terminated after the game session ends.

3.3 Performance Metrics

In our evaluation, the performance metrics of interest are: (i) the average round-trip time, RTT, from a GC to the GS, (ii) the packet loss ratio, LRs2c, for game packets sent from the GS to the GCs, and (iii) the packet loss ratio, LRc2s, for game packets sent from all the GCs to the GS. As described in Section 2, both the latency and loss ratio performance at the network layer may have a major impact on the game performance at the application layer. RTT is collected using the "ping" utility provided by the operating system. To reduce the negative impact of the additional ping traffic, the time interval between consecutive ICMP Echo Request packets of the ping utility is set to 1 s. This is much longer than the 41.5 ms interval between consecutive game packets sent from a GC to the GS. The interval is not too large either, so it can accurately capture the RTT encountered by the game traffic. When there is more than one GC in an experiment, the GC that last joins the game session is used to collect the RTT results. Based on preliminary experiments, the difference in RTT performance experienced by the various GCs was found to be minimal. To calculate LRs2c, two variables are maintained: the total number of game packets sent by the GS to all the GCs in an experiment, Ns, and the total number of game packets received by all the GCs from the GS, Nc. For the latter, each GC records the number of packets received; Nc is obtained as the sum of all these values. Then, LRs2c is calculated as (Ns - Nc)/Ns * 100%. Similarly, to calculate LRc2s, the total number of packets sent by all the GCs, Nc, and the total number of packets received by the GS, Ns, are obtained from the game traffic generator program. LRc2s is then given by (Nc - Ns)/Nc * 100%. However, our preliminary results showed that LRc2s never exceeded 1%; thus, we will not show the results for LRc2s in the rest of this paper.

4. EXPERIMENTS AND RESULTS

In this section, we describe the experiments that we performed and report on the results that we obtained. We first make use of 2^k factorial experimental designs to identify important factors that affect the performance. We then investigate the impact of these important factors using an increased number of levels per factor in the experiments.

4.1 Effect of Distance on Performance

There are a total of five factors in our experiments: (i) NC: the number of GCs in a game session, (ii) Video: bandwidth usage of background video traffic, (iii) FTP: with or without the background FTP traffic, (iv) WEP: enabling or disabling of data encryption, (v) Distance: the distance from the GCs to the AP. To facilitate our investigation of the distance factor, we consider the following two cases. In Case I, we have one game client (NC=1) and we vary the distance between two levels. At distance level 1, the GC is placed in the "experiment room" where the AP is located, and the distance to the AP is around 10 feet. (See Fig. 2 for the layout of our experiment environment.) At distance level 2, the GC is placed in a room (at the top left corner in Fig. 2) that is around 75 feet away from the AP. Case I shows the impact of distance when there is only one game client in the Wi-Fi network. In Case II, we increase the number of game clients to 8 (i.e., NC=8). Similar to Case I, two distance levels are under consideration. At distance level 1, all 8 GCs are located in the experiment room. These GCs form a circle with a diameter of around 10 feet, and the AP is positioned in the center of the circle. There are no physical obstacles between the AP and the GCs. At distance level 2, four GCs remain in the experiment room. The other four GCs are spread out in the office environment (see Fig. 2); one is 25 feet away from the AP, two are 50 feet away, and one is 75 feet away. We use this placement of GCs to model a Wi-Fi gaming environment where players may sit at different locations within the Wi-Fi network. As shown in Fig. 2, there are solid walls and cubicle dividers between these four GCs and the AP. Finally, in both cases, the GS, BS, and BC are located in the experiment room.

As to factors (ii) to (iv), we also select two levels for each of them. The 2^4 factorial designs for Cases I and II are illustrated in Table 1. In particular, for Video, we choose 0 and 22 Mbps, where 22 Mbps represents a heavy load condition on an 802.11g network [6, 8, 20]. Also, when WEP is enabled, a 128-bit shared key encryption is used. Each experiment is performed for a duration of 8 minutes; this length is considered representative of a typical FPS game session [9]. Each experiment is also repeated six times, and 95% confidence intervals are computed. Since our results show that the widths of the confidence intervals are extremely small when compared to the sample means, we only report the sample means of our performance metrics.
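For reference, the percentages of variation explained by each factor, as reported in Tables 2-4 below, follow standard 2^k factorial analysis. The sketch below shows the computation for a 2^2 design with hypothetical response values; the authors' analysis scripts are not published, so this is only an illustration of the method:

    # Minimal 2^2 factorial "allocation of variation" analysis (standard 2^k
    # design methodology; not the authors' actual scripts).
    def allocation_of_variation(y00, y01, y10, y11):
        """y_ab = mean response with factor A at level a and factor B at
        level b (0 = low, 1 = high). Returns the percentage of variation
        explained by A, B and the A*B interaction."""
        # Effects from the sign table of a 2^2 design.
        q_a  = (-y00 - y01 + y10 + y11) / 4.0
        q_b  = (-y00 + y01 - y10 + y11) / 4.0
        q_ab = ( y00 - y01 - y10 + y11) / 4.0
        ss = {"A": 4 * q_a ** 2, "B": 4 * q_b ** 2, "AB": 4 * q_ab ** 2}
        total = sum(ss.values())
        return {k: 100.0 * v / total for k, v in ss.items()}

    # Example with hypothetical RTT means (ms): A = FTP, B = Video load.
    print(allocation_of_variation(y00=5.0, y01=8.0, y10=20.0, y11=18.0))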


Figure 2: Test bed environment layout. [Floor plan showing the experiment room, cubicle separators, solid walls, and the GC locations at approximately 25, 50, and 75 feet from the AP.]

Table 1: 2^4 factorial experimental designs for Cases I and II
Factor          | Case I (NC=1)       | Case II (NC=8)
Video (Mbps)    | [0, 22]             | [0, 22]
FTP             | [with, without]     | [with, without]
WEP             | [enabled, disabled] | [enabled, disabled]
Distance level  | [1, 2]              | [1, 2]

Based on the results obtained from our experiments, we calculate the percentages of variation in the results that are explained by each factor and their interactions. This is summarized in Tables 2 and 3 for Cases I and II, respectively. Note that only those factors and factor interactions whose percentages of variation are larger than 1% are shown in these tables. We find that in both cases, FTP, Video, and their interaction account for most of the variation in the results of RTT and loss ratio. In contrast, distance, WEP, and any factor interaction involving these two factors explain less than 1% of the variation. We conclude that, for the scenarios of interest where distance is varied, the distance and WEP factors do not have a significant impact on the game traffic performance.

Table 2: Percentage of variation explained in Case I (%)
Metric     | FTP   | Video | FTP+Video
RTT        | 86.21 | 5.16  | 7.5
Loss ratio | 65.55 | 30.95 | 2.96

Table 3: Percentage of variation explained in Case II (%)
Metric     | FTP   | Video | FTP+Video
RTT        | 82.87 | 2.23  | 11.25
Loss ratio | 27.75 | 45.00 | 24.53

4.2 Significance of Other Factors on Performance

Based on the above findings, in all subsequent experiments we fix the distance to level 1, meaning that we position all GCs in the experiment room. To learn the significance of the other four factors on performance, we conduct another 2^4 factorial design for these factors. For NC, two levels are used: 1 and 8. The upper bound of 8 is selected mainly due to our resource availability. The levels for Video, FTP, and WEP are the same as shown in Table 1.

After obtaining the experimental results, we calculate the percentages of variation in the performance metrics that are explained by the factors and their interactions. These percentages are shown in Table 4. As before, only the terms larger than 1% are shown in the table. It can be observed that for RTT, only FTP, Video, and their interaction account for more than 1% of the variation. For loss ratio, NC also shows a non-negligible effect. Again, the WEP factor does not have a significant impact on performance. In the following sections, we will focus on FTP, Video and NC and provide a detailed analysis of their impact on performance. In this analysis, we keep the distance at level 1 and enable WEP encryption.

Table 4: Percentage of variation explained (%)
Metric     | FTP   | Video | FTP+Video | NC    | NC+FTP
RTT        | 86.51 | 2.23  | 10.30     | -     | -
Loss ratio | 33.51 | 24.73 | 4.62      | 24.89 | 9.70

4.3 Effect of FTP and Video Traffic

We first examine the effect of FTP traffic. In Fig. 3, we plot the results for RTT against the amount of video streaming traffic, with and without FTP traffic, for the case of NC=1. The corresponding results for loss ratio are shown in Fig. 4. Note that in these results, the highest video load level is 26 Mbps. This is higher than the 22 Mbps used in the 2^4 designs above. The higher levels are selected to illustrate what happens when the wireless channel approaches saturation. In the 2^4 designs, on the other hand, the video load level was kept at 22 Mbps or lower in order to avoid starvation of other traffic on a shared 802.11g network. The conclusions from the 2^4 designs above would remain valid had 26 Mbps been used instead.

We observe that when there is only one GC, the inclusion of FTP traffic greatly lowers the performance of game traffic. With FTP traffic, as the amount of video traffic varies, RTT ranges from 15 to 25 ms; without FTP traffic, RTT is below 10 ms for most video load levels. As to the loss ratio, the largest difference between the cases with and without FTP traffic approaches 3%.

The performance degradation when there is FTP traffic can be explained as follows. FTP traffic is delivered using the TCP protocol at the transport layer. TCP congestion control uses network loss events to infer the congestion level in the network and attempts to explore and use the maximum available bandwidth on the traffic path. When a queue (at a router or an AP) inside the network overflows, loss events occur and congestion control is triggered. This implies that TCP may tend to keep the queue length within the buffer size limit. At the same time, because TCP endeavors to make use of the available bandwidth as much as possible, it tends to keep the bottleneck queue relatively full. In our test bed network, the major queue is the one at the AP along the server-to-client direction. When the video load is low to moderate and there is no FTP traffic, this queue is relatively short, yielding good RTT and loss ratio performance. With FTP traffic, however, the queue is kept relatively full by the FTP traffic. With first-come-first-served (FCFS) scheduling at this queue, the game traffic experiences a much longer queue, compared to the case of no FTP traffic, resulting in worse performance. When the video traffic load is high enough (e.g., above 24 Mbps), even without FTP traffic, the wireless channel approaches saturation and the queue starts to overflow. In this case, the performance is observed to be equally bad regardless of whether there is FTP traffic or not. These observations indicate that QoS strategies may be needed in order to support game traffic well when there is FTP background traffic in the network.


Figure 3: RTT vs. amount of video traffic (NC=1). [Average round-trip time (ms) vs. video stream load (Mbps), with and without FTP; Distance=1, WEP=Yes, one GC.]

Figure 4: Loss ratio vs. amount of video traffic (NC=1). [Loss ratio (%) vs. video stream load (Mbps), with and without FTP; Distance=1, WEP=Yes, one GC.]

The absolute values from these experiments indicate that, using the recommended upper bound of 3% on game traffic loss ratio [7], when there is just one game client, background video should be kept below 22 Mbps with FTP, and below 25 Mbps without FTP.

We next examine the impact of video traffic on the game traffic performance. In Fig. 3, we observe that without FTP traffic, the RTT performance deteriorates as the video traffic load is increased, because the average queue length grows as the load is increased. In contrast, with FTP traffic, the RTT performance actually becomes slightly better as the video traffic load is increased. This may be because, as the video load becomes higher, TCP congestion control causes the FTP traffic to back off more significantly, leading to better RTT performance. For the loss ratio, we observe in Fig. 4 that for both levels of FTP, the higher the video traffic load, the higher the loss ratio. Without FTP traffic, the loss ratio increases sharply between the video load levels of 24 and 26 Mbps; with FTP traffic, this degradation of the loss ratio is more gradual. This is because of the "damping" effect of TCP traffic, which affects the dynamics of the queue length. Specifically, when there is only UDP traffic, there is a sharp increase in queue length as the channel approaches saturation. But with FTP traffic, the queue tends to stay relatively full all the time, and such a sharp increase in queue length at high video traffic load does not occur.

Figure 5: RTT vs. amount of video traffic (NC=8). [Average round-trip time (ms) vs. video stream load (Mbps), with and without FTP; Distance=1, WEP=Yes, eight GCs.]

Figure 6: Loss ratio vs. amount of video traffic (NC=8). [Loss ratio (%) vs. video stream load (Mbps), with and without FTP; Distance=1, WEP=Yes, eight GCs.]

The RTT and loss ratio performance when NC=8 is plotted in Figs. 5 and 6, respectively. Similar to the case of NC=1, both graphs show large performance degradation with FTP traffic. As to the impact of video streaming traffic, without FTP traffic the same trend as in the case of NC=1 is observed in the results for both RTT and loss ratio. With FTP traffic, on the other hand, it is important to note that, despite some fluctuation, the RTT of game traffic stays at around 20 ms regardless of the amount of background video. This may be because the buffer at the AP is almost full nearly all the time. Hence, on average the game traffic always encounters a queueing delay that approximates the time it takes to service a full buffer of data. The loss ratio increases slowly when the amount of video traffic is lower than 20 Mbps. This is due to the fact that TCP congestion control effectively reduces the FTP traffic sending rate so that the loss at the AP buffer is maintained at a low level. However, when the video traffic load is increased above 20 Mbps, the amount of video traffic is so large that even if TCP remains in the slow-start phase (meaning that the FTP traffic is at the minimal sending rate), the channel is already overloaded by the UDP traffic. Therefore, a sharp increase in loss ratio occurs. This observation implies that in order to achieve good game loss ratio performance, the amount of video traffic needs to be controlled.

Our experimental results also indicate that when the number of game clients is 8, with FTP the game traffic loss ratio exceeds 3% irrespective of the video load level; thus, for adequate game support, FTP traffic should be kept to a minimum. Without FTP, the amount of video should be kept below 22 Mbps. We conclude that on our 802.11g network, the inclusion of FTP traffic, which is delivered using the TCP protocol, significantly affects the game traffic performance. Similarly, the amount of video background traffic, which, like the game traffic, is delivered using UDP, also has a large impact on the game traffic performance. To provide a good gaming experience, proper QoS strategies are needed.

4.4 Effect of NC

To evaluate the effect of NC, combinations of video and FTP traffic are considered. For each combination, NC is varied from 1 to 8. Consider first the scenario where there is no FTP traffic. The results for RTT and loss ratio are shown in Figs. 7 and 8, respectively. We observe that NC has a large impact on performance, especially the loss ratio, at heavy video traffic load. This phenomenon may be explained as follows. In a game session, at regular time intervals, the server sends state update messages back-to-back to all GCs. These updates form a burst (a train of packets). The higher the number of GCs, the longer the burst. When the video traffic load is heavy, the queue at the AP is almost full, and a longer burst results in higher loss of game packets.

Figure 7: RTT vs. number of GCs (no FTP). [Average round-trip time (ms) vs. number of game clients at video loads of 16-26 Mbps; Distance=1, WEP=Yes.]

Figure 8: Loss ratio vs. number of GCs (no FTP). [Loss ratio (%) vs. number of game clients at video loads of 16-26 Mbps; Distance=1, WEP=Yes.]

Consider next the presence of FTP traffic. The RTT and loss ratio performance is shown in Figs. 9 and 10, respectively. It can be observed that when there is background FTP traffic, the RTT of game traffic is kept at a high level regardless of the number of GCs. This is because, with FTP traffic, the queue is kept at an "almost full" level by TCP congestion control. With FCFS channel scheduling, the queueing delay encountered by the game traffic is about the same, corresponding to the time it takes to empty (or serve) an almost-full queue. In contrast, the number of GCs does have a large impact on the loss ratio performance of game traffic. The more GCs, the higher the loss ratio. This can be explained by the relationship between the loss ratio and the state update burst size, as described previously for the case of no FTP traffic. In particular, with FTP traffic, the buffer at the AP is kept relatively full. When NC is increased, a longer burst of game packets arrives at the queue, resulting in a higher chance of buffer overflow.

We conclude that the number of GCs largely affects the loss ratio performance experienced by game traffic. When there is no FTP traffic, this holds only when the video traffic load is high. With FTP traffic, this holds for all video load levels. For the RTT performance, when there is no FTP traffic, the number of GCs has some impact when the video traffic load is high. However, the RTT results always stay in the range of 15 to 25 ms when FTP traffic is present.
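To make the burst-size argument concrete, the toy model below (an illustration only, not the authors' analysis, with arbitrary buffer parameters) estimates the fraction of game packets lost when a back-to-back burst arrives at a buffer that background traffic keeps nearly full; longer bursts, i.e., more game clients, see proportionally more drops:

    # Toy illustration: loss of a game-packet burst at a nearly full FIFO buffer.
    import random

    def burst_loss(burst_len, buf_size=50, occupancy=0.95, trials=10000):
        """Fraction of game packets dropped when a back-to-back burst of
        'burst_len' packets arrives at a buffer that background traffic
        keeps, on average, 'occupancy' full."""
        dropped = sent = 0
        for _ in range(trials):
            occupied = int(random.gauss(occupancy * buf_size, 2))
            free = max(0, min(buf_size, buf_size - occupied))
            dropped += max(0, burst_len - free)
            sent += burst_len
        return dropped / sent

    for n in (1, 2, 4, 8):   # one packet per game client in the burst
        print(n, round(burst_loss(n), 3))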


Figure 9: RTT vs. number of GCs (with FTP). [Average round-trip time (ms) vs. number of game clients for FTP only and for FTP combined with 4-24 Mbps of background video; Distance=1, WEP=Yes.]

Figure 10: Loss ratio vs. number of GCs (with FTP). [Loss ratio (%) vs. number of game clients for the same background traffic combinations; Distance=1, WEP=Yes.]

5. QOS STRATEGIES IN A WI-FI GAMING ENVIRONMENT

In this section, we discuss possible QoS strategies that one may devise to better support games in a Wi-Fi environment that may be shared by FTP and video background traffic. Assuming no change is to be made at the transport layer, i.e., keeping TCP and UDP as they are, we discuss strategies at the application layer and inside the network respectively.

At the application layer, a possible strategy is to throttle both FTP and video background traffic when game sessions are on-going. This may imply certain restrictions on the applications that other wireless clients sharing the same Wi-Fi network with the game clients can run. From our study, we have observed that both FTP and video traffic significantly affect the game traffic performance, especially when the aggregated load level is high; therefore, throttling this traffic may leave ample bandwidth on a Wi-Fi network for the game traffic and better support the game sessions.

Inside a Wi-Fi network, QoS strategies refer to the queue management and channel scheduling algorithms inside the AP at the MAC layer. A possible strategy is to employ priority-based scheduling in which a higher priority is given to the game traffic. In terms of implementation, a separate queue may be maintained for the game traffic, while the other types of traffic share another queue. Whenever the channel becomes idle, the channel scheduler serves the game traffic queue if it is non-empty; otherwise, the other queue is attended to. Within each of these two queues, FCFS scheduling may suffice. Prioritizing game traffic may pose a problem when the total bandwidth consumed by the game traffic is high, because this may lead to the game traffic starving the other traffic. However, FPS games have been identified to have a low bit rate, so this strategy can be viable with moderate game traffic load. In addition, this strategy may require cross-layer processing in order to identify the game traffic carried within MAC-layer frames. Evaluating the performance implications of this overhead is a topic of further research.

Comparing the above two strategies, the one at the application layer does not require any modification at the AP; the downside is that it may result in low network utilization. The strategy at the MAC layer, on the other hand, requires modification at the AP, but may yield higher network utilization. Further investigation of these two strategies is part of our future work.

6. CONCLUSIONS

In this paper, using an experimental approach, we have evaluated the impact of various factors, namely the distance between a game client and the wireless access point, the enabling of data encryption, the inclusion of FTP and video streaming background traffic, and the number of game clients, on the game traffic performance in terms of RTT and loss ratio on an IEEE 802.11g network. We have found that FTP and video streaming traffic significantly affect the performance. The number of game clients is important when the aggregated background traffic load is high; in particular, it has a higher impact on loss ratio than on RTT. Comparatively, distance and WEP have minimal effect. Finally, based on our observations, we have suggested two QoS strategies, at the application and MAC layers, that may be used to improve the support of a Wi-Fi network for a MOG. As suggestions for future research, proper pricing schemes for QoS strategies in a Wi-Fi gaming environment can be studied. Furthermore, experimentation that links network-layer performance and user-level performance, such as user satisfaction, can be carried out.

7. ACKNOWLEDGMENTS

We gratefully acknowledge TRLabs Winnipeg and the Department of Computer Science at the University of Manitoba for their support of this project. We also thank the anonymous reviewers for providing very insightful feedback and suggestions, which helped improve the quality of our paper. Many thanks to John Tromp, who helped with polishing the graphs and some text.

8. REFERENCES

[1] Counter-strike. http://www.counter-strike.net/.
[2] First-person shooter. http://en.wikipedia.org/wiki/First-person_shooter.
[3] Half-life. http://half-life.sierra.com/.


[4] Quake 3. http://www.idsoftware.com/games/quake/quake3-gold/.
[5] Unreal Tournament. http://www.unrealtournament.com.
[6] Atheros Communications. Methodology for testing wireless LAN performance with Chariot. http://www.atheros.com/.
[7] T. Beigbeder, R. Coughlan, C. Lusher, J. Plunkett, E. Agu, and M. Claypool. The effects of loss and latency on user performance in Unreal Tournament 2003. In Proceedings of the 2004 ACM SIGCOMM Workshop on NetGames (NetGames 2004), Aug. 2004.
[8] X. Cao, G. Bai, and C. Williamson. Media streaming performance in a portable wireless classroom network. In Proceedings of the IASTED European Workshop on Internet Multimedia Systems and Applications (EuroIMSA), pages 246-252, Feb. 2005.
[9] M. Claypool, D. LaPoint, and J. Winslow. Network analysis of Counter-strike and Starcraft. In Proceedings of the 22nd IEEE International Performance, Computing, and Communications Conference, Apr. 2003.
[10] F. H. Fitzek and M. Reisslein. MPEG-4 and H.263 video traces for network performance evaluation. http://www-tkn.ee.tu-berlin.de/research/trace/ltvt.html.
[11] IEEE-SA Standards Board. ANSI/IEEE Standard 802.11. IEEE, 1999.
[12] J. F. Kurose and K. W. Ross. Computer Networking: A Top-Down Approach Featuring the Internet. Addison Wesley, 3rd edition, 2005.
[13] T. Lang, G. Armitage, P. Branch, and H. Choo. A synthetic traffic model for Half-life. In Proceedings of the Australian Network and Telecommunications Conference 2003, 2003.
[14] Y. E. Liu, J. Wang, M. Kwok, J. Diamond, and M. Toulouse. Capability of IEEE 802.11g networks in supporting multi-player online games. In Proceedings of the 2nd IEEE International Workshop on Networking Issues in Multimedia Entertainment (NIME 2006), Jan. 2006.
[15] T. Nguyen and G. Armitage. Quantitative assessment of IP service quality in 802.11b and DOCSIS networks. In Proceedings of the Australian Network and Telecommunications Conference, 2004.
[16] C. E. Palazzi, G. Pau, M. Roccetti, S. Ferretti, and M. Gerla. Wireless home entertainment center: Reducing last hop delays for real-time applications. In Proceedings of the ACM SIGCHI International Conference on Advances in Computer Entertainment Technology (ACE 2006). ACM Press, June 2006.
[17] C. E. Palazzi, G. Pau, M. Roccetti, and M. Gerla. In-home online entertainment: Analyzing the impact of the wireless MAC-transport protocols interference. In Proceedings of the IEEE International Conference on Wireless Networks, Communications, and Mobile Computing (WIRELESSCOM 2005), June 2005.
[18] H. Park and T. Kim. Network game traffic modeling. In Proceedings of the 1st Workshop on Internet and Network Economics (WINE 2005), Dec. 2005.
[19] P. Quax, P. Monsieurs, W. Lamotte, D. De Vleeschauwer, and N. Degrande. Objective and subjective evaluation of the influence of small amounts of delay and jitter on a recent first person shooter game. In Proceedings of the 2004 ACM SIGCOMM Workshop on NetGames (NetGames 2004), Aug. 2004.
[20] A. Wijesinha, Y. Song, M. Krishnan, V. Mathur, J. Ahn, and V. Shyamasundar. Throughput measurement for UDP traffic in an IEEE 802.11g WLAN. In Proceedings of the 6th International Conference on Software Engineering, Artificial Intelligence, Networking and Parallel/Distributed Computing, and First ACIS International Workshop on Self-Assembling Wireless Networks, pages 220-225, May 2005.

49

Preliminary Experiments with Vision-Based Interface for Games Control

Abdennour El Rhalibi, Carl Peters, Madjid Merabti
School of Computing and Mathematical Sciences
Liverpool John Moores University, UK
{a.elrhalibi; c.peters; m.merabti}@ljmu.ac.uk

ABSTRACT
In this paper we investigate the technical requirements of a vision-based interface for game control. We present and use some of the techniques which have been developed in computer vision, image processing and vision-based perceptual user interfaces. We implement several common image processing techniques and evaluate their potential, in terms of performance and usability, for game control.

Categories and Subject Descriptors
H.5.2 [Information Interfaces and Presentation (e.g., HCI)]: User Interfaces - input devices and strategies.

General Terms
Algorithms, Measurement, Performance, Design, Experimentation, Human Factors.

Keywords
HCI, perceptual user interface, webcam, glove, motion tracking, coloured sticker, colour space, filters, video games.

1. INTRODUCTION
Ever since people started to play video games, the interfaces have gradually improved with each new console generation, while PC users have mostly relied on a keyboard-and-mouse configuration. As games demand greater depth, game controllers are usually improved by adding more buttons and a greater variety of triggers, track wheels and analogue joysticks. With the up-and-coming generation of games consoles, however, many people believe that controller complexity has reached critical mass: the Sony PS2 and Microsoft Xbox controllers, for example, have 13 different buttons (including triggers, pressure-sensitive shoulder and face buttons), eight-way directional pads and two analogue sticks.

With this bewildering selection of keys, even experienced players can make mistakes. In an attempt to attract more potential players and to reinvigorate interest in the video games industry, much has been done to introduce new forms of interface to the home games market, such as the wireless Nintendo Wii controller and the highly successful Sony EyeToy webcam games. Vision-based interfaces could have great potential for PC games as well.

This paper describes research into computer vision systems with the aim of providing players with a natural and fun way to interact with PC computer games. Computer vision data is gathered by a single digital camera, and by developing such a system, computer vision techniques are researched, implemented and evaluated with regard to their suitability for game control. Base computer vision techniques such as colour tracking, background removal, motion tracking and edge detection are put to use and then built on using filters, sub-window optimisation and other improvements. By implementing different techniques, we can test the suitability and possible use of each technique for improving the control of computer games, in terms of performance and usability.

The structure of the paper is as follows: in Section 2 we review the state of the art in vision-based interface techniques, in Section 3 we present the image processing techniques we have used, in Section 4 we introduce the system and application designed to interact within a game world, in Section 5 we present our experiments and an evaluation of the algorithms implemented, and in Section 6 we conclude and discuss possible future work.

2. RELATED WORK
The areas related to our investigation are image processing, computer vision and vision-based perceptual user interfaces. Much research has been done in these areas; important papers include [1][2][3][14]. These areas commonly include segmentation techniques that allow the extraction of part of an image: foreground, background, or an object with a defined colour or shape.

There has been increasing interest, in particular, in segmentation for human pose and gesture recognition. Horprasert et al. [2] propose a foreground segmentation algorithm based on background subtraction in which the background is modelled as a Gaussian. Friedman and Russell [3] propose an image segmentation algorithm for traffic video sequences using the Expectation Maximization algorithm. W4 [4] combines shape analysis and robust techniques to detect people, and to locate and track their body parts (head, hands, feet, torso) in an outdoor environment. W4S [5] uses structure from motion to improve upon W4 and performs detection and tracking of people in 2.5D.

KidsRoom [6][7] is a tracking system based on "closed-world regions". The work by Leung and Yang [8] describes a vision system for labelling human body parts. Pfinder [9] is a real-time system for tracking a person, using a multi-class statistical model of colour and shape to segment a person from a background scene. Spfinder [10] is a recent extension of Pfinder in which a wide-baseline stereo camera is used to obtain 3-D models; it has been used in a smaller desk-area environment.

Most of these systems require expensive equipment, and are not always real-time in the sense of [10][11][13]. We review in the next section the techniques required to implement such a segmentation system.

3. IMAGE PROCESSING-BASED TECHNIQUES
Many image processing techniques have been developed [1][2] which can be used to build the camera-based framework. There are many issues to take care of when dealing with vision-based systems. To enable efficient interaction, the system must be able to identify accurately the medium used for the interaction, be it the hand position, a colour, or the edge of an object or the body. These issues concern the background, the luminosity, the range of colours to detect, the environment, the image artefacts and the precision with which we need to detect the interaction medium. The techniques we have developed include software to process a video stream, covering colour thresholding, colour space conversion, background removal, colour and edge detection, noise removal and sub-window optimisation.

3.1 Colour Thresholding
In many vision applications, it is useful to be able to separate the regions of the image corresponding to the objects or characters of interest from the regions that correspond to background. Thresholding provides an easy and convenient way to perform this segmentation on the basis of the different intensities or colours in the foreground and background regions of an image. In addition, it is often useful to be able to see which areas of an image consist of pixels whose values lie within a specified range, or band, of intensities (or colours); thresholding can be used for this as well.

The input to a thresholding operation is typically a greyscale or colour image. In the simplest implementation, the output is a binary image representing the segmentation: black pixels correspond to background and white pixels correspond to foreground (or vice versa, or using different colours). In simple implementations, the segmentation is determined by a single parameter known as the intensity threshold. In a single pass, each pixel in the image is compared with this threshold; if the pixel's intensity is higher than the threshold, the pixel is set to, say, red in the output, and if it is lower it is set to black.

In more sophisticated implementations, multiple thresholds can be specified, so that a band of intensity values can be set to red while everything else is set to black. For colour or multi-spectral images, it may be possible to set a different threshold for each colour channel, and so select just those pixels within a specified cuboid in RGB space. Another common variant is to set to black all those pixels corresponding to background, but leave foreground pixels at their original colour/intensity (as opposed to forcing them to white), so that information is not lost.

Given an image of a player, we must know which pixels correspond to a specific colour, for example a colour marker or the player's skin (hand or face), and which belong to the background or clothes. This is useful for tracking the position, the orientation, or the motion of a person's head or hands. In this approach, we select a training sample of the colour (skin or marker), to which we fit a Gaussian distribution [14]. We use this distribution to predict whether other pixels correspond to this colour, and we also try to predict whether pixels belong to the background. The following figures show samples of the techniques we have implemented. Thresholding can be applied to small markers, as depicted in Figure 1, to large markers, or to skin colour (e.g. when using hand/face gestures).

Figure 1: Colour Thresholding

To improve performance and reduce the size of the image processed, thresholding can be combined with background removal and sub-window optimisation, enabling the system to track the control medium first, and then the marker. Figure 2 shows a stage of this process.

Figure 2: Sub-window optimisation

In Figure 3, colour thresholding is in operation tracking three colours at the same time. Notice how care has been taken to define each colour, but some false negatives are present on each marker and some false positives are detected for the red marker. Solutions to this problem can be found by applying filters.

Figure 3: Colour tracking in operation, following three colours at the same time.

In Figure 4, we depict an example of edge detection. The threshold has been set to a low value to get a strong constant edge. Notice how some small pixels stand alone with nothing surrounding them. By lowering the threshold limit, lines become more definite, but these rogue pixels become more common. With a low threshold, edge detection will pick up changes in colour and material, while large thresholds cause the system to detect only more definite changes.

Figure 4: Edge Detection with Threshold

Figure 5 shows the same scene as Figure 4 using a higher threshold to detect only the most obvious edges, and then searching for finer details using image hysteresis [14]. Notice how hysteresis joins everything together, meaning no pixels are left floating alone in space.

Figure 5: Edge Detection with Threshold and Hysteresis
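To make the banded colour test described above concrete, the sketch below (our illustration, not the authors' code; written in Python with NumPy, with an arbitrary marker colour band) builds a binary mask by keeping only the pixels whose channels all fall inside a per-channel band:

    import numpy as np

    def colour_threshold(frame, lower, upper):
        """Binary mask of pixels whose (R, G, B) values all lie inside the band [lower, upper]."""
        lower = np.asarray(lower, dtype=np.uint8).reshape(1, 1, 3)
        upper = np.asarray(upper, dtype=np.uint8).reshape(1, 1, 3)
        inside = np.all((frame >= lower) & (frame <= upper), axis=-1)
        return (inside * 255).astype(np.uint8)      # white = candidate marker pixels

    # Stand-in for a 160x120 webcam frame; the band below is a hypothetical "red sticker" range.
    frame = np.random.randint(0, 256, (120, 160, 3), dtype=np.uint8)
    mask = colour_threshold(frame, lower=(150, 0, 0), upper=(255, 80, 80))
    print(mask.shape, mask.dtype, int(mask.max()))

The per-channel band corresponds to the "cuboid in RGB space" mentioned above; a looser or tighter band trades false positives against false negatives, which is exactly the effect visible in Figure 3.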

3.2 Colour Space Transformation
Our approach involves thresholding in a three-dimensional colour space. Several colour spaces are available, including Hue Saturation Intensity (HSI, a.k.a. HLS), YUV and Red Green Blue (RGB). The choice of colour space for thresholding depends on several factors, including lighting conditions and utility for the particular application.

RGB is a familiar colour space often used in image processing, but it suffers from an important drawback for many vision applications: RGB values change with the lighting conditions. An RGB threshold is therefore not able to track colours reliably in changing light, although it does allow most of the useless pixels to be eliminated from the images before further processing. A further method is thus to threshold in HLS, by transforming the RGB colour space representation into HLS (Hue, Luminosity and Saturation). The main advantage of the HLS representation is that an object of a given colour can undergo a large amount of bleaching and brightness change (corresponding to variations in Saturation and Luminosity) as a result of changing lighting conditions, with only a small change in the Hue of that colour, making thresholding more robust to lighting conditions. Another very effective operation is to transform the colour image to greyscale, as seen in Figure 8, but the reduction in image quality caused by reducing the number of colours can generate more and more "blind spots" as distinctions between the player and the environment are lost.

3.3 Background Removal
A very common technique used in image processing to extract the foreground from the background of an image is background removal. Before the beginning of the experiment, a video sequence of the empty background is recorded to obtain a mean image of the background. This background image is later used in background subtraction on every single frame for foreground extraction, as seen in Figure 6, e.g. of a hand. Binary images of the hands are then obtained after proper thresholding to eliminate noise.

Figure 6: Background Removal Principle [14]

Figure 7: Background Removal Example

Background removal is able to track objects/players even within challenging environments far removed from a simple blue screen. The break-up in background removal beneath the pen (in Figure 7) is caused by slight shadowing from the pen, as seen in the small preview window.

Figure 8: The right side of this image has been converted to greyscale and is able to track the hand and face only. The left side remains colour and has many false positives.

3.4 Filters
Filters are mainly used to suppress either the high frequencies in the image (i.e. smoothing the image) or the low frequencies (i.e. enhancing or detecting edges in the image). An image can be filtered either in the frequency or in the spatial domain. The former involves transforming the image into the frequency domain, multiplying it by the frequency filter function and re-transforming the result into the spatial domain; the filter function is shaped so as to attenuate some frequencies and enhance others. Common filtering techniques include the following [14]; a brief illustrative sketch follows the list:
- Mean Filter - noise reduction (NR) using the mean of the neighbourhood.
- Median Filter - NR using the median of the neighbourhood.
- Gaussian Smoothing - NR using convolution with a Gaussian smoothing kernel.
- Conservative Smoothing - NR using the maximum and minimum of the neighbourhood.
- Crimmins Speckle Removal - a more complex NR operator.
- Frequency Filters - high- and low-pass image filters, etc.
- Laplacian/Laplacian of Gaussian Filter - edge detection filter.
- Unsharp Filter - edge enhancement filter.
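As a small illustration of the spatial-domain smoothing filters listed above (our sketch, not the authors' implementation; it assumes SciPy is available), mean and median filtering over a square neighbourhood can be written as:

    import numpy as np
    from scipy import ndimage

    def denoise(grey, method="median", size=3):
        """Apply a simple spatial-domain noise-reduction filter over a size x size neighbourhood."""
        if method == "mean":
            return ndimage.uniform_filter(grey, size=size)   # average of the neighbourhood
        if method == "median":
            return ndimage.median_filter(grey, size=size)    # median of the neighbourhood
        raise ValueError("unknown filter: " + method)

    # Noisy stand-in for a greyscale webcam frame.
    grey = np.random.randint(0, 256, (120, 160)).astype(np.uint8)
    smoothed_mean = denoise(grey, "mean", 3)
    smoothed_median = denoise(grey, "median", 5)

The median filter preserves edges better but is costlier than the mean filter, which matches the timing gap reported later in Section 5 (Figure 16).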

3.5 Feature Detection
The operators introduced here aim to identify meaningful image features on the basis of the distribution of pixel grey levels. The two categories of operators included here are:
- Edge Pixel Detectors, which assign a value to a pixel in proportion to the likelihood that the pixel is part of an image edge (i.e. a pixel that is on the boundary between two regions of different intensity values).
- Line Pixel Detectors, which assign a value to a pixel in proportion to the likelihood that the pixel is part of an image line (i.e. a dark narrow region bounded on both sides by lighter regions, or vice versa).

Detectors for other features can be defined, such as circular arc detectors in intensity images (or even more general detectors, as in the generalized Hough transform), or planar point detectors in range images, etc.

Different feature detection approaches might be used [14]:
- Roberts Cross Edge Detector - 2x2 gradient edge detector.
- Sobel Edge Detector - 3x3 gradient edge detector.
- Canny Edge Detector - non-maximal suppression of the local gradient magnitude.
- Compass Edge Detector - 3x3 gradient edge detectors.
- Zero Crossing Detector - edge detector using the Laplacian of Gaussian operator.

Figure 9: Edge detection results with a 5x5 median filter (right) and on an unfiltered image (left).

3.6 Motion Detection
The basic idea of the system is to compare two video frames from the same scene, one taken now and one taken in the previous frame cycle. The pixel data from these two images is compared, and any difference can be considered movement within the scene. The process of detecting motion requires the acquisition of an image, its comparison with a previous image, and then its temporary storage for a second comparison with a new image, which starts the cycle again. A game would require these tests to be made ten to twenty-five times per second to enable the program or the hardware to capture and process images fast enough to track everything as closely as possible to the player's current and most recent actions; otherwise a delayed past image may be used.

Figure 10: Motion detection in operation. The system draws black pixels in areas of motion. As the reference image updates, these previous motion areas fade away.
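A minimal frame-differencing sketch of the motion detection just described (our illustration; the threshold and frame sizes are arbitrary) keeps a reference frame and marks pixels whose change exceeds a threshold:

    import numpy as np

    def detect_motion(previous, current, threshold=25):
        """Return a binary motion mask: pixels whose greyscale change exceeds `threshold`."""
        diff = np.abs(current.astype(np.int16) - previous.astype(np.int16))
        return (diff > threshold).astype(np.uint8) * 255

    # Two consecutive greyscale frames (stand-ins for webcam captures).
    prev_frame = np.random.randint(0, 256, (120, 160), dtype=np.uint8)
    curr_frame = prev_frame.copy()
    curr_frame[40:80, 60:100] = 255          # simulate a bright object entering the scene
    motion = detect_motion(prev_frame, curr_frame)
    prev_frame = curr_frame                  # the current frame becomes the next reference

In a game loop this comparison must fit inside the per-frame budget discussed in Section 5 (roughly 33 ms at 30 fps), so the mask is usually computed at a reduced resolution.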

4. USER GAME INTERACTION
Former game pads, joysticks and even basic mice are all based on the same principles. First, a dedicated function controls an avatar or a cursor on the screen: on a game pad there is a multi-directional pad which, once pressed, allows the user to move right, left, up, down or in combined directions, or simply to move something on the screen; with a joystick, instead of the directional pad, you move the stick in the desired direction; and similarly, you move the mouse to, for example, move the cursor on the screen. Then, all of these devices have buttons. Nowadays game pads, joysticks and mice have many buttons, but a few years ago there were usually only two; in games they were often used to trigger an event such as jumping or firing. We have tested a similar system using a glove and coloured stickers to interact with games. The system works as follows: a red sticker, which is meant to be always visible to the camera, is placed on the inner side of the hand below the little finger (see Figure 11). A blue and a green sticker are also placed on the tips of the index and middle fingers; they are used to emulate the left and right clicks of a mouse, as those functions are required in most computer games.

Figure 11: Glove with Coloured Markers

Considering the limitations of a mouse emulator using a glove and coloured markers, another system, based on hand tracking, was developed. Specific hand features in the image, such as colour, edges or position, are used to interact with game objects, to change their position or to trigger game events.

Figure 12: Simple object manipulation using computer vision

Figure 12 and Figure 13 illustrate some possibilities. In Figure 12 a cone has been used to test rotations, which can be achieved by tilting your hand. Background removal (black pixels) and colour tracking have been used to reduce the chance of false positives in a colourful environment (only the colours within the player's area will be tracked). In Figure 13, we show a version of a puppet game demonstrating its ability to track colours and bend arms; the arms are made from cones which point at the target. Both approaches rely heavily on image processing techniques such as background removal, colour thresholding, filtering and edge detection.

Figure 13: An early version of the puppet game demonstrating its ability to track colours and bend arms. The arms are made from cones which point at the target.

We investigate whether computer vision can be applied effectively in the field of computer games. Concepts such as segmentation, face detection, shape recognition and gesture recognition might all be put to use in the development of a game. A background removal algorithm able to detect foreground objects against a static background scene handles the foreground/background segmentation. Using certain colour space techniques, shading and shadows in colour images are also detected and eliminated. Furthermore, image enhancement methods such as filtering operations are applied to remove noise and false detections. Besides the background subtraction engine, colour segmentation is also employed at the same time for the purpose of hand detection. The tracked regions are then passed through another statistical filter to obtain optimal estimates for the coordinates of the hand. After the above stages of segmentation, shape recognition might be carried out through the calculation of moments. Finally, gesture recognition might be applied to hand/head movements based on patterns of motion.

5. EXPERIMENTS AND EVALUATION
The experiments and evaluation focus on performance and on HCI (Human Computer Interaction) while playing simple computer games using the application. The table and charts below show the timing results for each of the main steps the application goes through to capture and analyse the frames, and consider different uses of the software. The test configuration is a Shuttle SB75G2 with 1GB of PC3200 RAM, a 2.8GHz HyperThreading Intel Pentium 4 processor and a GeForce 6600GT graphics card. The webcam used to capture the 160x120 frames is a Logitech QuickCam Pro 4000.

5.1 Test with Glove and Marker
Table 1, column (1), shows the approximate average time in milliseconds it takes for the callback function to go through each pixel in the buffer and copy it to memory. Although the frame analysis is not actually started, we still need to capture the video stream in order to display it later on; therefore the callback function, which is also known as the overall frame analysis function, is still called. Copying the captured frame pixel by pixel into memory takes approximately 6.50 milliseconds.

To reach 30 frames per second, the maximum time allowed for each cycle is 1/30 = 33.33 milliseconds, so the current results for the overall frame analysis are quite good. A few functions, like displaying the video stream or moving the mouse, are not shown here; they are not part of the main functions, so we neglect them as they do not really affect the final results. However, the function which triggers the callback and makes the webcam fill the buffer with the captured frame acts directly on the final results.

Table 1, column (2), shows the same function but this time with the pixel analysis activated. In the previous case it did not matter whether something or someone was moving in front of the video camera, and the overall frame analysis duration was always approximately the same. This time it does matter, and column (2) shows timing results when absolutely nothing moves in the camera's field of view, there is no change in lighting conditions, and almost nothing, in theory, can affect the frame.

Table 1: Timing results (msec) of the main processing steps under the four test conditions (1)-(4).

    Process                   (1)      (2)      (3)      (4)
    Overall frame analysis    13.20    23.00    44.00    23.50
      Sub-Window              n/a      0.10     0.20     0.10
      Pixel capture           6.50     6.50     7.10     6.70
      Background removal      n/a      0.55     19.50    5.70
      Pixel analysis          n/a      0.10     5.10     1.50
      RGB to HLS conversion   n/a      0.35     14.50    3.15
      Pixel contour test      n/a      0.00     0.50     0.18

If anyone passes into the camera's field of view without the glove, the timing results of the background removal algorithm and of the pixel analysis increase significantly, as shown in Table 1, column (3). This column is just an example, as the results may vary considerably according to how the objects or person entered the camera's field of view.

In this example the overall frame analysis takes 44 milliseconds, but values above 60 milliseconds have been observed. If we add these values to the time required by the webcam to fill the buffer with the current frame, we are far below 30 frames per second. The reason it is so slow does not lie in the RGB tests, which are fast; rather, the RGB to HLS conversions present in both the background removal algorithm and the pixel analysis function are the main cause of these poor results. Simply removing the HLS conversion and test from the background removal algorithm allows the overall frame analysis to stay under the threshold of 33 milliseconds in the same conditions.

As observed earlier, RGB to HLS conversion is very time consuming, and a system working directly with HLS values, thus avoiding conversions, would be faster. The only point to consider is whether it impacts game performance while using the system; so far, no significant impact on the performance of either the application or games running at the same time has been observed.

Table 1, column (4), shows the timing results under normal conditions of use of the software, with the glove, and with the RGB to HLS conversion activated in the background removal algorithm. This time the red sticker is detected and the sub-window is placed around the glove, so instead of analysing the entire 160x120 image only a rectangle of 80x64 pixels is analysed (a sketch of this cropping step is given at the end of this subsection). The observed performance is much better and the overall frame analysis duration is rarely above 25 milliseconds. If we deactivate the RGB to HLS conversion in the background removal algorithm, the overall frame analysis hardly reaches 19 milliseconds. Of course, it would be better not to use the RGB to HLS conversion at all, but it is necessary in the pixel analysis algorithm; fortunately, the number of tests preceding the actual conversion prevents its over-use. In the background removal algorithm, however, the use of the RGB to HLS conversion can hardly be reduced. It is possible to optimize this part of the program by pre-storing the HLS data of the background, which should almost halve the timing of this algorithm. The easiest solution would be to remove the RGB to HLS conversion from the background removal algorithm altogether: if the background does not risk interfering with the system there is no problem in doing so, but if it does, the RGB to HLS conversion followed by the HLS threshold test prevents many artefacts from appearing and disturbing the system, which also means more comfort for the user.

If we compare these results with those of Driscoll [13], who performed similar experiments, we immediately notice that the performance is much better for images of approximately the same size, especially considering that Driscoll [13] did not use an RGB to HLS conversion algorithm in his tests, as it was not possible to make it work in real time when his work was developed. Considering the techniques involved in the experiments shown in column (4), the results are very satisfactory for such an approach to be used in a game running at 30 fps.
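The sub-window optimisation referred to in column (4) can be sketched as follows (our illustration, not the authors' code): once the red marker has been located, only a small rectangle around it is analysed on the next frame.

    import numpy as np

    def sub_window(frame, centre, half_w=40, half_h=32):
        """Crop a rectangle around `centre` (x, y), clipped to the frame borders.
        With the default sizes this reduces a 160x120 frame to an 80x64 window."""
        h, w = frame.shape[:2]
        cx, cy = centre
        x0, x1 = max(cx - half_w, 0), min(cx + half_w, w)
        y0, y1 = max(cy - half_h, 0), min(cy + half_h, h)
        return frame[y0:y1, x0:x1], (x0, y0)

    frame = np.random.randint(0, 256, (120, 160, 3), dtype=np.uint8)
    marker_centre = (100, 60)                 # e.g. centroid of the red-sticker mask
    window, offset = sub_window(frame, marker_centre)
    # Per-pixel analysis now runs on `window`; add `offset` back to convert
    # window coordinates into full-frame coordinates.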

5.2 Test with Hand and Body Motion
Although we have focused on developing the most accurate tracking method possible for the glove and coloured markers, they are of limited use for enabling interaction within a game. We have therefore developed techniques to track hand and body motion. This section describes some of the timing tests made to get an accurate idea of how much time each method takes and to find areas of possible improvement for the techniques developed in this context.

5.2.1 System timing results
The following timing results have been taken using the same hardware and software as for the glove and coloured marker tracking. Any specific configurations will be noted along with the timing results; for example, we run tests using different resolutions, camera compression settings and different filters.

Because we are testing several different computer vision methods, each with obvious advantages and disadvantages in different situations, testing each system fairly is a problem. To make the tests as fair as possible, we created a special version of the program which runs edge detection, background removal, colour tracking and motion detection concurrently on the same subject matter. The tests are conducted in a typical unprepared home environment with a player in the scene, as would be the case during normal usage; this means each system has the same data to work from and the chosen environments are representative of intended usage. During the tests, the colour tracking systems are calibrated with an automatic calibration system, while the other systems are calibrated by hand to obtain the best tracking level within the chosen environment.

During the first attempt to complete this test, we realised that the time spent to complete each tracking system is not constant. For example, looking at the process for the background removal system, if one pixel passes the red test on one run but fails the green, and then on the next run it fails the red, a slight fluctuation in completion time will be detected; because each image is slightly different and because so many pixels are tested, this fluctuation can add up. We also found that the initial system, which simply output the completion time of each task, was not adequate, because the values arrive too quickly and are too long to read. To solve this problem we altered the test program and started again, with the goal of obtaining the average completion time for each system. On the other hand, this mistake led us to notice an issue with the system which we had never considered before: when a threshold is set for colour tracking, background removal or motion detection, the variable time of the testing phase can increase or decrease depending on the size of the threshold. For example, using a low background removal threshold can cause completion times to drop to ~0.001 seconds at 160x120 resolution, but a large threshold value can bring the time up to ~0.004 seconds. In contrast, the completion time of the edge detection system remains relatively constant, because it has to make the same number of tests every single time. Although this is a small amount, it should not be forgotten that this is just an interface method, and time lost through wasted thresholds could have been used for something more important, including the ability to use higher resolution camera data.

Figure 14: Experiments on Image Processing Techniques

As the chart above (Figure 14) shows, the difference in completion time from one method to another is not very large, nor does any of the computer vision systems take an inordinate amount of time to complete a run. On the other hand, the chart does not include times taken at 640x480 resolution, as they would dwarf the other averages; for example, the average edge detection time, which as mentioned earlier remains fairly constant, takes around 0.091 seconds at 640x480. Another surprising aspect is that colour tracking is often faster than background removal and motion detection. Although all of these systems basically move from pixel to pixel detecting whether its colour is within a specific range, the way they calculate this colour range differs, which is the reason for the differences: colour tracking works with a constant range value for each colour throughout the entire surface of the image, whereas background removal and motion detection have to look up a different colour value for each pixel in an array.

The initial experiments were made with the camera set to gather RGB24 colour images. We ran an experiment to compare the performance of RGB24 and I420 compression [14], and they provide very similar performance. To address this performance question, we ran the same experiments again at different resolutions with no colour modification, with greyscale conversion and with binary conversion. As the chart below (Figure 15) shows, a slight performance penalty can be detected when converting colours into greyscale, and a larger penalty when converting to binary. On the other hand, because the image processing costs for systems such as background removal and motion detection are far lower when only one channel is used instead of three, this slight performance loss might be worth it in the long run. Of course, the time spent converting colour images into greyscale could be reduced further by using a black-and-white camera instead of a colour one. Although we expect some anomalies when gathering completion time data, the data presented in the colour conversion chart below is very strange, because the times recorded at 160x120 resolution are almost all greater than at 176x144. This is strange because the system was executed several times to gather the averages presented below (Figure 15), and the same results keep recurring. Although we have stated that these results are only averages, and gathering them alone has altered the performance of the system, apart from some unknown changes to the hardware we can only speculate that this time difference may have something to do with the camera hardware being better suited to the higher resolution, meaning images are returned faster.

Figure 15: Experiments on image format and resolution

The chart below (Figure 16) presents the average completion time of an image being gathered from the camera and then having a median or a mean filter applied to it with different window sizes. To gather this data, we ran the same test system as used previously at 160x120 resolution, using either a median or a mean filter at each of the useful window sizes (3x3, 5x5, 7x7, 9x9) [14]. Although it is possible to run the system with larger window sizes, applying filters that are too large distorts an image drastically and makes it harder for the system to track objects properly. As the chart shows, median filters do indeed cause a greater performance dip than mean filters, because median filters have to manipulate large numbers of array cells by sorting them, whereas mean filters use simple integer additions followed by a division, which is much less intensive. Also, because the median filter has to initialise an array when it is called, a large performance hit is expected, but this can be reduced if a single array is defined at the start of the program and reused. A new array was allocated in this function because the function was written with flexibility in mind, meaning a different window value can be passed in and the system will adapt accordingly; if only one window size is needed, we see no reason why hard-coded values cannot be used to save time. For this reason we went back and manually altered a few aspects of the median filter so that all of its values are hard coded and its array is defined with the exact number of cells only when the system starts up, rather than on every call. As the chart shows, the data for this optimised median filter falls between the mean filter and the ordinary median filter, and it is much faster for these simple changes; median filters nevertheless remain slower than mean filters, as is obvious when running any program with mean and median filters.

Figure 16: Experiments with Filters

5.3 Experiments Summary and Conclusion
The purpose of this section was to quantitatively convey the changes in performance observed when using different vision systems and filters with different settings. By running and developing the different systems, we could tell which resolutions and settings are best to use in a game environment by noticing the responsiveness and the time lag between our actions and the system's ability to catch up. Until now, however, we had been relying on informed guesses as to which vision system is the fastest. For example, we had assumed that edge detection would be the slowest system, but according to the data gathered here it is almost equal to background removal and often faster than motion detection, with its slow array operations. This is not a completely fair test, however, as edge detection really requires a filter to function properly, while background removal can operate without any additional help; we expect that a filtered edge detection system would make the chart presented above look much less like an even race.

5.4 System usage results
We will now attempt to display clearly the relative advantages and disadvantages of each of the techniques tested in this paper, both in terms of requirements and human usage, and to recommend possible usage scenarios based on the evidence collected in the experiments. To compare the vision systems tested, we use criteria similar to those of Wang et al. [12]; for example, sustainability describes the amount of effort players have to put in to use the system and can suggest possible usage situations.

Table 2: System Requirements and Usage

As well as developing the basic principles, we were also able to gather a lot of information from the attempts to build these systems into interface methods and actual games. To complete the system usage results we will implement the above computer vision systems into games and interface methods for a 3D racing game, CARS; this is our future work.

Figure 17: 3D Racing - CARS

6. CONCLUSION
In this paper we have investigated the technical requirements of a vision-based interface for game control. We presented some of the techniques developed in computer vision, image processing and vision-based perceptual user interfaces, implemented several common image processing techniques, and evaluated their potential for game control in terms of performance and usability.

7. ACKNOWLEDGMENTS
The authors would like to thank Gael Blottiere for his contribution to the experiments.

8. REFERENCES
[1] M. Turk. Moving from GUIs to PUIs. Invited paper, Proc. of the Symposium on Intelligent Information Media, Tokyo, Dec. 1998.
[2] T. Horprasert, D. Harwood, and L. Davis. A statistical approach for real-time robust background subtraction and shadow detection. http://www.eecs.lehigh.edu/FRAME/Horprasert/index.html.
[3] N. Friedman and S. Russell. Image segmentation in video sequences: a probabilistic approach. In Proc. 13th Conf. on Uncertainty in Artificial Intelligence (UAI), Providence, Rhode Island, USA, Aug. 1997.
[4] I. Haritaoglu, D. Harwood, and L. Davis. W4: Who, When, Where, What: a real time system for detecting and tracking people. In Proc. 3rd Face and Gesture Recognition Conf., Apr. 1998, pp. 222-227.
[5] I. Haritaoglu, D. Harwood, and L. Davis. W4S: a real time system for detecting and tracking people in 2.5D. In Proc. 5th European Conference on Computer Vision, Freiburg, June 1998.
[6] S. Intille, J. Davis, and A. Bobick. Real time closed-world tracking. In Proc. Computer Vision and Pattern Recognition, June 1997, pp. 697-703.
[7] A. Bobick, J. Davis, S. Intille, F. Baird, L. Campbell, Y. Ivanov, C. Pinhanez, and A. Wilson. KidsRoom: action recognition in an interactive story environment. M.I.T. TR No. 398, Dec. 1996.
[8] M. K. Leung and Y.-H. Yang. First Sight: a human body outline labeling system. IEEE Transactions on Pattern Analysis and Machine Intelligence, 17(4):359-376, Apr. 1995.
[9] C. Wren, A. Azarbayejani, T. Darrell, and A. Pentland. Pfinder: real-time tracking of the human body. In Proc. SPIE Conference on Integration Issues in Large Commercial Media Delivery Systems, Oct. 1995.
[10] A. Azarbayejani, C. Wren, and A. Pentland. Real-time tracking of the human body. In Proc. IMAGE'COM 96, May 1996.
[11] Y. Akazawa, Y. Okada, and K. Niijima. Real-time motion capture system using one video camera based on colour and edge distribution.
[12] S. Wang, X. Xiong, Y. Xu, C. Wang, W. Zhang, X. Dai, and D. Zhang. Face-tracking as an augmented input in video games: enhancing presence, role-playing and control. In Proc. SIGCHI Conference on Human Factors in Computing Systems, pp. 1097-1106. ACM Press, New York, NY, USA, 2006.
[13] M. T. Driscoll. A Machine Vision System for Capture and Interpretation of an Orchestra Conductor's Gestures. May 1999.
[14] K. R. Castleman. Digital Image Processing. Prentice Hall, 1996.

Shaping Interactive Stories by Means of Dynamical Policy Models

Fabio Zambetta
School of CS&IT, RMIT University
392 Swanston Street
Melbourne, Australia
[email protected]

ABSTRACT
Computer Role-Playing Games have so far lacked formal modelling of the political in-game context, even though they usually aim at creating entire societies or epic stories of vast proportions, including warfare and devious plots. Most of the games incorporating political models are strategic titles like Civilization or Balance of Power; RPGs seem to have focused very little on this aspect. We argue that the existing story-driven approaches to the design of CRPGs (Computer Role-Playing Games) could be extended by means of a proper mathematical model that simulates political balance and conflict. RPG designers would thus be able to offer a more diverse role-playing experience to their players, and create stories that can change under the pressure of political upheaval.

Keywords
Game Design, Interactive Storytelling, System Dynamics.

1. INTRODUCTION
Originally a pure form of entertainment, video games are nowadays struggling to establish an identity as a more complete artistic and technological medium that could allow for diverse and heterogeneous forms of expression. Examples of this tendency are serious games, which are emerging to push the boundary of computer games much further than the classic experience based on pure entertainment: games can be used to analyze and learn about problems in education, training, health, and public policy [21]. To date, very few game designers have tried to tackle these kinds of issues in commercial products; the most prominent amongst them remains Chris Crawford, the author of a classic book on game design [10] and designer of games like Eastern Front, Balance of Power, and Balance of the Planet [11]. Crawford's career initially focused on war games, which found, and still find, a large space in the video games industry (for instance tactical First Person Shooters, strategy games, etc.), but his biggest achievement probably remains Balance of Power, an "unwar" game, a game about the prevention of war [11]. As observed by Crawford, a game does not necessarily need a continuous display of violence to be interesting or fun to play, whereas it definitely needs conflict: war can be considered the most violent expression of conflict, and as such a game portraying political conflict and the means to avoid it would not only be fun to play but might also convey relevant messages.

Our interest lies on the one hand in exploiting dynamical models to achieve a more formal approach to the design process of (story-driven) games [18], and on the other in extending the current limits of interactive storytelling. We envisage an extended story-driven approach, where not only can players influence the game story, but the story itself can also change under the pressure of political balance. It is a common assumption that stories and drama are actually generated by conflict, as detailed by Aristotle a very long time ago [14]. To this aim we designed a model that can generate political balance and conflict between factions of PCs (Player Characters) and NPCs (Non-Player Characters) in a CRPG, and we offer a first glimpse of the problems encountered and the results obtained during the development of a prototype implemented in the renowned NWN (Neverwinter Nights) computer RPG [1]. Our work is rooted in Richardson's dynamical model of an arms race [28], devised to analyze the causes of international conflicts, which Richardson initially applied to a World War I scenario. The improvements that our modified Richardson faction model brings are numerous when compared to the standard faction models currently used in RPGs. First and foremost, such an approach gives RPG designers more options, enabling the creation of different types of stories that integrate socio-political considerations in the plot itself and extend the usual story-driven approaches: this will become the focus of our future work, and one of the most fascinating problems to tackle. Secondly, by simply varying the basic parameters of the core model, many scenarios can be created that correspond to a different political status quo (e.g., tense relationships, truce, initially friendly relations, etc.) and that can lead over time to different types of equilibria. Finally, players' choices will impact the in-game political balance, but at the same time the plot will evolve under the pressure of political events, giving rise to a mixed and novel gameplay style.

The scenario we are designing to expand the current basic prototype has been dubbed Two Families. Players will take the side of one of two influential families in the

fight for supremacy in a fictional city, and decide whether they want to further their faction's political agenda or act as mavericks. In its current design, Two Families contains elements and influences from sources as diverse as Shakespeare's Romeo and Juliet [29], hinting at the struggle for supremacy between the Montague and Capulet families, and Mario Puzo's The Godfather [26], a fictionalised description of the crude and military organization behind the many Sicilian families in the USA affiliated to the infamous Cosa Nostra. The main source of inspiration, however, came from the struggle and bloodshed that occurred in the city of Florence around the years 1289-1301, the famous Black vs. White Guelphs conflict, whose cruel details are vividly narrated by Dante in some cantos of his masterpiece, the Divina Commedia [5].

The remainder of the paper is organized as follows: Section 2 introduces dynamical models and describes the modified Richardson model used to compute a political balance among factions, Section 3 details possible scenarios of use for the model, Section 4 introduces our prototypes and the results obtained so far, and Section 5 outlines our future work.

2. IMPROVING RICHARDSON'S MODEL OF ARMS RACE
A dynamical system is a mathematical abstraction of a real-world system governed by a fixed rule describing the time-dependent change of its state (a collection of real numbers) over a geometrical space (a manifold [19], a more general concept than a simple Euclidean space). The evolution rule of the dynamical system is deterministic, and is generally described by differential equations originating in Newtonian mechanics. Since then, dynamical systems have been used as models in fields as diverse as the natural sciences and many computing and engineering disciplines. In particular, the Arms Race model was developed by Lewis Fry Richardson to model and predict whether an arms race between two nations (or alliances) was to become a prelude to a conflict. The original model consists of a system of two linear differential equations, but it can easily be generalized to the multidimensional case [16]. Richardson's assumptions about the model are given below:

• Arms tend to accumulate because of mutual fear;
• A society will generally oppose a constant increase in arms expenditures;
• There are factors independent of expenditures which conduce to the proliferation of arms.

The actual equations describing this intended behaviour are given as:

    ẋ = ky − ax + g
    ẏ = lx − by + h        (1)

The values of x and y indicate the accumulation of arms for each nation. Clearly, we can also rewrite the equations in matrix form yielding, with proper substitutions:

    A = [ −a  k ; l  −b ],   z = [ x ; y ],   r = [ g ; h ]        (2)

    ż = Az + r        (3)

The solutions of this system of linear ODEs (Ordinary Differential Equations) [7] do not depend much on the values of the constants, but rather on their relative magnitudes and on the signs of g and h, which in Richardson's view represent the grievance terms. The constants k and l are named the fear constants (mutual fear), a and b are the restraint constants (internal opposition against arms expenditures), and, as already mentioned, g and h are the grievance terms (independent factors, which can be interpreted as grievance against rivals). Note that only g and h are allowed to assume negative values. When analyzing the model, one needs to take into account the optimal lines (where the first derivatives of x and y equal 0), the equilibrium point P* = (x*, y*) where the optimal lines intersect, and the dividing line L* for the cases where the equilibrium depends on the starting point. Trajectories heading towards positive infinity are said to be going towards unlimited armament, or a runaway arms race, whereas those going towards negative infinity are said to be going towards disarmament. Two general cases can occur in practice, under the general assumption that det A ≠ 0:

• All trajectories approach a stable point (stable equilibrium, see Figure 1(a)).
• Trajectories depend on the initial point: they can either drift towards positive/negative infinity, or approach a stable point if they start on the dividing line (unstable equilibrium, see Figure 1(b)).

If ab > kl, we obtain a stable equilibrium: an equilibrium point is considered stable (for the sake of simplicity we consider asymptotic stability only) if the system always returns to it after small disturbances. If ab < kl, we obtain an unstable equilibrium: the system moves away from the equilibrium after small disturbances. We will show that a modified version of the model can produce alternating phases of stability and instability, yielding variable but non-chaotic results: this can give rise to a richer simulation of faction dynamics, as alliances can be broken and conflict ceased temporarily, or even war declared on a permanent basis. Our investigation is aimed at refining Richardson's model for use in a CRPG, and has involved three steps: reinterpreting the model semantics to fit our intended game context, modifying the model to produce a satisfactory representation of the interaction among factions, and finally converting the model output to the input used by a classic CRPG faction system (in our case the Neverwinter Nights faction system).
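To make the behaviour of system (1)-(3) easy to experiment with, the sketch below (our illustration, not part of the paper's prototype; all parameter values are arbitrary) integrates the two equations with a simple Euler step and reports the final state:

    def richardson_step(x, y, k, l, a, b, g, h, dt=0.01):
        """One Euler step of Richardson's equations (1): dx = ky - ax + g, dy = lx - by + h."""
        dx = k * y - a * x + g
        dy = l * x - b * y + h
        return x + dt * dx, y + dt * dy

    # Arbitrary demo constants; ab > kl, so the trajectory should approach a stable point.
    k, l, a, b, g, h = 0.5, 0.6, 1.0, 1.2, 0.3, 0.2
    x, y = 1.0, 1.0
    for _ in range(20000):
        x, y = richardson_step(x, y, k, l, a, b, g, h)
    print("final state:", round(x, 3), round(y, 3), "| det A =", a * b - k * l)

With ab < kl the same loop diverges towards positive or negative infinity depending on which side of the dividing line L* the initial point lies, which is the unstable regime discussed above.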

2.1 Reinterpreting the semantics of Richardson's model
Even though the model created by Richardson seemed to be a viable approach to control overall faction behaviour in games, it soon became apparent that the model was designed with a very coarse level of granularity in mind. Whilst that fits nicely with the type of analysis Richardson was interested in (a very high-level picture of the reasons behind a conflict), our goal was to give designers the freedom to change a game's story over time. Hence, we started our analysis by naming two factions X and Y, and by reinterpreting x and y as the (non-negative) levels of cooperation of factions X and Y respectively. We also reinterpreted the parameters of the model as listed in Table 1: the meaning of the parameters is not very different in our version of the model, but increasing values lead to cooperation instead of conflict. The level of cooperation of each faction will theoretically lead either to a stable equilibrium point P* that yields a steady state of neutrality, or to an unstable equilibrium that will drive the system towards increasing levels of competition/cooperation (decreasing cooperation indicates competition). Without loss of generality, we will concentrate on a restricted context of unstable equilibrium: Richardson's model will be modified in order to obtain a rich behaviour, and at the same time cater for the interactive scenarios found in modern videogames. Also, we will assume that g and h are negative (indicating that the two factions harbour resentment towards each other).

Table 1: The reinterpreted parameter semantics.

    Parameter   Semantics
    k           Faction X belligerence factor
    l           Faction Y belligerence factor
    a           Faction X pacifism factor
    b           Faction Y pacifism factor
    g           Friendliness of X towards Y
    h           Friendliness of Y towards X

Figure 1: Possible equilibria for the system. (a) The system trajectories (in green) converge to an equilibrium point. (b) The system trajectories depend on the initial point, and can lead to different outcomes. The dividing line is coloured in yellow.

2.2 Modifying Richardson's model
The standard formulation of Richardson's model in the unstable equilibrium case implies that the final state of the system is dictated by its initial conditions. The initial condition of the system, a point P in the cooperation plane depicted in Figures 1(a) and 1(b), will be such that:

• If P lies in the half-plane above the dividing line L*, the system will be driven towards infinite cooperation.
• If P lies in the half-plane below the dividing line L*, the system will be driven towards infinite competition.
• If P lies on the dividing line L*, the system will be driven towards a stable condition of neutrality.

The problem with this model is that it is uninteresting in an interactive scenario, even though it apparently contains all the main ingredients required to produce rich behaviour: once an application starts approximating the solution of the model from its initial condition via an ODE solver [7], the solution will be stubbornly uniform and lead to a single outcome in any given run (any of the three listed above, depending on the initial position of P). To cater for interactive scenarios (e.g., PC and NPC actions occurring in game will interact with each other and the game world), we developed a stop-and-go version of Richardson's model: the solution of the system is initially computed by our ODE solver until an external event is generated in-game. When that happens, the parameters of the model listed in Table 1 are conveniently recomputed, leading to a possible change in the equilibrium of the system: the way parameters are changed allows for the possibility of moving the dividing line L*, thus altering the direction of motion of the current system trajectory. Recalling (3), we have:

    A_new = λ A_old,  λ > 0        (4)

Now we want to see how scaling A will influence the equilibrium of the system. To do so, let us first compute the equation of L*, which is the locus of points where both derivatives in our system go to zero. The equation of L* results in:

    ẋ + ẏ = (ky − ax + g) + (lx − by + h)
          = (l − a)x + (k − b)y + (g + h) = 0        (5)

The effect of the scaling on A yields:

    ẋ + ẏ = λ(l − a)x + λ(k − b)y + (g + h) = 0        (6)

Thus, we finally have:

    (l − a)x + (k − b)y + (g + h)/λ = 0        (7)

Three distinct cases are then possible:

• 0 < λ < 1: L* is moved into its original upper half-plane, giving rise to a possible decrease in cooperation.
• λ = 1: the scale factor does not change A (there is no practical use for this case, though).
• λ > 1: L* is moved into its original lower half-plane, giving rise to a possible increase in cooperation.

To test these claims, the reader need only look at Figure 2, where the case 0 < λ < 1 is depicted. The dividing line is initially L1, and the point describing the trajectory of the system is P: the ODE solver generates increasing values of cooperation, stopping at P1 because an external event has just occurred. At this stage A gets scaled and, as a result, the new dividing line becomes L2: the new dividing line brings P1 into the lower half-plane, leading to decreasing values of cooperation (increasing competition).

Figure 2: Effect of scaling A.

Generalizing the considerations inferred from this last example, suppose that initially L1 · P > 0 (increasing cooperation) and that 0 < λ < 1. Then we will have three alternatives when an external event occurs:

• L2 · P1 > 0: the level of cooperation keeps on increasing.
• L2 · P1 < 0: the level of cooperation starts to decrease.
• L2 · P1 = 0: the level of cooperation moves towards a stable value.

Clearly, if L1 · P > 0 and λ > 1 then L2 · P1 > 0. Similar conclusions can be drawn in the case L1 · P < 0. Hence, any application using our model will need to provide a set (or a hierarchy) of events, along with a relevance level λj, j ∈ {1..M}, that could be either precomputed in a lookup table or generated at runtime. Obviously, all the events having λj > 1 correspond to events that facilitate cooperation, whereas events having 0 < λj < 1 exacerbate competition. The effect of the λ-scaling is to change the partitioning of the first quadrant, giving rise from time to time to a bigger half-plane either for cooperation or for competition. Finally, we want to stress that the improved Richardson model presented here can be characterized in terms of an HCP (Hybrid Control Problem) [8]. We will not go into much detail to avoid losing the focus of our argument, but suffice it to say that an HCP is a system involving both continuous dynamics (usually modelled via an ODE) and controls (generally incorporated into a Finite State Machine). The system possesses memory affecting the vector field, which changes discontinuously in response to external control commands or to hitting specific boundaries: specifically, it is a natural fit to treat in-game events as control commands.

2.3 Converting to the Neverwinter Nights Faction System
Converting to the NWN faction system is straightforward once the proper values of cooperation have been computed: a few function calls are available in NWN Script to adjust the reputation of a single NPC (e.g., AdjustReputation) or of an entire faction (e.g., AdjustFactionReputation). In NWN, faction standings assume a value in the [0, 100] range per faction: values in [0, 9] indicate competition (in NWN, hostility), whereas values in [91, 100] represent cooperation (in NWN, friendship). The most straightforward conversion would simply use x and y as the faction standings for each faction: x would indicate the way NPCs in faction X feel about people in faction Y and vice versa, clamping values outside the [0, 100] range.

We are also evaluating the possibility of introducing a scaling factor which could represent the relative importance of each NPC in a faction: it is reasonable to expect that more

tility or frienship would be aroused by people in command positions. Hence, if we split a faction (say X for explanatory purposes) in N different ranks, then we will have some coefficients i , with i ∈ {1..N } such that: xN W N

=

x ∗ i

Table 2: The events that will occur in our mod. Events λ values A hero dies 0.25 A low-ranking character dies 0.75 Truce request 2 War declaration 0.1 Peace treaty 2.5 Riots in the city 0.6 Enemies taunting/provoking 0.9 Enemies acting friendly 1.1

(8)
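To make the event mechanism concrete, the following minimal sketch (ours, not the paper’s implementation, and written in Python rather than NWN Script) shows how an application could keep a lookup table of relevance levels like the one in Table 2 and rescale the coefficient matrix A when an event fires, as in equation (4); all identifier names are invented for illustration.

# Minimal sketch: event relevance levels as in Table 2 and the rescaling of A
# from equation (4).  Names are illustrative, not the module's actual code.
EVENT_LAMBDAS = {
    "hero_dies": 0.25, "low_rank_dies": 0.75, "truce_request": 2.0,
    "war_declaration": 0.1, "peace_treaty": 2.5, "riots": 0.6,
    "enemy_taunts": 0.9, "enemy_friendly": 1.1,
}

def scale_A(A, event):
    """Return (A_new, lambda) where A_new = lambda * A_old (equation 4)."""
    lam = EVENT_LAMBDAS.get(event, 1.0)      # unknown events leave A unchanged
    return [[lam * entry for entry in row] for row in A], lam

# A is the 2x2 matrix of the linear model z' = A z + r, i.e. [[-a, k], [l, -b]].
A = [[-10.0, 20.0], [20.0, -10.0]]
A, lam = scale_A(A, "war_declaration")       # a war declaration shrinks A (lambda = 0.1)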

NWN also makes a Faction Editor available to users of its Aurora construction toolset [3], but unfortunately its use is limited to setting up the initial values of faction standings (see Figure 3).
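As an illustration of the conversion just described, the short sketch below (our own, in Python rather than NWN Script) clamps a model cooperation value into the [0, 100] standing range, optionally weighted by a per-rank coefficient as in equation (8); writing the result into the game would go through the NWN Script reputation calls mentioned in Section 2.3.

# Illustrative only: map a cooperation value x onto an NWN faction standing.
def to_nwn_standing(x, rank_coefficient=1.0):
    """x_NWN = coefficient * x (equation 8), clamped into NWN's [0, 100] range."""
    return max(0.0, min(100.0, x * rank_coefficient))

print(to_nwn_standing(140.0, 0.5))   # a high cooperation level, halved for a low rank -> 70.0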

3. SCENARIOS OF USE

The conceptual framework underpinning our model is illustrated in Figure 4: the level of cooperation (competition) generated by our model can be influenced by the actions that players perform in game, but in turn the model will alter the game world perceived by players, in a feedback loop. The longer-term applications of our model, and the main drivers for our efforts, have been the navigation and generation of non-linear gameplay. Before achieving these more complex goals, though, we wish to apply our model to the generation of random encounters in a CRPG like Neverwinter Nights. Other applications are possible, as detailed later in this section.

Figure 4: The conceptual framework of our model.

3.1 Generating random encounters in Neverwinter Nights

Random encounters are commonplace in RPGs, for example to attenuate the monotony of traversing very large game areas. The main problem with this approach is that most of the time attentive players will notice the trick, because monsters come out of nowhere without any apparent rationale. Because our model can modulate values of cooperation/competition over time, these values can be used as cues for the application to generate random encounters. Supposing we are in a scenario where players have joined faction X, their actions will cause specific in-game events able to influence the equilibrium of the system. Now, the higher the level of competition of X towards Y, the harder and the more frequent the encounters will be. Also, players will encounter NPCs willing to negotiate truces and/or alliances in case the level of cooperation is sufficiently high, in order to render the interaction more believable and immersive. Such a mechanism could be used to deter players from relying on a pure hack-and-slash strategy, forcing them to solve puzzles and concentrate on the storyline narrated in game. We expect that Two Families will incorporate random encounters with a structure similar to the ones depicted in Table 2: each of the events is reported along with a tentative λ-value that will probably need to be fine-tuned in game. As dictated by our conceptual framework, not only will players be able to influence the level of competition in-game, but they will also experience first-hand the effect of the model on the random encounters in the game world.

3.2 Navigating non-linear game narrative

Suppose a game has narrative content arranged in a non-linear story or short episode; we can visualize its structure as a collection of game scenes (see Figure 5). Each red circle represents a scene of the game where choices render multiple paths possible, whereas blue circles represent ordinary scenes which just move the storyline along (yellow circles indicate start or end scenes). We envision attaching scripting logic to each of the nodes where a choice is possible, so that alternative paths are taken based on the current level of competition. Thus, our players will be able to experience different subplots as a result of their own actions and strategies. From a practical point of view, the exponential growth of non-linear structures has to be kept under control due to resource implications: a widespread game structure used to preserve non-linear design without leading to unbearable resource consumption is the convexity [27]. Each of the nodes containing scripting logic will incorporate fuzzy rules [32], describing what action should be taken based on the value of fuzzy predicates. We could theoretically use classic logic to express these conditions, but fuzzy logic is very good at expressing formal properties using quasi-natural language. For instance, we might have some scripting logic like:

IF cooperationX IS LOW THEN Action1

or:

IF cooperation IS AVERAGE THEN Action2

Figure 5: A game based on a non-linear plot.

Clearly, opportune fuzzy membership functions will need to be created, or classical ones like the trapezoid, triangle, Gaussian, etc. will need to be used. The net result, though, will be scripting logic that game designers will be able to use and understand without too much hassle, and which will resemble natural language to some extent. Ultimately, the goal we have in mind is to render a new game genre viable, i.e., RPS (Role-Playing Strategic). The best of both worlds, Role-Playing Games and Real-Time Strategy, is pursued here as a blending of the classic story-driven approach familiar to RPG players with strategic considerations that can substantially influence the gameplay.
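A minimal sketch of how the fuzzy rules above could be evaluated at a choice node is given below. It is our illustration rather than part of the module: triangular membership functions over the [0, 100] cooperation range are assumed, and the breakpoints and action names are invented.

def triangular(x, a, b, c):
    """Triangular membership function with feet at a and c and peak at b."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x <= b else (c - x) / (c - b)

MEMBERSHIP = {                       # hypothetical fuzzy sets over [0, 100]
    "LOW":     lambda x: triangular(x, -1.0, 0.0, 40.0),
    "AVERAGE": lambda x: triangular(x, 30.0, 50.0, 70.0),
    "HIGH":    lambda x: triangular(x, 60.0, 100.0, 101.0),
}

RULES = [("LOW", "Action1"), ("AVERAGE", "Action2"), ("HIGH", "Action3")]

def choose_action(cooperation_x):
    """Fire the rule whose predicate has the highest degree of truth."""
    return max((MEMBERSHIP[label](cooperation_x), action) for label, action in RULES)[1]

print(choose_action(22.0))           # IF cooperationX IS LOW THEN Action1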

3.3 Other Scenarios

The approach considered here for manipulating stories could be extended to other game design mechanics and aspects: for instance, we might use our model to navigate skill trees differently in MMORPGs like Guild Wars [2] or World of Warcraft [4]. We might grant players specific extra skills if they relied more on cunning and oratory than on strength. We could even go as far as having a game where no predefined class is selected at any given time, but the way the game is played will shape a path for the player to follow.

3.4 A tool to create non-linear stories

A tool to create non-linear stories would allow game designers to both interactively script the game structure and make changes to the structure itself. In order to restructure the game narrative, it is foreseen that a more complex language will be needed, one that will not only be able to describe the choices occurring in the storyline but also script more generic game events. The simplest (and probably most effective) idea we have been considering would see the fuzzy rule systems incorporated through an API exposed by a more generic games-friendly scripting language (e.g., Python, Lua, JavaScript, etc.). An example of a language used to script narrative content is given by ABL, a reactive-planning language used to script the beats (dramatic units) in the interactive drama Façade [20]. Even though ABL did a good job in scripting Façade’s dramatic content, it clearly falls short in terms of complexity of the scriptable actions: all in all, Façade is a piece of interactive drama with a quite sketchy 2D interface, and not a real game (which is what we are really interested in). Also, people at the University of Alberta proposed an approach based on software patterns to help game designers in story building [12]: ScriptEase, the tool they produced, can be used to automate to some extent the scripting of typical narrative and interaction patterns in Neverwinter Nights. The concept of a formal structure underpinning a story is not new at all, as it was first analyzed at large by Propp in relation to traditional Russian folktales [25]. Despite some criticism of Propp’s work, it is our intention to incorporate the core of its arguments to be able to recombine essential story elements in multiple ways: this could lead to the generation of new storylines, which can then be manually refined by game designers and writers with less effort. Ideal candidates for this task are evolutionary algorithms, whose power of recombination, driven by an automatic or semi-automatic fitness procedure, has been applied to music [22], graphics [31] and animation [30]. Of course, building a tool to forge non-linear stories is a far-reaching goal outside the scope of our current research, but it is an intention in our future work.

4. EXPERIMENTAL RESULTS

The prototype we have implemented has been used to conduct functional testing of the model, and it will be integrated in the scenario introduced in Section 1, currently codenamed Two Families. The ODE solvers implemented are based on Euler’s method and the midpoint method [7]; the latter is more accurate and is the recommended choice. The scripts solving the ODE can be hooked up to act as a proper event handler: in our case, the natural fit is the module OnHeartbeat event, which is invoked by the game engine every six seconds of real-time. Political scenarios will generally evolve slowly, but a good balance must be met to achieve both some realism and fun in playing the game. Reported in Figure 6 are our current settings for the module we are developing.

Figure 6: The current advanced settings in our module.

Finally, as we have not yet implemented a full NWN module integrating our model, what we tried to achieve was some indication of how the model could behave over time. Therefore, we implemented a small simulator to predict the behaviour of the system in the presence of simulated external events, which in a real game scenario would instead be generated by PCs and NPCs. We have tested a number of discrete random distributions in the simulator, and we present some of the interesting results below. We want to stress that trying to use the model in any generic NWN scenario is not of much help, because the context of the story would not present the features we are seeking. A simulator, on the other hand, can inform us about some of the patterns that will likely arise when we implement our own NWN module in the future. This idea is clearly coherent with our game design philosophy, which sees formal models and techniques as a very informative way to produce better and more predictable games. Another formal methodology, which stresses the role of game dynamics, is gaining increasing acceptance and consideration [18].
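For concreteness, here is a sketch (in Python, not NWN Script) of the two solver update rules mentioned in Section 4: f is the right-hand side of the linear model ż = Az + r, and the step size dt is left to the caller, since the text only fixes one update per six-second heartbeat.

def f(z, A, r):
    """Right-hand side of the linear cooperation model z' = A z + r."""
    x, y = z
    return (A[0][0] * x + A[0][1] * y + r[0],
            A[1][0] * x + A[1][1] * y + r[1])

def euler_step(z, A, r, dt):
    dx, dy = f(z, A, r)
    return (z[0] + dt * dx, z[1] + dt * dy)

def midpoint_step(z, A, r, dt):
    # Evaluate the derivative at the half step: more accurate than Euler,
    # which is why the midpoint method is the recommended choice.
    dx, dy = f(z, A, r)
    half = (z[0] + 0.5 * dt * dx, z[1] + 0.5 * dt * dy)
    dxm, dym = f(half, A, r)
    return (z[0] + dt * dxm, z[1] + dt * dym)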


Table 3: The values of the parameters used in the experiments.
Parameter   Value
k            20
l            20
a            10
b            10
g           -20
h           -30
ε           0.5

Besides, we did not try to complete a thorough and rigorous statistical analysis (which would also have involved averaging the system trajectories over time), but rather attempted to discover some relevant patterns likely to emerge in real-world scenarios. We report in Table 3 the values of the coefficients used for the experiments, which were not changed at run-time. We designed a set of experiments using the simplest setup possible: the simulated external events all belong either to a positive or a negative category, but the possibility that no external event occurs is also contemplated. Positive events raise the amount of cooperation, whereas negative ones clearly do the opposite. Each of the experiments covers a simulated time span corresponding to 10 hours of real-time: since in NWN the OnHeartbeat event is invoked every 6 seconds and we computed 6000 ODE updates, we obtain 36,000 seconds of real-time (i.e., 600 minutes, or 10 hours). In our current settings 1 minute of real-time equates to 1 hour of game-time, so this corresponds to 600 hours, or 25 days, of game-time (probably more than we expect to provide with Two Families). We first started with a random distribution giving a 25% chance of a negative event, a 25% chance of a positive event, and a 50% chance of no event at all per update (see Figures 7(a) and 7(b)). What our preliminary results seem to suggest is that, starting from slightly different initial values of cooperation for X and Y, the system can either evolve towards cyclic phases of oscillating increase and decrease in cooperation, or simply approach a clearly defined state (utmost competition in the figure, but total cooperation can be achieved as well). Then, we changed the distribution to 45%, 45%, and 10%, with the results given in Figures 8(a) and 8(b). Assuming that it is unlikely that no relevant event occurs in a given ODE update (only 10%) leads to quite interesting behaviour: the system will be full of ups and downs, but will remain generally localized (the system is not chaotic). Ultimately, what these results suggest is that our model can generate interesting and relevant behaviours that games can use to support the scenarios and goals listed in Section 3. Obviously, more experimentation will be performed in the future on complete games to fully validate the claims presented here.
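The loop below sketches how such an experiment could be reproduced. It is our reconstruction, not the simulator itself: the Table 3 parameters are used, a plain Euler update stands in for the midpoint method for brevity, and the step size, initial cooperation values, and the λ values attached to the generic positive/negative events are assumptions.

import random

k, l, a, b, g, h = 20.0, 20.0, 10.0, 10.0, -20.0, -30.0    # Table 3
A, r = [[-a, k], [l, -b]], (g, h)
z, dt = (1.0, 1.2), 0.01                                   # assumed initial cooperation and step

trajectory = []
for _ in range(6000):                                      # one update per heartbeat, 10 h of real-time
    u = random.random()
    lam = 0.9 if u < 0.25 else (1.1 if u < 0.50 else 1.0)  # 25% negative, 25% positive, 50% no event
    A = [[lam * e for e in row] for row in A]              # equation (4)
    dx = A[0][0] * z[0] + A[0][1] * z[1] + r[0]
    dy = A[1][0] * z[0] + A[1][1] * z[1] + r[1]
    z = (max(0.0, z[0] + dt * dx), max(0.0, z[1] + dt * dy))  # cooperation levels stay >= 0
    trajectory.append(z)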


(a) Cooperation increases, entering subsequent repeated oscillating phases.

(b) After a tenuous increase, the system evolves towards clearly null cooperation.

Figure 7: Using an event distribution of 25%, 25%, and 50%.

as a Hybrid Control Problem), and analyzed some stochastic patterns generated by the factions’ behaviours that are likely to emerge during the production of Two Families, our forthcoming Neverwinter Nights 2 module. Two Families will incorporate both random encounters (as described in this paper) and a non-linear story that can be navigated via the output generated by our model. Having the opportunity to test the model in a real-world scenario will likely affect its current formulation, although we have already identified a few areas for improvement. First and foremost, the scaling operator used in Section 2.2 could become a non-uniform scaling operator: for instance, a and k could be scaled more or less than b and l, which would mean that an event is perceived differently by each faction. In its current formulation the model emphasizes the general equilibrium of the system (conflict/cooperation) more than the specific values of x and y. Secondly, the current choice of clamping values outside the currently used range might not prove ideal in a real game. Reformulating the model to cater for thresholds in the system would probably require it to become a non-linear ODE, with all the associated problems in terms of accuracy of the solution and run-time computational efficiency of the ODE solver adopted (in this case a Runge-Kutta method would be more likely to lead to good results). Finally, a very interesting possibility would involve a random

5. CONCLUSIONS

We introduced our modified version of Richardson’s model which, based on a stop-and-go variant, provides game designers with a tool to introduce political scenarios into their story-driven games and game mods. We have discussed the formal properties of the model (which can be more formally regarded


Other approaches to formal game design have proposed different views on the matter:
• semi-formal approaches, as in the 400 Design Rules project [13];
• FADT (Formal Abstract Design Tools) [9];
• Game Design Patterns [6], heavily borrowing from the software patterns [15] metaphor;
• an approach based on Petri nets to describe quests in a formal (and graphical) way [23].
None of these approaches has become dominant to date, or at least widely used in the games development community.

(a) A cyclic phase of increase/decrease of cooperation.

6. REFERENCES

[1] Bioware’s Neverwinter Nights web site. http://nwn.bioware.com. [2] Guild Wars web site. http://www.guildwars.com/. [3] Neverwinter Nights Builders web site. http://nwn.bioware.com/builders/. [4] World of Warcraft web site. http://www.worldofwarcraft.com/. [5] D. Alighieri. La Divina Commedia. Mondadori, Milano, 2005.

(b) Still ups and downs, but with a different initial dynamic.

[6] S. Björk, S. Lundgren, and J. Holopainen. Game design patterns. In Proceedings of the Level Up: Digital Games Research Conference, November 2003.

Figure 8: Using an event distribution of 45%, 45%, and 10%

[7] W. Boyce and R. DiPrima. Elementary Differential Equations and Boundary Value Problems. John Wiley & Sons, Hoboken, 2004.

noise term to be added to the current equation to yield:

ż = Az + r + ε(t)    (9)

[8] M. Branicky, V. Borkar, and S. Mitter. A unified framework for hybrid control: Background, model, and theory. In Proceedings of the 11th International Conference on Analysis and Optimization of Systems, June 1994.

Here, ε(t) is a noise term that can be used to perturb the solution of the ODE, leading to disturbances in the behaviour of the system. This term could factor in the fact that, no matter how coherent a faction’s behaviour is, it will never be totally unilateral. This possibility opens the door to even richer behaviour and twists in the story, as NPCs might change faction or act as mavericks within their own factions. The problem with such a modification is that the ODE becomes an SDE (Stochastic Differential Equation) [24], which is generally harder to solve, although methods exist, like Euler-Maruyama [17], that work well in the linear case. One of the most ambitious aims of our future work is a tool that can assist designers in creating story-driven games and mods using the approach described in this paper, and that can lead to the proper definition of an RPS genre as hinted in Section 3.2. A final consideration regards formal methodologies for game design: our view is that more formal processes and methodologies need to be investigated to help inform and guide the game development process. In the current climate of skyrocketing budgets, the games industry simply cannot afford to spend huge amounts of time and money on testing approaches in a totally informal and unguided fashion. The already cited MDA approach [18] seems a good starting point in this sense, but it is far from being widespread.

[9] D. Church. Formal abstract design tools. Online article available at www.gamasutra.com, 1999. [10] C. Crawford. The Art of Computer Game Design. Course Technology PTR, New York, 1984. [11] C. Crawford. Chris Crawford on Game Design. New Riders, Berkeley, 2003. [12] M. Cutumisu, C. Onuczko, D. Szafron, J. Schaeffer, M. McNaughton, T. Roy, J. Siegel, and M. Carbonaro. Evaluating pattern catalogs - the computer games experience. In Proceedings of the 28th International Conference on Software Engineering (ICSE ’06), May 2006. [13] N. Falstein. Better by design: The 400 project. Game Developer Magazine, 9(3):26 – 27, 2002. [14] G. Freytag. Freytag’s Technique of the Drama. Griggs & company, Boston, 1995.


[15] E. Gamma, R. Helm, R. Johnson, and J. Vlissides. Design Patterns: Elements of Reusable Object-Oriented Software. Addison-Wesley Professional, Boston, 1995.
[16] J. Goulet. Richardson’s arms model and arms control. In Proceedings of the SIAM Conference on Discrete Math and Its Application, June 1983.
[17] D. Higham. An algorithmic introduction to numerical simulation of stochastic differential equations. SIAM (Society for Industrial and Applied Mathematics) Review, 43(3):525–546, 2000.
[18] R. Hunicke, M. LeBlanc, and R. Zubek. MDA: A formal approach to game design and game research. In Proceedings of the AAAI-04 Workshop on Challenges in Game AI, July 2004. Available online at http://www.cs.northwestern.edu/~hunicke/pubs/MDA.pdf.
[19] J. Lee. Introduction to Topological Manifolds. Springer-Verlag, New York, 2000.
[20] M. Mateas and A. Stern. Structuring content in the Façade interactive drama architecture. In Proceedings of the Artificial Intelligence and Interactive Digital Entertainment Conference, June 2005.
[21] D. Michael and S. Chen. Serious Games: Games That Educate, Train, and Inform. Osborne/McGraw, Boston, 2005.
[22] E. R. Miranda and A. Biles, editors. Evolutionary Computer Music. Springer-Verlag, New York, 2007.
[23] S. Natkin, L. Vega, and S. Grönvogel. A new methodology for spatiotemporal game design. In CGAIDE 2004, 5th Game-On International Conference on Computer Games: Artificial Intelligence, Design and Education, November 2004.
[24] B. Øksendal. Stochastic Differential Equations. Springer-Verlag, New York, 2000.
[25] V. Propp. Morphology of the Folktale. University of Texas Press, Austin, 1968.
[26] M. Puzo. The Godfather. NAL Trade, New York, 2002.
[27] S. Rabin, editor. Introduction to Game Development, chapter 2.1, pages 71–97. Charles River Media, Hingham, 2005.
[28] L. Richardson. Arms and Insecurity. Boxwood Press, Pittsburgh, 1960.
[29] W. Shakespeare. Romeo and Juliet (Folger Shakespeare Library). Washington Square Press, Washington, 2004.
[30] K. Sims. Artificial evolution for computer graphics. In Proceedings of the SIGGRAPH ’91 Conference, July 1991.
[31] K. Sims. Evolving virtual creatures. In Proceedings of the SIGGRAPH ’94 Conference, July 1994.
[32] L. Zadeh. Fuzzy sets. Journal of Information and Control, 8:338–353, 1965.


Experiencing ‘Touch’ in Mobile Mixed Reality Games

Paul Coulton, Omer Rashid, Will Bamford

Informatics, Lancaster University, Lancaster, UK, LA1 4WA
+44 (0)1524 510393 / 510537

[email protected]


ABSTRACT In this paper, we discuss player experiences relating to the interaction with objects included as part of mobile mixed reality games. In particular we focus on the player experience of a touch-based reality interface using mobile phones equipped with RFID/NFC reader/writers for interaction with the RFID-tagged objects used in three different mobile mixed reality games. The player experience highlights that the simple interface of touching the phone to the tag on an object is both readily understandable and very simple to use for game players, and enhances rather than detracts from the game experience. Further, the use of objects in mobile mixed reality games can speed up the game play and reduce the presence of ‘the mixed reality stoop’, where players have to concentrate so intently on the virtual world on their mobile screen that they become almost oblivious to the real world setting.


Categories and Subject Descriptors: J. Computer Applications, J.7 Computers in Other Systems-Consumer General Terms: Design, Experimentation, Human Factors. Keywords: Touch, Interaction, Mobile, RFID, NFC, Mixed Reality.

1. INTRODUCTION
Mixed and augmented reality games link the physical and digital worlds to create new experiences. These types of games often incorporate knowledge of their physical location and landscape, and then provide players with the ability to interact with both real and virtual objects within the physical and digital worlds. These games can incorporate some, or all, of these elements using a rich variety of technologies across all games genres [7]. Amongst the most interesting are the location based games utilizing mobile phones [9], as they are the most pervasive consumer device of the current era, allowing real world deployment with large numbers of people. As yet these games have generally relied on virtual objects, but the use of real objects can greatly enhance the user experience and encourage greater immersion within the fantasy of the game. The use of physical objects in this way creates a relational interface [12], in which the system maps the relationship between physical objects to a computational representation. In mixed reality games these types of interface can often also be viewed as reality interfaces, which draw their strength from exploiting the user’s pre-existing abilities and expectations rather than trained skills [6].

In terms of the actual technology that will facilitate these reality interfaces between the player’s mobile phone and the real objects within the game, the principal methods available can be defined as either wireless or visual. Note that we have not included the scanning process that precedes interaction, as this can also be visual even with wireless techniques. These wireless techniques could benefit from a range of technologies such as WiFi, Bluetooth, IrDA, or more recently Radio Frequency Identification (RFID), although each will vary the detection proximity distance between the phone and the object and thereby alter the dynamics of the game play. In general our experience is that players prefer to be in close proximity to the object they wish to interact with. This would thus bias towards Bluetooth, IrDA, and RFID, although all but RFID would require the objects to be augmented with power supplies and some form of processor. As for visual methods, the principal option, apart from some form of complex object recognition software, is the various forms of two-dimensional barcodes, such as QR codes [5]. All of the two-dimensional barcode systems use a phone with an on-board camera to either decode the code on the phone or through interaction with an online database. Despite the fact that 2-D barcodes would also negate the need for power supplies and processors in the objects, we preferred RFID and its associated technology Near Field Communications (NFC) because it provides [10]:

• faster read times, as the tags can be accessed at rates in excess of 100 kbit/s, whereas two-dimensional barcodes require image capture and processing, which we found typically takes a few seconds;
• RFID tags can be written to as well as read from;
• a simpler reading method, as the phone and the tag merely have to ‘touch’ or be placed in close proximity (less than 3 cm), whereas the barcodes require the user to take a picture (in fact, the phone and RFID tags used in this project provide round target icons to make positioning intuitive);
• more robustness, as errors are more likely to occur when scanning a barcode due to irregular camera orientation.



Although prior research has been conducted on the use of mobile phones and RFID tags [11], this research utilized a separate RFID reader connected to the phone via Bluetooth. This is therefore a more complex experience for the user, and the researchers therefore concentrated on the experiences of associating icons with tag functionality. In the research presented in this paper we present results on the user experience of the simple touch-based reality interface when using mobile phones equipped with in-built RFID/NFC reader/writers in three different mobile mixed reality game scenarios. Before describing these three games, and the associated user experience of those games, we shall start by providing an overview of RFID and NFC and the incorporation of this technology into mobile phones.

2. RFID/NFC
In simple terms RFID is a wireless extension of bar-code technology allowing identification without line-of-sight. Although it has become most well known through the recent wide-scale adoption of the technology by the likes of Wal-Mart and Tesco as a replacement for barcodes [13], it is also seeing more innovative uses. For example, Mattel introduced the HyperScan RFID-enabled game console, which combines traditional one-on-one fighting games with collectible superhero cards. Players swipe the card over the console to choose characters, and additional cards can be swiped during game play to add more powers or skills. Another area that has greatly benefited from RFID is micro payments, which has seen huge take-up in places such as Japan and Korea and has driven the convergence of this technology with the pervasive mobile phone. In the following sections we shall discuss the technology behind RFID/NFC and in particular its incorporation within mobile phones.

2.1 Technology
RFID tags, a simple microchip and antenna, interact with radio waves from a receiver to transfer the information held on the microchip. These RFID tags are classified as either active or passive, with active RFID tags having their own transmitter and associated power supply. Passive RFID tags do not have their own transmitter; they reflect energy from the radio wave sent from the reader. Active RFID tags can be read from a range of 20 to 100 m, whilst passive RFID tags range from a few centimetres to around 5 m (dependent on operating frequency range).

NFC is an interface and protocol built on top of RFID and is targeted in particular at consumer electronic devices, providing them with a secure means of communicating without having to exert any intellectual effort in configuring the network [3]. To connect two devices together, one simply brings them very close together, within a few centimetres, or makes them touch. The NFC protocol then automatically configures them for communication in a peer-to-peer network. Once the configuration data has been exchanged using NFC, the devices can be set up to continue communication over a longer-range technology. The other advantage of NFC comes in terms of power saving, which it achieves by having an Active Mode (AM) and a Passive Mode (PM) for communication. In AM both devices generate an RF field over which they can transmit the data. In PM, only one device generates the RF field; the other device uses load modulation to transfer the data. The data rates available are relatively low, 106, 212, or 424 kbit/s, although for the applications envisaged this should be more than sufficient [3].

2.2 Combining Phones with RFID/NFC
Nokia was the first to combine mobile phones with RFID/NFC when it introduced clip-on RFID and NFC shells (Nokia Xpress-on Mobile RFID/NFC Kits) for the 5140 and 5140i Series 40 phones respectively. The RFID/NFC shells can be accessed via J2ME applications running on the phone to trigger defined actions within the application. The operating frequency for mobile phones is generally 13.56 MHz, which limits the range to approximately 3 cm, or touch, as shown in Figure 1.

Figure 1. Phone/RFID Touch Interface

These phones were but the first of a growing trend and, in the Printed Electronics Review on the 13th of May 2006, the Japanese telecommunications giant NTT DoCoMo was reported to have shipped more than 5 million RFID-enabled mobile phones for use instead of printed tickets in the National Rail Network. Further, Sony has started to ship all of its laptops with RFID/NFC technology so that users can download straight to RFID cards or RFID mobile phones.

3. MOBILE MIXED REALITY GAMES
In terms of mixed reality games there are few which have incorporated physical objects as an integral part of the game and, of those that do, only ConQwest is mobile phone based. ConQwest was a team-based treasure hunt game developed by area/code¹, sponsored by Qwest Wireless in the USA to promote its camera phones, and uses implied positioning from semacode tags. The game area must be ‘set up’, prior to the game being played, by placing semacode stickers at defined locations around


¹ www.playareacode.com

the selected urban landscape. Each sticker is given a relative value and players collect them by taking pictures using their phone cameras; the first team to collect $5000 worth² of semacodes wins. As semacodes are a version of 2-D barcodes, they suffer the same limitations relative to RFID as discussed previously.

animated icon, whilst the Ghosts see both a white square highlighting their animated icon and a red flashing square around PAC-LAN. These character highlights were added after pre-trials revealed players wanted a quicker way of identifying the most important information. The Ghosts can ‘kill’ the PAC-LAN character by detecting him/her via an RFID tag fitted on their costume (as shown in Figure 2), assuming their kill timer has not run out.

In the following paragraphs we shall describe the operation of the three mobile/RFID mixed reality games used to collect the user experiences discussed in the subsequent section.

3.1 PAC-LAN
PAC-LAN is a version of the video game PAC-MAN in which human players play the game on a maze based around the Alexandra Park accommodation complex at Lancaster University [10]. One player takes the role of the main PAC-LAN³ character and collects game pills (using a Nokia 5140 mobile phone equipped with a Nokia Xpress-on™ RFID reader shell, as shown in Figure 1), which are in the form of yellow plastic discs fitted with stick-on RFID tags placed around the maze as shown in Figure 2. The discs are a direct physical manifestation of the virtual game pills on the mobile screen and are placed at the real locations corresponding to the virtual maze.

Figure 3. PAC-LAN kill tags

Once PAC-LAN is killed the game is over, and the points for the game are calculated from the game pills collected and the time taken to do so. When PAC-LAN eats one of the red power pills, indicated by all Ghost icons turning white on the screen, he/she is then able to kill the Ghosts, and thus gain extra points, using the same RFID detection process. ‘Dead’ Ghosts must return to the central point of the game maze where they can be reactivated into the game. Figure 4 shows a number of typical screens the PAC-LAN character will experience throughout the game.

Figure 2. PAC-LAN Trials

Four other players take the role of the ‘Ghosts’, who attempt to hunt down the PAC-LAN player. The game uses a Java 2 Platform Micro Edition (J2ME) application running on the mobile phone, connected to a central server using a General Packet Radio Service (GPRS) connection. The server relays to the PAC-LAN character his/her current position along with the positions of all Ghosts, based on the pills collected. The game pills are used by the Ghosts, not to gain points, but to obtain the PAC-LAN character’s last known position and to reset their kill timer, which must be enabled to allow them to kill PAC-LAN. In this way the Ghosts must regularly interact with the server, which is then able to relay their position to PAC-LAN. PAC-LAN sees a display with his own position highlighted by a red square around his

² A virtual rather than actual value.

³ www.pac-lan.com

Figure 4. PAC-LAN Phone UI


Writers first register on the project website, www.mobspray.com, with a unique tag name of up to five characters and then upload their own custom mobtag to the database operating on this central server. Writers are then able to use their own mobtags, or view other writers’ mobtags, by accessing this database over a GPRS connection initiated by the J2ME application on the phone when an RFID tag is read. Once a writer reads an RFID tag, the client application, shown in Figure 6, displays the contents of the RFID tag, which consist of a tag location string and the names of the last writers to have visited that particular RFID tag [4].

The scoring in the game is simple: the PAC-LAN character gets 50 points for a normal pill, 150 points for a power pill, 1000 points for collecting all the pills, and 500 points for a Ghost kill. The Ghosts get 30 points per pill (this is linked to the length of the kill timer) and 1000 points for killing PAC-LAN. All players lose 1 point per second to ensure they keep tagging.
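Collected into one place, the scoring rules look roughly like the sketch below (an illustration in Python; the identifiers are ours, not the game’s code).

PACLAN_POINTS = {"pill": 50, "power_pill": 150, "all_pills_bonus": 1000, "ghost_kill": 500}
GHOST_POINTS = {"pill": 30, "paclan_kill": 1000}
TIME_PENALTY_PER_SECOND = 1          # every player loses 1 point per second

def score(events, table, elapsed_seconds):
    """Sum the points for a list of scoring events, minus the time penalty."""
    return sum(table[e] for e in events) - TIME_PENALTY_PER_SECOND * elapsed_seconds

# e.g. a PAC-LAN player who tagged three pills and killed a Ghost in 120 seconds:
print(score(["pill", "pill", "pill", "ghost_kill"], PACLAN_POINTS, 120))   # -> 530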

3.2 Mobspray
The virtual graffiti game described here, known as ‘Mobile SprayCan’ or ‘Mobspray’⁴, takes its inspiration from the existing practice amongst traditional graffiti writers⁵ in their use of stamps⁶, but in this case using stick-on 13.56 MHz RFID tags [4]. Unlike traditional stamps, the RFID tags are not just a repository for the actual writer’s tag but represent a physical location where digital writers can ‘get-up’⁷ their mobtag⁸ using the application on their mobile phone. For this project we produced postcards with the Mobile SprayCan logo and RFID tags attached, as shown in Figure 5, to make tag sites easily identifiable for our test crew. Although some innovative research into creating RFID icons related to the function associated with a tag is being conducted [1], we felt for this particular project that the postcard provided greater opportunity for branding with the corresponding online game site, although we may explore the use of game-based icons in later research.

The application then connects to the database which returns the time and date at which those writers tagged that location. The writer can then choose to view any of the writers’ mobtag images or place their own mobtag at that location and these details are stored within the database.

Figure 6. MobSpray UI

If the user chooses to write his tag, the application creates a new list of the last 5 writers by dropping the last writer from the previous list and shifting the remaining by one position. This list is then written on the RFID tag [4].
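The list update described here amounts to a fixed-length queue of writer names; the following sketch (ours, in Python rather than the project’s J2ME code, with invented example tag names) shows the behaviour.

def update_writer_list(last_writers, new_writer, max_entries=5):
    """Push the newest writer onto the front and drop the oldest beyond five."""
    return ([new_writer] + last_writers)[:max_entries]

tag_writers = ["kaos", "z1p", "moby", "edge", "rafa"]      # invented tag names of up to five characters
print(update_writer_list(tag_writers, "neo"))
# -> ['neo', 'kaos', 'z1p', 'moby', 'edge']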

Figure 5. MobSpray Site Marker

⁴ www.mobspray.com
⁵ Graffiti practitioners refer to themselves as writers rather than artists, as the most basic form of modern graffiti art, the ‘tag’, represents the writer’s signature. It is normally in the form of a moniker or nickname of four to six letters in length.
⁶ A later metamorphosis of the tag was the stamp, which is a pre-tagged sticker; in the US the stickers used are often overnight postal delivery stickers.
⁷ Get-up or getting-up is the term used by writers for the physical act of writing their tag at a particular location.
⁸ To avoid confusion between RFID tags and graffiti tags we have coined the term ‘mobtag’ to describe the images created in this project.

It would be possible to achieve the same functionality without storing any data on the RFID tags, with writers’ interactions and the tag location stored server-side. However, the design outlined was chosen in order to demonstrate the potential for future systems where tag storage is not as severely constrained, and also because of graffiti’s inherent nature of physical interaction with a particular location [4]. We have tried to create a greater gaming element through a website that links the tags to a map of the local area, marking all Mobspray sites together with a history of the mobtags submitted by different writers, as shown in Figure 7. This produced a number of gaming behaviours amongst the writers.


In terms of operation Mobile Treasure Hunt (MobHunt) [8] utilizes the same combination of mobile application connecting to a server over GPRS as for PAC-LAN and Mobspray. However, it differs from the previous two RFID systems in that it was designed for flexibility so that new treasure hunts, in new or existing locations, can be created quickly and easily utilizing a simple mobile application and online website.

Firstly, as we had linked the website to the mobtags a competitive aspect of tagging did emerge as writers tried to get their tag on as many locations as possible, a practice known as ‘bombing’ amongst conventional writers. There is an obvious ludic parallel of trying to win the game. Secondly, the mobtags started out as very simple one color signatures in the majority of cases but a competitive element emerged where the mobtags became much more elaborate in a similar vein to traditional spraycan tags as discussed in section two. This is also akin to the ludic desire of identifying yourself as an experienced player rather than a novice.

There are two mobile clients for MobHunt: one is the basic user client and the other is an administrator client, which allows new MobHunts to be created dynamically in a real-world scenario. The MobHunt user client is a very simple application allowing the user to log in before instigating their authentication with the web server. Initially we implemented a standard name and password login solution, but decided to simplify the process for the user by providing a unique membership card, as shown in Figure 8, that could be kept in a wallet or purse and that the user simply touches to their phone to initiate login. This proved to be a much more elegant solution and considerably easier for the user given the restrictions of the mobile phone keypad as an interface [2].

Finally, the storage of a list of the last writers to tag a particular location gave a sense of community and belonging to a space which appears to be an important attribute for players looking for a more socially orientated game play [4].

Figure 7. MobSpray Website

3.3 MobHunt
The inspiration for treasure hunt games arguably started with the early stages of the development of archaeology, as this often included a significant aspect of treasure hunting and certainly served to capture the public imagination. Treasure Hunt games⁹ involve either a single player or a group of players trying to find hidden articles, locations or places by using a series of clues. In the earliest versions of these games the clues consisted of pieces of paper secreted at specific locations. A modern variant of this is Geocaching, an outdoor treasure-hunting game in which the participants use a Global Positioning System (GPS) receiver or other navigational techniques to hide and seek containers (called "geocaches" or "caches") anywhere in the world. A typical cache is a small waterproof container containing a logbook and "treasure", usually toys or trinkets of little monetary value. Also common are objects that are moved from cache to cache, such as Travel Bugs or Geocoins, whose travels may be logged and followed online. In our version of this phenomenon, clues are placed on RFID tags which can be read by a suitably equipped mobile phone, and information can be changed or added in real-time using the data network.


Figure 8. MobHunt Login/Membership Card and Site Marker

Once the user is logged into the game they simply tag the first RFID tag of the treasure hunt to start the game. A visual place marker was used to make locations or objects used within a game easily visible to players, as shown in Figure 8. The application also allows the users to browse through the clues as they are collected; this is because initial trials showed users often sought to reconfirm their earlier decisions if they got ‘stuck’ at some point within the game. The mobile administrator client allows the creation of MobHunts in the field by scanning a tag and entering a description. This information can subsequently be altered or amended on the web, but we felt it would be beneficial to be able to place some tags whilst surveying a site for likely objects or places to include in a game. Figures 9 and 10 provide some example screen shots for the administrator and user mobile clients respectively.

⁹ Scavenger hunt games are a similar concept, although the idea is to collect a set of pre-defined objects within a certain time. A successful mobile version using SMS alerts and email to transfer photos to a website is available at www.ispot.com.


4. USER EXPERIENCE
4.1 User Groups
Before we discuss the results of the usability and user experience studies it is important to consider the nature of the users from whom we gained the feedback for each of the three games. The data for PAC-LAN came from eight games played by forty-five different players (five per game) and was taken after their first experience of the game. The players were students from Lancaster University who answered an advert distributed via the University’s email system. The player groups were selected by the research team, and care was taken to select groups from the various faculties across the campus to ensure the sample was not biased towards technophiles. Of these forty-five players, seven were female, and the players were aged between 18 and 24, except for six males aged between 25 and 35. Mobspray was played by six staff researchers from Lancaster University and consisted of five males and one female with an average age of 29. All but one of the players had no experience of RFID-enabled phones prior to the game, although they all considered themselves highly technology literate and as such would likely represent the so-called ‘early adopter’¹⁰ user group. A version of MobHunt was created as a tour around the Infolab21 building at Lancaster University and, unlike the previous two games, was designed to be played indoors rather than outside. The game itself was very simple and primarily provides a tour around the building and its facilities, as shown in Figure 11.

Figure 9. MobHunt Administrator Application

Figure 11. Infolab21 MobHunt

The game was played by 12 people within Infolab21 and was made up of both University staff and employees from companies within the Infolab21 incubator unit. The ages ranged between 22

Figure 10. MobHunt User Application


¹⁰ A person who embraces new technology before most of their peers. They tend to be the first to buy or try out new hardware items and programs or newer versions of existing programs. Defined by Everett Rogers in Diffusion of Innovations, The Free Press: New York, 1995.

Overall, the overwhelming user experience across all of the games was that the simple touch interface was very easy to use; as one player put it:

and 44 and there were eight males and four women none of whom had any prior exposure to RFID phones.

4.2 Usability

“There is something intuitive about simply touching the object you want to connect with”

Whereas our previous papers on these projects have presented a largely technical and qualitative analysis [4, 8, 10], in this paper we concentrate on aspects of user behaviour common to all three games; the following information has been gathered from both questionnaires and ethnographic studies.

Further, the vast majority of the players expressed a willingness to use an RFID-enabled mobile phone to access other services, which is encouraging for the proponents of this technology.

In terms of learning to use the phone with the tags, this proved to be remarkably easy, and a very quick demonstration and explanation was all that players required. Interestingly, unlike previous research [11] in which users expressed concern about the social acceptability of touching tags in public places, even on the University campus where the trial took place, none of the users in our trials expressed this worry, even though the trials were all conducted in environments open to the general public. When we discussed this further with some of the users, they felt that in many ways creating questions in the minds of non-players, who were not aware that users were playing a game, added to the sense of fun.

5. CONCLUSIONS
Mobile mixed reality games offer unique experiences for people to interact in new ways with their environment. The use of mobile phones equipped with RFID/NFC allows these experiences to be extended down to the object level and is helping to create Mark Weiser’s seminal vision of future technological ubiquity, one in which the increasing “availability” of processing power would be accompanied by its decreasing “visibility”: the so-called internet of things. Further, the traditional phone interface is generally too cumbersome for anything beyond dialling numbers, and the simple act of touching an object to gain access to information and services is both simple and intuitive for the majority of users.

The users found the objects very useful compared with just placing an RFID tag at a location as they found it much easier to see and felt it added to the immersion within the game play. Another aspect of the objects was that for PAC-LAN, which was played at a much faster pace than the other two games, the players felt that the game disks were an important element of the game experience and minimized the time they had to spend checking their position on the mobile phone screen. Having played many location based games that rely on purely virtual objects we observed that players often become completely focused on the screen to guide them and often become oblivious to their environment which both defeats the premise of mixed reality gaming and can also be very dangerous. Our observations of the games presented in this research showed that players were much more immersed in their environment because of the real objects and were less inclined to rely solely on what the phone screen indicated.

6. ACKNOWLEDGMENTS The authors wish to acknowledge the support of Nokia for the provision of phones, RFID/NFC SDK, and access to the LI server to the Mobile Radicals Research Group at Lancaster University for the development of the games featured in this paper.

7. REFERENCES
[1] Arnall, T., A graphic language for touch-based interactions, Mobile Interaction with the Real World Workshop, Mobile HCI, Espoo, Finland, September 12, 2006.
[2] Coulton, P., Rashid, O., Edwards, R. and Thompson, R., “Creating Entertainment Applications for Cellular Phones”, ACM Computers in Entertainment, Vol 3, Issue 3, July 2005.

One of the other aspects we experimented with was related to giving the user feedback after they have successfully read from or written to a tag. For PAC-LAN we initially created versions that had either visual feedback, through a pop-up note, or audio feedback, by playing a short tune. The audio feedback was unanimously preferred, as players were often running at speed and the audio feedback was perceived as much less intrusive on the game and harder to miss. Whilst the small screen available on the phone used (128 by 128 pixels) no doubt made notes more difficult to read in PAC-LAN, the fact that tagging a game pill produced a visual change in the pill on the maze (the pill changes from yellow to grey) meant there was little benefit from a textual note. In the cases of Mobspray and MobHunt we provided an audio alert to indicate the tag was read, but then displayed either the list of writers, in the case of Mobspray, or the clue, in the case of MobHunt. As Mobspray also provided for a write to the tag, we found a visual note was more beneficial in this case, as the time to write to a tag is longer than to read, and users liked the fact that the application said it was writing until the process had completed.

[3] ECMA, Near Field Communication: White Paper, Ecma/TC32-TG19/2004/1, 2004.
[4] Garner, P., Rashid, O., Coulton, P., and Edwards, R., “The Mobile Phone as a Digital SprayCan”, Proceedings of ACM SIGCHI International Conference on Advances in Computer Entertainment Technology, Hollywood, USA, 14-16 June 2006.
[5] International Organization for Standardization, “Information Technology – Automatic Identification and Data Capture Techniques – Bar Code Symbology – QR Code”, ISO/IEC 18004, 2000.
[6] Jacob, R.J.K., “What is the Next Generation of Human-Computer Interaction?”, Proceedings of ACM CHI 2006 Human Factors in Computing Systems Conference, pp. 1707-1710.
[7] Magerkurth, C., Cheok, A.D., Mandryk, R.L., and Nilsen, T., Pervasive games: bringing computer entertainment back to the real world, ACM Computers in Entertainment, Vol 3, Issue 3, July 2005.


[8] Nettleton, O., Location Based RFID Treasure Hunt Game, MSci Dissertation of Degree of IT and Media Communications, Lancaster University, 2006.

[11] Riekki, J., Salminen, T., Alakärppä, I., Requesting Pervasive Services by Touching RFID Tags, IEEE Pervasive Computing, vol. 5, no. 1, pp. 40-46, Jan-Mar, 2006.

[9] Rashid, O., Mullins, I., Coulton, P., and Edwards, R., Extending Cyberspace: Location Based Games Using Cellular Phones, ACM Computers in Entertainment, Vol 4, Issue 1, 2006.

[12] Ullmer, B., and Ishii, H., Emerging Frameworks for Tangible User Interfaces, in Human-Computer Interaction in the New Millennium, John M. Carroll (ed.), Addison-Wesley, August 2001, pp. 579-601.

[10] Rashid, O., Bamford, W., Coulton, P., Edwards, R., and Scheibel, J., PAC-LAN: Mixed reality gaming with RFID enabled mobile phones, to appear in ACM Computers in Entertainment, October 2006.

[13] Want, R., An introduction to RFID technology, IEEE Pervasive Computing, Vol 5, Issue 1, pp. 25-33, Jan.-March 2006.



Infinitely Adaptable Gaming: Harnessing the Power of Distributed Network Environments and Component Reuse

M. Merabti, P. Fergus, D. Llewellyn-Jones, A. El Rhalibi
Networked Appliances Laboratory, School of Computing and Mathematical Sciences
Liverpool John Moores University, Byrom Street, Liverpool L3 3AF, UK.
Email: {M.Merabti, P.Fergus, D.Llewellyn-Jones, A.Elrhalibi}@ljmu.ac.uk

Abstract: Recent years have seen a growth in the popularity of massively multiplayer online games. At the same time, many games have become more modular and open to adaptation by general gamers, allowing non-experts to fashion their own game environments. However, most games allow adaptation and extension only within well-defined limits, and the fast progress of gaming capabilities renders game engines obsolete within just a few years of their release. To overcome such difficulties we propose a gaming architecture that exploits the capabilities of arbitrarily interoperable networked components, providing mechanisms to harness the components offered by different games. This allows emergent behaviour to surface, whereby games evolve to include graphical objects and strategic behaviours they were not initially designed to have. Combining different gaming components is challenging. Although much work has been done on ‘modding’ there is a need to automate this process to improve and adapt game play with little or no human intervention. This will allow for better exploitation of dispersed gaming functionality and provide obvious benefits to game players and developers alike. It also presents exciting opportunities to allow the projection of real-world physical objects into the gaming environment. Our approach provides a new perspective that we demonstrate using a working prototype to implement a game that allows renderers and game objects provided by other games to be discovered and used to adapt game play.

This somewhat changes the dynamics of conventional gaming. The idea of merging the virtual with the physical via global networks provides new platforms for innovation. However, even the relatively new technologies that allow for massively multiplayer environments rely on traditional centralised client-server techniques with only limited practical extensibility. Researchers suggest that new platforms will exploit networks and massively multiplayer online games to include self-management capabilities where games self-adapt through the acquisition of new resources. Games will initially be developed with base functionality. However, over time, they will evolve to include additional components provided by other games they were not initially designed to support. In this way games will evolve and change far beyond their initial inception. They will not simply follow pre-defined game play rules or inbuilt behaviours, but extend their capabilities to include gaming rules and components provided by other games. This can be distinguished from ‘modding’ where developers utilise gaming engines, software development kits, and level editors to adapt games within relatively fixed parameters [2-4]. Rather, new platforms must provide autonomic mechanisms that allow components to be automatically discovered, integrated and assessed based on game play situations. Increasing the granularity of interchangeable game components, we can allow gaming applications to evolve and adapt without limits, reacting to increased user familiarity, promoting innovation and counteracting the otherwise inevitable obsolescence of game engines as technology advances.

Keywords: Networking, Games, Peer-to-Peer, Service-oriented Architectures, MMOG, Semantics, Self-Management, Evolution, AI

I. INTRODUCTION

Gaming is at the forefront of next generation entertainment systems and constantly aims to push technological boundaries to their limits. Over the years there has been a marked evolution in how games are developed and played. Rather than single-machine installations, such as Knights of the Old Republic, games have been enriched with networking capabilities, giving rise to massively multi-player online games such as World of Warcraft [1]. New approaches to game entertainment, such as Second Life, are utilising networking features to provide a different user experience. Although this is not conventional game play in the traditional sense, the technology has given rise to a new form of entertainment where the principles of gaming are superimposed onto the behaviour models typically associated with the physical world. For example, Bradley University is the first university to teach an entire course within the Second Life virtual world. These new virtual environments are fast becoming a key part of modern day life, where people escape from the physical world they inhabit.

Technologically, the prospect of interoperating potentially heterogeneous gaming components presents significant challenges and there is a need to automate this process to improve and adapt game play with little or no human interaction. This will allow for better exploitation of games and the components they provide, increasing the ease with which games are able to share components, adapt virtual environments, advance capabilities such as rendering engines and improve strategic functionality. But this raises some questions. Most notably, how can game components be dispersed within networked environments so that they can be shared between other heterogeneous games? In the short term centralised approaches such as Xbox Live can be used that carefully control how components are shared and used. However, the long-term challenge is to develop a suitable platform that will allow games to automatically integrate and manage components in global networks in a decentralised way and without any third-party intervention.

We propose a platform whereby games can disperse their resources within and across networked environments. The platform has a number of advantages over existing gaming solutions. First, it allows the components a game provides to be dispersed as ad hoc peer-to-peer services. Second, it allows games to discover and utilise shared components using ontologies. Third, it allows games to self-adapt to user interaction and to the influx of gaming components, evolving virtual gaming worlds, including game play and strategic behaviours. As an additional benefit, such a platform allows the natural integration of physical networked appliances into gaming environments through projection, blurring the boundaries between physical and virtual environments. We take the opposite approach to that proposed by projects such as ARQuake [5], which aim to project gaming aspects onto the physical environment using augmented reality. Instead, we project physical objects into the gaming world, thereby releasing real devices from their physical constraints whilst allowing the game to benefit from real-world functionality.

The remainder of this paper is structured as follows. In the next section we discuss current research initiatives relating to massively multiplayer online games, modding, and distributed gaming. Section III introduces the concepts behind and details of our proposed framework. This is followed in Section IV by a description of the implementation and prototype for this framework. Finally, we conclude and discuss our future work.

II. BACKGROUND AND RELATED WORK

Massively multiplayer online games already attract huge numbers of players and are expected to become increasingly popular, forming the basis for next-generation gaming. Utilising Internet communications, games have blurred virtual and physical worlds and converged with social networks [6]. This has changed how users view and play games. Many games, such as Planetside [7], Star Wars Galaxies [8], The Sims Online [9] and EVE Online [10], are dependent on network communications. None more so than World of Warcraft, which became the fastest-selling PC game in North America in 2004-2005 and in 2006 was reported to have 6 million subscribers worldwide [1]. Although multiplayer gaming clearly provides significant benefits over single-player games through the use of networking, its client-server architecture enforces a number of limitations. Most notably, game play and enhancements must be carefully controlled through centralised gaming servers. This results in bottlenecks, central points of failure, and the inability to react appropriately to real-time changes in large virtual worlds. Gamers are tied to games through proprietary software and hardware installations. User interactions do not affect strategic developments, and games do not support self-management capabilities to extend functionality beyond those they have been pre-programmed with. This has led to shifts within the gaming industry, where increasing access to game engines, software development kits and level editors has allowed games to be changed more easily. This phenomenon – known as modding – marginally alleviates some of the limitations discussed above [2-4]. Although modding provides a means of adapting and evolving games, it is restricted to more technically savvy users, such as software developers, rather than people who simply play games. Furthermore, mods are tied to specific games. For example, a mod developed for the Unreal engine will be incompatible with the Quake engine. Some researchers suggest that distributed technologies in conjunction with middleware may relieve many of these difficulties; however, it is generally agreed that more research is required to establish a suitable architecture [11].

Modding is an activity that runs alongside mainstream games development, with developers providing modding tools as a way to attract customers. In essence, modding is seen as a business strategy. Although not explicitly stated, incentives to mod games are used as a means of generating free development for publishers, for example through the use of modding competitions that act as a means of screening game enhancements in order to include them in future releases. In most cases this is an unpaid source of labour, and gaming organisations carefully control how it is executed [2]. Through competitions and gaming subscriptions for massively multiplayer online games, the industry has a healthy flow of mod software. In support of this, several game companies adopt modding as a key strategy, where only a base solution is initially provided; any enhancement to the game thereafter is dependent on user modifications. One example of this is BioWare's Neverwinter Nights, which is heavily reliant on gamer-created content [12]. Successful mods have also been incorporated into subsequent releases; one example is Counter-Strike, a team-play modification of Valve Software's Half-Life [12]. In this way, modding can be seen as an important and welcome source of innovation where commercial risks are not afforded by the gaming industry, which typically relies on the goodwill of the modders [3].

Whilst modding has been discussed from a game coding perspective, mods may also exist as part of and within the game itself. Communities such as Second Life [13] are heavily reliant on users shaping the virtual environment, extending the concept of MUDs into realistically rendered virtual worlds [14]. Graphical objects of any description can be developed and added to the virtual world, and can then be shared or sold between avatars within that world. Modifications to land can be made and buildings can be constructed. This differs somewhat from conventional modding in that all modifications take place within the virtual world. However, there is no mechanism to allow the objects created in Second Life to be shared and distributed amongst different online games; a better approach would be to expose these modifications so that they could be utilised universally.

Adopting the principles laid out in our earlier work [15] should allow not only principles of modding to be incorporated into a distributed gaming platform, but also the provision of mechanisms to automate this process. Consequently the adaptation of games both by users and through automated ‘evolution’ will become possible, allowing functionality not initially deployed when the game was first released to be easily incorporated. Even fundamental components such as the physics or rendering engines could be swapped out for alternatives without affecting other portions of the game.

An interesting strategy within the gaming industry that we are able to build upon is the decoupling of game-specific components from the game engine. This has resulted in new plug-and-play platforms on which any game functionality can be built. This resembles strategies used within other entertainment sectors such as home networking. More specifically, we find analogies with the strand of research within the Networked Appliances community aimed at allowing the interoperation of networked devices so that their operational capacity can be dispersed within and across different networked environments [15]. We can therefore exploit a useful similarity between such devices and game components: by applying techniques from networked appliance technologies, scaled up to disperse the operational capacity that games support within and across different networks, components can be discovered and composed with components used in other games.

In the following sections we present our proposed framework and discuss how the use of a flexible middleware solution can address many of the issues discussed above. We propose a framework that is novel in terms of both architecture and functionality, and which shows how new perspectives can be applied to game interoperability, game component utilisation, ontology usage, and game self-adaptation.

III. GAME COMPONENT INTEGRATION FRAMEWORK

Whilst it is important to bear in mind the overall structure that a game might take, it has been a goal of our work to deconstruct as far as possible the holistic notion of a game into a set of autonomous, generalised and reusable components. Whilst the development process of our framework necessarily entailed the compartmentalisation of various aspects of a traditional game, the final result should be considered from the opposite perspective. Ultimately we aim to allow gaming to exist as an ad hoc interaction between various networked components, the entirety of which forms the gaming environment. None of these components in isolation can be considered to be the game itself. Perhaps the closest to what might be considered the heart of the game are the rendering or physics engines. However, these only provide one of any number of interpretations of the interactions that occur between components.

Our aim is therefore to extend the notion of networked gaming beyond the classical expectations of networked and massively multiplayer games, in order to provide a fully distributed and extensible gaming framework. This represents a continuation of the trend towards increasing entropy in games, as depicted in Figure 1.

Fig 1 Evolution of networked gaming

78

In creating our game framework, a particular difficulty was deciding the granularity of the predefined specification used by the system. Too fine a granularity and there would remain no room for interpretation: we would simply end up with a game into which various components could be plugged. Whilst a useful application, this would fail to capture the full benefits that we hope to achieve. Too coarse a granularity, however, and the result would be a morass of incompatible components. We believe the key to striking a successful balance lies in providing robustness rather than precise specifications. If interpretation is robust against what a specific implementation might class as malformed data, scope for safely bending and extending the protocols is built in. In our design this is achieved through the combined use of XML as a data transmission format and ontologies for service matching.

XML data formats are traditionally robust, since they are intended to be both human and machine readable, and extensible through the creation of new tags and attributes that legacy systems can safely ignore, allowing files to degrade gracefully. This robustness can be seen as an inherited characteristic deriving from their pedigree as an evolution of HTML web technologies. Ontologies can be seen as a directed attempt to increase the robustness of data interpretation, by allowing computers to relate and equate otherwise disparate terminology in a way humans find straightforward but which computers have traditionally struggled with. At present our architecture uses ontologies only for attribute-pair matching, but in the future we hope to extend this to incorporate the interpretation of the XML data streams themselves. Regrettably, space restrictions prevent us from providing a full explanation of our use of ontologies in this paper, and we refer the interested reader to our previous work in this area [15, 16].
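As an illustration of this design principle, a game object might describe itself with a small XML document along the following lines. The element and attribute names here are purely illustrative and are not taken from the prototype; the point is that an interpreter can act on the attributes it understands and safely ignore the rest.

<gameObject name="RedBox" type="Box">
    <appearance colour="Red" width="2.0" height="2.0" depth="2.0" />
    <physics mass="1.5" solid="true" />
    <!-- An extension element that a legacy interpreter can safely ignore -->
    <audio sound="thud" volume="0.8" />
</gameObject>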

The general structure of a game using our framework is a loose integration of various components. Game components can be classified as either objects or interpreters, dispersed within the network as independent services that can be discovered and used by other game components to create, extend, change and adapt gaming worlds that have emerged through component interactions. Note that a component, exposed as a service, could be a piece of executable software that gives rise to some behaviour or produces an object, or it could be an abstraction that allows physical objects residing within the physical world to form explicit relationships with avatars in the virtual world. To elaborate, components may manifest objects that are represented in the real world, with certain aspects of these objects capable of existence within the virtual world. For example, for a phone that provides Networked Appliance capabilities, much of the functionality provided by the phone may be available within a virtual environment, such as the ability to view the user interface or to make and take audio and video calls. Some aspects, such as its corporeal presence, or the physical constraints that might ordinarily be associated with a real-world object, might not exist within the virtual world. In this way, we may only perceive a partial slice or facet of the phone within the gaming environment.

The virtual facets are projected into the virtual world by way of the compatible Networked Appliance infrastructure. However, any such projected facet will need to be interpreted in order for it to be perceived within the gaming environment. We initially propose two interpretive engines:

1. The rendering engine.

2. The physics engine.

It is possible that other alternative or additional interpretive engines may be created in the future, such as an audio engine. The interpretive engines model our senses. We can understand this better by considering an analogy with our sight. Sight is made possible through the reflection or emission of light from objects. This light hits the retina of the eye, after which it is interpreted by our brain. It is well known that the image projected onto the rods and cones at the back of the eye is inverted. Consequently we can immediately see that an interpretive function is required of the brain in order for us to maintain our visual understanding of the world.

Similarly, the rendering interpretive engine takes data provided by the networked object to be sensory data that it uses in building up an image of the scene. Just as the brain provides an interpretation of the light waves that hit the retina, so the rendering engine provides an interpretation of the network data that it receives. It may choose to interpret and render this in any way; however, in general we envisage that a relatively consistent set of rules would be followed, with objects describing themselves as 'Green' rendered with a green colour, and so on. Whilst some interpretive engines – such as the rendering engine – remain passive, others – such as the physics engine – would be active. A passive engine purely interprets the data, requiring only a unidirectional pipe from the networked object to the renderer. An active engine, on the other hand, would respond to the data it receives, sending information back to a peer object through a bidirectional pipe. The physics engine may be an example of an active interpreter.
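The distinction between passive and active interpreters can be made concrete with two small interfaces, sketched below. The type and method names are illustrative assumptions and do not appear in the prototype; a passive interpreter only consumes object data, whereas an active interpreter also returns feedback over the bidirectional pipe.

// Illustrative only: a passive interpreter consumes object data one way
// (networked object -> interpreter), while an active interpreter also sends
// feedback back to the peered object over a bidirectional pipe.
interface PassiveInterpreter {
    void interpret(String objectDescription);
}

interface ActiveInterpreter {
    String interpret(String objectDescription); // the returned value is sent back to the object
}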

There is no reason why an object shouldn't be interpreted by multiple engines simultaneously, although in the case of an active interpreter the results might in some cases be unpredictable. Current games do not afford such flexibility, and we generally find that all of the components used to create the game world are locked within a single application. Typically the renderer and physics engine – including all of the components it places within the world, such as Player Characters, Non-Player Characters (NPCs) and every other object – are embedded and controlled within the same application. In Figure 2 we see this scenario, with the renderer and physics engine forming a tight coupling.

Fig 2 Current Component Integration

The game component integration framework we propose provides a middleware that operates between the games that emerge and the game components themselves. All communication between emergent games and gaming components passes through the framework. When a game component is connected to the network it does three things. First, it publishes its functionality as independent services. Second, it forms relationships with other gaming components within the network. Third, it self-adapts to composite and environmental changes.

Our approach, based on service-oriented computing, allows each component to act independently and with loose couplings to other components through peer-to-peer links. In this sense no single component constitutes the game; rather, the interactions between components result in the emergence of a virtual world and game environment. Our approach allows each component to implement a small footprint of code, enabling services to be dispersed within the network. Using the framework, game components can connect to the network using any underlying communication protocol; discover, publish and use any deployed game component either locally (on the same platform on which the component resides) or remotely (on remote platforms); form communication links with other gaming components within the network; and self-manage game component compositions based on component and environmental changes. Our basic alternative approach is depicted in Figure 3.

Fig 3 Proposed Component Integration

Gaming components connected to the network rely on four key characteristics. First, they provide access to component functionality using services. Second, they advertise their services within the network. Third, using information garnered from these advertisements, they can automatically form communication links with other game components based on the functions they provide and the functions provided by components they require. Finally, each component manages the communication links and self-adapts to unforeseen changes that occur within the environment, e.g. network traffic or loss of communication with a peered component. Using these steps the framework allows virtual worlds to emerge through game component interaction.

A. Interconnecting Game Components

There have been many proposed solutions for networked games. Whilst centralised approaches are commonly used, new research initiatives are trying to utilise ad hoc technologies such as peer-to-peer networks [17]. Building on these approaches, games typically communicate with each other using proprietary communication protocols. These can be somewhat restrictive, and alternative approaches suggest that using open standards provides greater flexibility. The framework presented in this paper draws from the successes of these different approaches to ensure that flexibility, scalability, and innovation are firmly embedded within the architecture from the outset. Using ad hoc networking principles, the framework avoids the inherent limitations associated with centralised approaches, whilst the restrictive nature of proprietary solutions is overcome using open standards and ontologies to more accurately describe, discover, and compose devices. The remainder of this paper explains how these technologies have been incorporated within the framework and highlights the novelty of our approach.

B. Game Component Interoperation

Services are dynamically discovered by propagating messages within the network using peer-to-peer technologies. As messages containing queries pass through the framework they are received by other components, which respond whenever their capability requirements match.

As a result, gaming components automatically connect to the network without having to register themselves or the services they provide with any third party. When a component is initialised it has the ability to automatically integrate itself within the network and publish the functions it provides by multicasting service advertisements. Each advertisement has a time-to-live value that expires after a given period; advertisements have to be periodically re-published to keep the service alive. Gaming components may also remove themselves and their functions from the network at any time. In these instances service advertisements are no longer re-published and are eventually purged from the network. Once an advertisement has been removed from the network the component or service no longer exists.
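As an illustration of this lease-style lifetime, a component might re-publish its advertisement on a background thread, as sketched below. The publishAdvertisement argument is a placeholder for whatever middleware-specific publish call is used (for example a JXTA discovery publish); it is not part of the code shown later in Figures 6-8.

// Illustrative sketch: keep a service advertisement alive by re-publishing it
// before its time-to-live expires. The Runnable passed to start() stands in for
// the middleware-specific publish operation.
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

public class AdvertisementRefresher {
    private static final long TTL_MS = 60_000;  // advertised lifetime (assumed value)
    private final ScheduledExecutorService scheduler =
        Executors.newSingleThreadScheduledExecutor();

    public void start(Runnable publishAdvertisement) {
        // Re-publish at half the TTL so the advertisement never lapses while the service is up.
        scheduler.scheduleAtFixedRate(publishAdvertisement, 0, TTL_MS / 2, TimeUnit.MILLISECONDS);
    }

    public void stop() {
        // Stop re-publishing; the advertisement then expires and is purged from the network.
        scheduler.shutdown();
    }
}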

The framework achieves this using a distributed unstructured services manager [18], which is the primary capability every gaming component must implement. This is a peer-to-peer interface that can be mapped onto any middleware standard such as JXTA [19], OSGi [20] or UPnP [21]. Devices connect to the network as either specialised components or simple components, as illustrated in Figure 4.
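To give a feel for what this primary capability involves, the interface below sketches the operations a services manager exposes to a gaming component. It is a hypothetical illustration written against the description above, not the actual DiSUS interface [18].

// Hypothetical sketch of the peer-to-peer capability every gaming component
// implements: publish local functionality, discover remote components, and
// monitor peered links so the composition can self-adapt.
import java.util.List;

public interface GameServicesManager {
    // Advertise a component's functionality as an independent service.
    void publishService(String serviceDescription);

    // Propagate a query through the peer-to-peer network and collect the
    // advertisements returned by components whose capabilities match.
    List<String> discoverServices(String query);

    // Monitor an established link to a peered component and trigger
    // self-adaptation (e.g. selecting an alternative service) if it is lost.
    void monitorLink(String peerId, Runnable onFailure);
}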

Fig 4 Proposed Framework

Here a specialised game component has the ability to provide services as well as to propagate service requests within the network. A simple game component, by comparison, does not have these capabilities; this type of component joins the network, propagates queries and invokes discovered services. This enables any game component, irrespective of its capabilities, to choose how it will interact within the network. Figure 5 illustrates two extremes: gaming components that are highly capable and those that have limited capabilities. However, it is envisaged that there will be a myriad of other possibilities between these extremes. The distributed unstructured services manager marshals all communication between a distributed game and its gaming components within the network. It also provides adaptation functions used to monitor communication links between the game components in the network. When game components are initially deployed they try to establish relationships with other game components in the network. By managing references to external game components, emergent games can adapt to changes that occur. This collective interaction between gaming components results in the emergence of the game. During its execution each gaming component monitors its own interaction within the composition and adapts to any changes that may occur. If a game component becomes unavailable, an alternative component can be selected that provides the same or similar functionality.

Fig 5 Distributed Unstructured Services Manager

IV. IMPLEMENTATION

Using the architecture described above, we have designed a distributed service-oriented platform for use with virtual world development. In this way we have been able to test our design and demonstrate how gaming components can be dispersed and better utilised. It also illustrates how the framework provides a new and innovative approach, with the potential to take game development beyond current capabilities.

A. Technical Implementation

We follow a multidisciplinary approach whereby several tools and standards have been combined to implement our framework. This is not simply an integration exercise, but instead provides a new approach to game design, allowing unprecedented interaction between physical and virtual world components. The interpretive components used to implement our virtual world were developed using the Java Monkey Engine (JME) [22], a high-performance scene-oriented graphics API. This makes use of the Lightweight Java Game Library (LWJGL) as a means of accessing OpenGL functionality. The rendering interpreter was implemented using the Java Monkey Engine, with the JME Physics engine used as the basis for our physics interpretive engine. Based on the framework described above, we extended a standard JME terrain environment to incorporate peer-to-peer capabilities. The discovery of gaming components within the network was implemented with dynamic integration into the virtual world. All the services used to expose gaming components were developed using the JXTA protocols. These protocols allow such components to connect to the network independent of platform, programming language, or transport protocol and to wrap functionality within service descriptions that can be easily deployed. Figure 6 shows a relevant portion of the code used to adapt the terrain environment, illustrating the incorporation of the JXTA functionality.

protected void simpleInitGame() {
    startJXTA();
    final StaticPhysicsNode staticNode = getPhysicsSpace().createStaticNode();
    Spatial terrain = createTerrain();
    staticNode.attachChild( terrain );
    staticNode.getLocalTranslation().set( 0, -150, 0 );
    rootNode.attachChild( staticNode );
    staticNode.generatePhysicsGeometry();
    showPhysics = true;
    MouseInput.get().setCursorVisible( true );
    new PhysicsPicker( input, rootNode, getPhysicsSpace() );
    rootNode.findPick( new Ray( new Vector3f(), new Vector3f( 1, 0, 0 ) ),
                       new BoundingPickResults() );
    timer.reset();
    findBoxServices();
}

private void startJXTA() {
    manager = new NetworkManager("GCIDNExample");
    manager.start("principal", "password");
    netPeerGroup = manager.getNetPeerGroup();
    // Create a new resolver service, set the handler name
    // and register a message handler.
    resolverSvr = netPeerGroup.getResolverService();
    resolverSvr.registerHandler(handlerName,
        (QueryHandler) new TestGenerateTerrain_Resolver_Handler(this));
}

Fig 6 Incorporating JXTA into the terrain environment

At this initial stage of our research all services are predetermined and each virtual environment knows how to discover and invoke them. This will be extended in future by merging in our previous work on Networked Appliances [15] to allow ad hoc services to be hosted and discovered without any prior knowledge of the service. Virtual world components propagate messages in peer groups using the JXTA ResolverService protocol. We currently have two services implemented: TestGenerateTerrain and BoxService. A description of the interaction of these services is presented in the next section.

Using ResolverService listeners, applications define the type of messages they are able to process. The prototype TestGenerateTerrain component propagates BoxMessage messages, which the BoxService listens out for. Services capable of processing these messages – such as our prototype BoxService – return service advertisements back to the querying node. JXTA service advertisements are used to describe high-level information such as the provider and purpose of the service. These advertisements have been extended to include the Peer ID. The JXTA Peer Advertisement could have been used; however, to reduce the number of discovery requests, the Peer ID is placed in the service advertisement. This allows all required information to be discovered in a single request, thus minimising network traffic. Using the Peer ID ensures that, although more than one service of the same type may exist, devices bind to and use only the service initially discovered when a connection request to that service is made. This guarantees other pipe listeners will not receive messages destined for other services.

Once the service specification response advertisement has been received by TestGenerateTerrain, the pipe advertisement is extracted and used to bind to the service and request the box information. All successful connections result in the communication of information between the two services. This allows TestGenerateTerrain to project the appropriate facets of the BoxService into the virtual environment, in this case resulting in a box avatar being dropped into the virtual world using the code shown in Figure 7.

public void addBox(String name, ColorRGBA color, float x, float y, float z) {
    final DynamicPhysicsNode dynamicNode3 = tgt.getPhysicsSpace().createDynamicNode();
    Box meshBox3 = new Box(name, new Vector3f(), x, y, z);
    meshBox3.setModelBound( new BoundingBox() );
    meshBox3.setDefaultColor(color);
    meshBox3.updateModelBound();
    dynamicNode3.attachChild( meshBox3 );
    dynamicNode3.generatePhysicsGeometry();
    tgt.getRootNode().attachChild( dynamicNode3 );
    dynamicNode3.computeMass();
}

Fig 7 BoxService projected by TestGenerateTerrain as a solid cube in the environment

The characteristics of these facets are controlled via a loose connection to the service provider through the JxtaBiDiPipe it uses for bidirectional communications. Figure 8 illustrates the code used to propagate resolver service queries, provides an example pipe advertisement used by a box service, and shows the code used to discover the box.

public void findBoxServices() {
    ResolverQueryMsg message = new ResolverQuery();
    message.setHandlerName(handlerName);
    message.setCredential(null);
    message.setQuery("Service[Box]");
    resolverSvr.sendQuery(null, message);
}

public void processResponse(net.jxta.protocol.ResolverResponseMsg resolverResponseMsg) {
    tgt.setResult(resolverResponseMsg);
}

public void setResult(net.jxta.protocol.ResolverResponseMsg resolverResponseMsg) {
    Thread thread = new Thread(
        new ConnectionHandler(this, resolverResponseMsg.getResponse()),
        "Connection Handler Thread");
    thread.start();
}

<!-- Example pipe advertisement used by the box service -->
<jxta:PipeAdvertisement xmlns:jxta="http://jxta.org">
    <Id>urn:jxta:uuid-59616261646162614E504720503250338944BCED387C4A2BBD8E9411B78C284104</Id>
    <Type>JxtaUnicast</Type>
    <Name>JXTA Box Service</Name>
</jxta:PipeAdvertisement>

private boolean discoverBox() {
    PipeAdvertisement padv;
    try {
        StringReader sr = new StringReader(adv);
        padv = (PipeAdvertisement) AdvertisementFactory.newAdvertisement(
            MimeMediaType.XMLUTF8, sr);
    } catch (Exception e) {
        System.out.println("discoverBox error: " + e.toString());
        return false;
    }
    for (int i = 0; i < 3; i++) {
        try {
            pipe = new JxtaBiDiPipe();
            pipe.setReliable(true);
            pipe.connect(tgt.getNetPeerGroup(), null, padv, 3000, this);
            waitUntilCompleted();
            break;
        } catch (Exception ioe) {
            System.out.println("TestGenerateTerrain: discoverBox: " + ioe.toString());
        }
    }
    return true;
}

Fig 8 Resolver service query propagation, pipe advertisement and service discovery code

Box services can be removed from the network in two ways. Ordinarily the owner of the service would perform a controlled disconnection resulting in a disconnection message being propagated within the network. Interpreter components receive these messages and use them to check if the service is being used within the world. If so, the box is removed from the rootNode used to contain the objects in the virtual world. However, in exceptional circumstances a service may unexpectedly fail and may therefore be incapable of performing a controlled disconnection. Consequently virtual world applications periodically ping objects currently being interpreted within the world to see if they are still active. All nodes that do not respond are removed from the world.
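A minimal sketch of this liveness check is given below; the Interpreted interface and its methods are hypothetical stand-ins for the prototype's own classes rather than part of the implementation described in Section IV.

// Illustrative sketch: periodically ping interpreted objects and purge those
// that no longer respond, mirroring the behaviour described above.
import java.util.Iterator;
import java.util.List;

public class LivenessChecker {
    interface Interpreted {
        boolean ping();          // true if the backing service still responds
        void removeFromWorld();  // detach the corresponding node from the rootNode
    }

    public void purgeDeadObjects(List<Interpreted> interpretedObjects) {
        for (Iterator<Interpreted> it = interpretedObjects.iterator(); it.hasNext(); ) {
            Interpreted obj = it.next();
            if (!obj.ping()) {
                obj.removeFromWorld();
                it.remove();
            }
        }
    }
}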

In the following section we describe how these components can be used in practice, and how the resulting functionality might appear to an inhabitant of the virtual game world.

B. Prototype Demonstration

The demonstration prototype comprises three computers connected within a peer-to-peer network. We have used it to simulate how three-dimensional objects can be discovered and attached to the rootNode of a virtual world. The first two computers host BoxService services that support a simple user interface, as illustrated in Figure 9. Note that although this interface can be used to interact with the service, it is only for illustrative purposes; interactions would primarily occur between game objects and interpreters.

Fig 9 Box Service Interface

Using this interface the BoxService can be deployed within the network, allowing it to be discovered and used by any other application, which in this case is the virtual gaming world. The service can be removed at any time by pressing the Disconnect button, thus making it unavailable within the network. The Box Settings panel allows the properties of the box to be changed. This change is persistent within the BoxService itself and is broadcast to peers within the network to update any instance of its use in a game. The figure illustrates that the name, colour and dimensional coordinates can be dynamically changed. A subtle but important point is that the properties and behaviours associated with the box are not changed by the virtual world but rather by the owner of the box. Although the properties may be affected by feedback from an active interpreter such as a physics engine, it is the responsibility of the box component to perform any reaction, and the box itself is controlled entirely by the owner. The Control Panel allows commands to be submitted to the box in order to instruct it to perform some given behaviour. For example, the command shake could be sent to the box, resulting in the box shaking in the virtual world. Within the Control Panel there is also an Exit button that removes the service from the network entirely. This is not to be confused with Disconnect. This example is simplistic; more complex objects may provide a number of services, e.g. a DVD player may offer a codec service, a reader service and a recorder service. Consequently, complex objects may choose to remove only a particular service rather than all of their services. By pressing the Disconnect button only a specific service provided by the object is removed, not the whole object itself. In the box example this is obviously a simple object that provides only one service, so in this instance Disconnect and Exit amount to the same thing.

In our prototype the third computer hosts the physics engine interpreter and the rendering interpreter. When executed, its appearance is dependent on what that virtual world provides and what gaming components are available within the network. For simplicity, the prototype is a virtual world that tries to discover Box Services provided by the two computers connected to the peer-to-peer network. For each box service discovered, a box object is rendered in the world in conformance with the properties and behaviours the box service supports. Figure 10 provides an illustrative representation of the virtual world used in the prototype containing the two box services.

Fig 10 Virtual world interpreting two box services

The virtual world application performs the three steps required of the framework as discussed above. First, the application publishes the service components it wishes to share with other virtual worlds within the network. Second, it discovers all the service components it requires and renders them within the virtual world. Third, it renders any changes to the objects as and when component owners submit changes or execute behaviours. For example, using the user interface illustrated in Fig 9, the size and colour of both boxes can be changed. This change is illustrated in Figure 11.

Fig 11 The virtual world response to box service colour-change requests

C. Discussion

Although this scenario is simplistic, it has far-reaching implications and presents exciting possibilities. Building on our previous work within Networked Appliances [23], our aim is to embed the avatars of devices (digital representations of physical objects) by projecting facets within the virtual world. The relationship that exists between an avatar and a device is bidirectional. Hence the avatar can affect the device and the device can affect the avatar. For example, pressing the play button on the DVD player avatar will result in the player responding as expected in the real world. Use of Networked Appliances within the virtual world provides advantages over their use in the physical world. For example, Figure 12 shows a physical mobile phone with a corresponding projected facet in the virtual world. If this rings in the real world, you might have to locate it before you can use it. In the virtual world the projected facet of the phone is not bound by the same physical constraints, and simply calling for it could allow it to fly directly to your location. You could then answer and use the avatar phone much as you would the physical phone. This application provides particular benefits for those of us who are organisationally challenged and predisposed to losing our mobile phones regularly! Technologically, the capability is achieved by redirecting audio and video from the 3G phone to the display and speakers provided by the phone avatar. Whilst using the phone you may not be aware of the location of the physical device, and you may not particularly be interested: you just want to use it.

Another application could be the use of the teleportation function, as found in virtual communities such as Second Life. Here it becomes possible to teleport yourself to a digital representation of some location you inhabit in the real world in order to interact with the objects within that environment, irrespective of where you are physically located. In Second Life, the environment is disassociated from the real world. However, with our framework the objects can be facets of their physical counterparts, and such interactions can have a direct effect on their real-world behaviour. Rather than carefully defining what can appear in virtual worlds and how it can be used, the framework presented in this paper empowers games, virtual worlds and users to determine what components they wish to represent, share and integrate digitally, and how they can be used. This allows virtual worlds to emerge through inclusion and interaction between low-level objects and behaviours. The result is a blurring of the boundaries between physical and virtual. Much of the enjoyment in games emanates from the liberation from physical restrictions that they afford. It is likely that many people will actively choose to perform more mundane tasks within virtual environments because of the additional freedoms they provide.

Fig 12 Physical networked appliances can be integrated into the gaming environment

This section has described a work in progress which aims to propose a new and novel approach to distributed object integration within virtual environments. Although the prototype is simplistic, it demonstrates that the basic principles of our framework can be easily implemented to realise our vision.

V. CONCLUSIONS AND FUTURE WORK

In this paper we presented a novel framework for the creation of distributed network gaming that extends beyond the current capabilities of the centralised massively multiplayer model in widespread use today. Our framework is based on an approach that draws on our expertise in the area of Networked Appliances. This provides a novel perspective on how games might be distributed across networks, resulting in unique and beneficial characteristics that can allow infinite expandability of a game in a manner accessible to all users. Moreover, the intrinsically componentised nature of the framework mitigates the effects of obsolescence by allowing any component – from the rendering or physics engine to specific characters within the game – to be switched out and replaced with an alternative without affecting the operation of other components. As well as these architectural benefits, we also discussed how such a framework might provide additional functionality through interactions with real-world networked appliances, allowing facets of physical devices to be projected into the game in order to provide virtual manifestations of themselves. Whilst we have presented an initial prototype system, it is clear that much work remains to be carried out before a fully effective system is produced. In particular, difficult questions concerning the level of granularity of device interactions remain. We hope to extend the use of ontologies in the system in order to increase the robustness of interactions between components, allowing for greater flexibility in the way components represent themselves. Ultimately, the success of a framework such as this relies on the development of exciting components that can be used to build up gaming environments. Nonetheless, we believe that a flexible and distributed system such as this provides many opportunities for the advancement of gaming into new areas and in new ways.

REFERENCES

[1] N. Ducheneaut, N. Yee, E. Nickell, and R. Moore, Building an MMO with Mass Appeal: A Look at Gameplay in World of Warcraft. Games and Culture - Journal of Interactive Media, 2006. 1(4): p. 281-317.

[2] O. Sotamaa, Have Fun Working with Our Product!: Critical Perspectives on Computer Game Mod Competitions. International DiGRA Conference, 2005, Vancouver, Canada.

[3] J. Kucklich, Precarious Playbour: Modders and the Digital Games Industry. International Journal on Fibreculture, 2005. 1(5).

[4] M. S. El-Nasr and B. K. Smith, Learning Through Game Modding. ACM Computers in Entertainment, 2006. 4(1).

[5] W. Piekarski and B. Thomas, ARQuake: The Outdoor Augmented Reality Gaming System. Communications of the ACM, 2002. 45(1): p. 36-38.

[6] A. F. Seay, W. J. Jerome, K. S. Lee, and R. E. Kraut, Project Massive: A Study of Online Gaming Communities. CHI '04 Extended Abstracts on Human Factors in Computing Systems, 2004, Vienna, Austria: ACM Press, p. 1421-1424.

[7] PlanetSide. 2006, http://planetside.station.sony.com/.

[8] Star Wars Galaxies. 2006, http://starwarsgalaxies.station.sony.com/.

[9] The Sims Online. 2006, http://www.ea.com/official/thesims/thesimsonline/.

[10] EVE Online. 2006, http://www.eve-online.com/.

[11] T. Hsiao and S. Yuan, Practical Middleware for Massively Multiplayer Online Games. IEEE Internet Computing, 2005. 9(5): p. 47-54.

[12] Computer Game Modding, Intermediality and Participatory Culture. University of Tampere, 2003, http://old.imv.au.dk/eng/academic/pdf_files/Sotamaa.pdf.

[13] N. Yee, The Unbearable Likeness of Being Digital: The Persistence of Nonverbal Social Norms in Online Virtual Environments. Journal on CyberPsychology and Behavior, 2006 (to appear).

[14] P. Curtis and D. A. Nichols, MUDs Grow Up: Social Virtual Reality in the Real World. COMPCON '94, 1994, San Francisco, CA, USA: IEEE Computer Society Press, p. 193-200.

[15] A. Mingkhwan, P. Fergus, O. Abuelma'atti, M. Merabti, B. Askwith, and M. Hanneghan, Dynamic Service Composition in Home Appliance Networks. Multimedia Tools and Applications: A Special Issue on Advances in Consumer Communications and Networking, 2006. 29(3): p. 257-284.

[16] P. Fergus, A. Mingkhwan, M. Merabti, and M. Hanneghan, Distributed Emergent Semantics in P2P Networks. Information and Knowledge Sharing (IKS'2003), 2003, Scottsdale, Arizona, USA: ACTA Press, p. 75-82.

[17] A. E. Rhalibi and M. Merabti, Agents-Based Modeling for a Peer-to-Peer MMOG Architecture. Computers in Entertainment, 2005. 3(2).

[18] P. Fergus, A. Mingkhwan, M. Merabti, and M. Hanneghan, DiSUS: Mobile Ad Hoc Network Unstructured Services. Personal Wireless Communications (PWC'2003), 2003, Venice, Italy: Springer, p. 484-491.

[19] L. Gong, JXTA: A Network Programming Environment. IEEE Internet Computing, 2001. 5(3): p. 88-95.

[20] L. Choonhwa, D. Nordstedt, and S. Helal, Enabling Smart Spaces with OSGi. IEEE Pervasive Computing, 2003. 2(3): p. 89-94.

[21] UPnP Forum. 2005, Microsoft Corp., http://www.upnp.org/.

[22] Java Monkey Engine. 2006, http://www.jmonkeyengine.com/.

[23] P. Fergus, M. Merabti, M. B. Hanneghan, A. Taleb-Bendiab, and A. Mingkhwan, A Semantic Framework for Self-Adaptive Networked Appliances. IEEE Consumer Communications & Networking Conference (CCNC'05), 2005, Las Vegas, Nevada, USA: IEEE Computer Society, p. 229-234.

Darwin's Dream

Andreas Huber, Markus Paulhart, Christian Kloiber, Helmut Hlavacs

Institute of Distributed and Multimedia Systems, University of Vienna, Lenaug. 2/8, 1080 Vienna, Austria

christian.kloiber@gmail.com

ABSTRACT

This paper introduces Darwin's Dream, a system for simulating artificial plant life on a planet-wide scale. Darwin's Dream contains a sophisticated climate simulation, which produces different climate zones on a planet, taking into account factors such as rainfall, clouds, and sunlight intensity. The climate simulation is used to drive the evolution of a complex plant ecosystem. As planet topology we use something like a "flat sphere", which eases the use of 2D height maps. An important focus of Darwin's Dream is also the realistic visualization of the planet's surface.

Categories and Subject Descriptors

I.6 [Simulation and Modeling]: Miscellaneous

Keywords

Climate simulation, artificial life, evolution of plant life, weather visualization

1. INTRODUCTION

The availability of high computing power in off-the-shelf PCs has made the simulation of the evolution of life accessible to everyone. Evolution of life is an extremely complex area, since evolution as a stochastic process requires the investigation of the genome of large populations, i.e., hundreds or, better, thousands of subjects should be observed over many generations. Evolution here means the process of inheriting and mutating genomes, which decide how individuals fit into their environment and whether species survive. If resources are scarce or the environmental conditions are unfavorable, then only the fittest will survive, as postulated by Charles Darwin. The simulated genome must represent basic properties of the simulated species which are necessary for survival under different environmental conditions. Simulation of evolution can be viewed from various aspects.

First, it is a model explaining why evolution itself works. Second, it might be a way of optimizing complex systems, as done by genetic or evolutionary algorithms, or genetic programming. Third, it might try to forecast life on our planet in a few million years, or it might try to model how life may evolve on alien planets with conditions totally different from ours. Apart from serious applications, the evolution of life might be simulated for the sake of fun, in a game-like fashion. The result of the evolution of artificial life, as unpredictable as it is, might have fascinated Charles Darwin (thus the name of the project).

2. RELATED WORK

The simulation of life can focus on the evolution¹ of animals, or the evolution of plants. For the evolution of animal life or more abstract "life" forms like computer programs, numerous systems exist², for example [2, 4]. Evolution of pure plant life is seen less often than animal evolution, probably due to its more static nature. Panspermia [3] was a system very similar to ours, focusing on the evolution of plants with their various shapes, and using professional graphics workstations. Interactive Plant Growing [5] was more an art installation than a simulation of plant life in an ecosystem. Nerve Garden [3] simulates the evolution of plant life, but may also include insects; it is based on VRML 2.0, but operates on a smaller scale (islands) than our planet-wide system. Nerve Garden itself evolved into the Biota@Home initiative, which focuses on the creation of artificial nature systems run on peer-to-peer networks.³ The company Maxis created several games focusing on the evolution of life, including SimEarth and SimLife [6]. Together with Electronic Arts, Maxis recently released the game Spore, a commercial artificial life game.⁴

The innovation of our system, called Darwin's Dream, is the complex interaction between a detailed climate simulation on the one hand and a planet-wide plant ecosystem on the other. However, Darwin's Dream is meant for pleasure in a game-like fashion rather than for studying plant evolution itself.

¹ http://www.google.com/Top/Science/Biology/Evolution/
² http://www.google.com/Top/Computers/Artificial_Life/
³ http://www.biota.org/
⁴ http://www.spore.com/

3. DARWIN'S DREAM

The aim of the interdisciplinary project Darwin's Dream is to simulate and visualize a planet's biosphere, including its climate and flora (but not animal life). Plant life of the planet is ruled by the laws of evolution, while the climate simulation adheres to knowledge from climatology and meteorology.

The main idea is to create a world similar to ours, with similar plant evolution and weather. It must be noted that though the models used try to mimic as much of real evolution and climate as possible, the project does not claim to implement models of the same complexity as found in the respective specialized sciences. Instead, it is an attempt to create a sandbox for playing with a world, similar to games like SimCity or Spore.

Since the computation of the behavior of thousands of plants demands high computational power, a main goal has been to find a suitable software architecture for breaking the main task into smaller, manageable subtasks. As a consequence, the system is split into a server part responsible for computing the main simulation, while the visualization of the results is done on the client side. To do this, the server transfers a snapshot of the world to the client, which may then roam freely through the world to inspect its flora and current weather.

Figure 1. The system architecture.

The whole system has been implemented in C++, for reasons of its object-oriented nature and probably better performance compared to languages like Java. The server side has been implemented in standard C++ without any platform-dependent libraries, thus enabling the simulation to run on different platforms.

4. PLANET TOPOLOGY

The planet's surface is annotated with height information. This information is taken from a grey-level graphics file, each pixel representing the height of the terrain field it represents. As already pointed out, a terrain field's area is 1 km²; the height map contains exactly n × 2n terrain fields, as shown in Figure 2, where the integer n depends on the graphics file's resolution and can be chosen arbitrarily.

Figure 2. The planet's height map (transposed). (In the figure, the South Pole appears at both the top and the bottom, the North Pole in the middle, and the equator twice between them.)

Figure 2 also shows the topology of the world map, and explains why the chosen size is n × 2n instead of n × n. The height of the world map is twice its width, the North Pole being assumed to be in the middle of the map rather than at the top. The South Pole is split into two parts, one at the top, the other at the bottom. The equator is also represented twice, lying between the North Pole and the two South Poles. The idea of this topology is as follows. The South Pole at the top is bent down and connected to the South Pole at the bottom, thus creating a cylinder in three dimensions with only one North Pole and one South Pole. Of course, a planet always has the shape of a sphere, not a cylinder. Since a sphere contains only one equator, the cylinder is then squeezed again into a flat form, thus connecting the left and right sides to their respective counterparts on the other side. The result is something like a flat sphere (Figure 3).

Figure 3. The planet's 2D/3D topology.

The reason for this peculiar topology is its ease of use, both in the climate simulation (Section 5) and in the visualization at the client (Section 7). For instance, when using a 3D sphere for the climate simulation, the projection onto a square map (which is necessary for the visualization) would produce significant distortions.

Assuming that the world map from Figure 2 is described by a coordinate system (x, y), 1 ≤ x ≤ n, 1 ≤ y ≤ 2n, then someone who, for instance, stands at position (1, y) and moves one terrain field to the left will enter the terrain field (1, 2n − y). On the other hand, someone standing at position (n, y) and going one field to the right will enter terrain field (n, 2n − y). This additionally makes sure that people travelling east or west always remain at the same latitude, and thus in similar climatic regions. The result of the above described mapping is a way of representing sphere textures by a 2D grid. Throughout a simulation run the grid fields are updated row- and column-wise using the climate map. During such an update it is tested whether a terrain field is able to emit water, which cannot be stored in the terrain field, to a neighboring field. This way it is possible to dynamically create and destroy stagnant and flowing water bodies.
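To make the mapping explicit, the east-west wrap rule can be written down directly as code. The sketch below is illustrative only (the system itself is written in C++) and encodes the wrap exactly as stated above, with the 1-based coordinate convention taken from the text.

// Illustrative sketch of the east-west wrap-around described above. The map
// has size n x 2n with coordinates (x, y), 1 <= x <= n, 1 <= y <= 2n. Stepping
// west off column x = 1, or east off column x = n, keeps the walker in the
// same column but mirrors the row to 2n - y, as stated in the text, so the
// latitude (and hence the climate zone) is preserved.
public final class FlatSphereTopology {
    private final int n; // width of the height map; the height is 2n

    public FlatSphereTopology(int n) {
        this.n = n;
    }

    // Row reached after wrapping across the left or right edge at row y.
    public int wrappedRow(int y) {
        return 2 * n - y;
    }
}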

5.

6.

GENETIC MODEL

Darwin’s Dream implements a mature genetic model of plant life and evolution. Plants develop according to their genome, which is a set of genes and which describe the plant physiology. Darwin’s Dream implements a very complex genome which allows to describe not only the plant’s appearance, but also its interaction with the environment, and additionally its interaction with other plants. As an example, one gene describes how sensitive the plant reacts to different light conditions, i.e., areas with much sun light, or with less sun light. As a design principle we tried to minimize the set of genes which have only either positive or negative properties, i.e., where the optimum profit (or maximal damage) for the plant is achieved either at the minimum or maximum value. Also the genome contains detailed properties of the plant’s appearance. In total, the genome of each plant consists of 697 bits describing 58 genes.

CLIMATE SIMULATION

The climate is purely simulated at the server, clients downloading a snapshot of the planet use the climate data only for the visualization of the current weather, including clouds, rain, water and day time. The GUI currently only allows to query specific information from plants, not from the climate state. The climate simulation is based on three main objects.

The planet’s surface. Here all plants are grown. The surface is realized using a 2D grid, each element of the grid represents an area equivalent to one square kilometer. These elements are called terrain fields and are able to retain water, or give off water to its surface. The planet’s atmosphere. This is a medium above the planet’s surface, which is able to store and transport air and water. It is modelled using a 3D grid, each grid element being called climate block.

6.1

Plant Status

An important part of the genome decides whether the plant can survive under certain environmental conditions. For instance, a plant needs water, nutrients, sun light and certain temperature conditions for its survival. Since these factors are usually not all satisfied optimally, the growth of plants depends on the environment, and the plants will show different properties depending on their position. Thus, additionally to its genome, each plant is equipped with individual status variables describing its current condition, for instance its current size.

The climate map. This is the main simulation result of the world climate, storing average climate data of days and nights for each month and climate block. The climate map is then used as an input into the plant simulation with constant values for each simulated month. For instance, since in the month May rain is observed only rarely, the average rainfall of May is quite low. This value then is used for defining the average rainfall for each day in May.

In order to create a complex system of woven properties, the status variables also depend on numerous genes, but also on the current value of other status variables. In total, each plant is equipped with 36 status variables.

Since the climate of a world is very stable, the climate is computed newly only every ten simulated years. When starting the climate simulation, the year is split into 12 month, and one month is split into 30 days, each day again being split into four day time periods. The main difference between the months is the different inclination of the planet’s axis, thus mimicking the different intensity of the sun light warming up the planet’s surface throughout a year.

Depending on the time of day and the month, the terrain simulation is responsible for transporting water between terrain fields. Additionally, the temperature and the amount of water that evaporates from each terrain field into the climate blocks above are computed as a function of the sun intensity, the clouds and the plant density. The climate simulation itself then tries to balance the different pressures, temperatures and amounts of water in the climate blocks. Depending on this, the simulation computes the average cloud density, rainfall, wind direction and wind speed. After all months of a year have been simulated, the average values are stored in the climate map, which is used for the next nine years to produce rain, light, wind and clouds for the respective terrain fields.

The plant life of the world does indeed have a significant influence on the world climate, since it influences the amount of water and the temperature of the surface. As a consequence, plant life changes the climate, while a changing climate changes the plant life, the result being a highly dynamic system of complex interdependencies.

6.1 Plant Status

An important part of the genome decides whether the plant can survive under certain environmental conditions. For instance, a plant needs water, nutrients, sunlight and certain temperature conditions for its survival. Since these factors are usually not all satisfied optimally, the growth of plants depends on the environment, and the plants will show different properties depending on their position. Thus, in addition to its genome, each plant is equipped with individual status variables describing its current condition, for instance its current size.

In order to create a complex system of interwoven properties, the status variables depend not only on numerous genes, but also on the current values of other status variables. In total, each plant is equipped with 36 status variables.

6.2 Plant Life Cycle

The life of each plant starts with a seed, which might either be created artificially by the user, for instance at simulation start-up, or be created by a mother plant which passes on its genome. The seeds are first carried around by the wind and dropped to the ground after some time. In some instances the wind phase might be quite short, and the seed might drop down almost immediately. Once lying on solid ground, the seed starts to grow its roots, but only if the environmental conditions are within certain bounds. If the conditions never fulfil the necessary preconditions, the seed will die after some individual random time has gone by.

Once a plant sprouts from its seed, it goes through the following stages. First, the minimum amount of water needed by the plant is computed. If this minimum is not available in the plant's surroundings, the plant suffers damage; the amount of damage depends on the difference between the available and the minimum required amount of water. Generally, any insufficiency of nutrition is recorded in the plant's status variables.

Only if the minimum amount of water is available does the plant start growing its stem. Again, the amount of growth depends on environmental factors such as sufficient sunlight, water and nutrition. The third stage is the growth of leaves and blossoms, again depending on environmental conditions. Since it would be very demanding to visualize the slow growth of leaves and blossoms, we decided to simplify the visualization at this point: leaf and blossom growth happens instantaneously, without a time delay.


The final stage of a plant is its reproduction, which is carried out periodically throughout the life of the plant. The amount of reproduction again depends on the environment: the better it is, the more seeds are produced and sent into the world. There are two ways of reproducing. First, if a plant has not been hit by spores from similar plants, it may decide to reproduce autonomously and inseminate itself. If, on the other hand, it has been hit by spores from similar plants, the produced seeds will be a combination of both genomes. This is done by randomly selecting genes from the two genomes and creating a new genome from them.
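A minimal sketch of such a gene-wise recombination is shown below; the Gene type and the boundaries between genes are placeholders, since the actual genome layout is not given in the paper.

    #include <cstddef>
    #include <random>
    #include <vector>

    using Gene = unsigned;   // placeholder per-gene value

    // Illustrative recombination: each of the 58 genes of the child genome is
    // copied at random from one of the two parent genomes.
    std::vector<Gene> recombine(const std::vector<Gene>& mother,
                                const std::vector<Gene>& father,
                                std::mt19937& rng) {
        std::bernoulli_distribution pickMother(0.5);
        std::vector<Gene> child(mother.size());
        for (std::size_t g = 0; g < mother.size(); ++g)
            child[g] = pickMother(rng) ? mother[g] : father[g];
        return child;
    }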

7. VISUALIZATION

As was explained earlier, a snapshot of the world simulation is transferred to the client to be visualized. The client enables the user to roam this world freely by flying in three dimensions above the surface. The client of course also allows the retrieval of data about the climate, the plants and the plant evolution.

An important point when choosing a suitable rendering system was the fact that the visualization of a whole world with possibly thousands or even millions of plants is very resource demanding, thus requiring powerful PC technology and mature graphics cards. Furthermore, it was clear that an important aspect of the visualization would be the rendering of weather and climate.

As the target platform we chose Windows XP with compiled C++ programs; a Java-based approach was not chosen because of the performance considerations already mentioned. For graphics rendering, the Object-Oriented Graphics Rendering Engine (OGRE, http://www.ogre3d.org/) has been chosen. Amongst others, the reasons for choosing OGRE include its strict object orientation and C++ binding, its maturity, its good documentation and its free availability. It must be noted that for OGRE there is already a project for visualizing plants, called Wish (http://www.projectwish.com/). However, we have currently implemented our own OGRE-based plant visualization system.

7.1 Day and Night

Since the visualization depicts only one particular snapshot, time is frozen during the visualization. This means that the areas with daylight and nightfall do not change when roaming through the planet; in particular, the sun does not change its position. To achieve this effect, the picture of the sun and the sun rays are placed at a fixed distance from the observer. As soon as the observer moves, the position of the sun is moved and rotated around the observer (Figure 5).

Depending on the position of the sun, the direction, color and intensity of its light rays are computed. Additionally, the color of the sky depends on the position of the sun and of the observer. If the observer enters a region with night, the sky's color changes to dark blue or black, while the terrain fog becomes denser and darker. Furthermore, a starry sky is emulated, whose transparency and thus visibility depends on the darkness of the area. Stars are represented by pictures of stars, similar to a sun but without their own light source, placed at a constant distance above the observer. As a consequence, we do not use a sky box as done in many other computer games, although OGRE of course supports such a tool.

Figure 5. A cloudy and sunny sky.

7.2 Terrain

OGRE already contains support for the visualization of terrains and surfaces (Figure 4). A possible obstacle here is the fact that OGRE only allows n x n maps to be used for height information. In order to use the same maps for visualization and climate simulation, the n x n height map used for visualization has been stretched in its y-dimension by a factor of 2, thus yielding the corresponding height map for the climate simulation. To improve the realism, further textures are put on top of the standard ground texture. For instance, in the case of considerable vegetation the surface is covered by green meadows, while the arctic regions are covered by white ice. Depending on the simulation outcome, several different textures can be combined.

Figure 4. Terrain with trees.

7.3 Water

As was said before, the simulation enables the dynamic creation and deletion of stagnant and flowing water. These are visualized by using squares which are equipped with a texture showing a water animation. The position of these squares depends on the water level above the average height of the respective terrain field. Since terrain fields have different heights, small islands may be created this way. The positioning is done after the water levels of neighboring terrain fields have been counterbalanced with each other, resulting in a homogeneous water level. An additional animation of oscillating waves is then put over the common water surface (Figure 6).

The observer is also able to dive into the water, which results in a darkening of the scenery: the light source is hidden and the fog is made denser.

Figure 6. Arctic sea.

7.4 Weather

For the visualization of the weather, a major challenge was the dynamic creation of clouds. Clouds are made of a number of different cloud elements, which are chosen depending on the current weather conditions created by the simulation engine. A cloud is positioned above a terrain field and may overlap neighboring terrain fields. The higher the cloud density to be rendered according to the simulation, the more cloud elements are used, and the darker the center of the cloud gets. If a cloud exceeds a certain size, it is equipped with a cloud bottom including a suitable cloud surface texture. The bottom is surrounded by small smooth silky cloud elements which always point towards the observer. On top of a cloud, a number of large floating cloud elements form a kind of stack above each other. These stack elements are again equipped with realistic cloud textures. Smaller clouds do not possess a bottom, and their cloud stacks are created by using smaller stack elements. As a result, clouds are complex, dynamically created objects, and neighboring clouds can even be combined into one larger cloud. This way almost all cloud types observed in real nature can be simulated (Figure 7).

The weather simulation also allows the visualization of rain and snow fall. Since the OGRE engine uses particle systems for both, heavy rain or snow fall may cause severe performance problems on the client.

Figure 7. Clouds.

7.5 Plants

Similar to clouds, plants are also created by combining several elements into one plant object. According to the respective plant genome, the plant simulation delivers data for the plant stem and its treetop. However, the terms "stem" and "treetop" here represent the respective parts of all plant types, not only trees. For instance, in the simulation, herbages and bushes do not contain a stem but only a "treetop" with a suitable texture. Trees are additionally equipped with a properly scaled stem. To deliver a suitable representation of the different plant types, the simulation engine chooses suitable textures from a large palette of plant parts. Since Darwin's Dream has been designed for simulating large numbers of plants, the respective parts of a plant are created by placing the texture onto two orthogonal pictures. This way, a strong off-the-shelf PC, perhaps additionally equipped with a consumer graphics card, is able to visualize hundreds or even thousands of plants without a significant loss of performance (Figure 8). However, if too many plants are to be visualized, this becomes the major bottleneck of the system.

Figure 8. Plants at evening.
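As an illustration of the two-orthogonal-pictures approach, the crossed quads of a single plant part can be generated as sketched below; the vertex layout and dimensions are placeholders and are not taken from the paper.

    #include <array>

    struct Vertex { float x, y, z, u, v; };

    // Two textured quads crossed at 90 degrees, both centered on the plant position.
    // Each quad is given as four vertices (position + texture coordinates).
    std::array<Vertex, 8> makeCrossedQuads(float cx, float cy, float cz,
                                           float width, float height) {
        const float w = width * 0.5f;
        return {{
            // quad 1: spans the x axis
            {cx - w, cy,          cz, 0.f, 1.f}, {cx + w, cy,          cz, 1.f, 1.f},
            {cx + w, cy + height, cz, 1.f, 0.f}, {cx - w, cy + height, cz, 0.f, 0.f},
            // quad 2: spans the z axis
            {cx, cy,          cz - w, 0.f, 1.f}, {cx, cy,          cz + w, 1.f, 1.f},
            {cx, cy + height, cz + w, 1.f, 0.f}, {cx, cy + height, cz - w, 0.f, 0.f},
        }};
    }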

7.6 The User Interface

The graphical user interface (GUI) allows different simulation snapshots to be loaded from the simulation server. Once a snapshot has been transmitted to the client, the user is free to roam the world in three dimensions using the keyboard.

The GUI also allows the retrieval of information about all plants growing in the world. This is done by clicking on a plant with the mouse; the respective information is then presented in the GUI. This information includes the plant's age, its height, its fertility cycle, its genome, its general status, and many other things.

8. COMMUNICATION

For reasons of human readability, the communication between server and clients is carried out via an XML file. Fortunately, many free libraries for the manipulation of XML structures exist, such as Xerces-C (http://xml.apache.org/xerces-c/) or TinyXML (http://www.grinninglizard.com/tinyxml/). A drawback of XML is the fact that documents containing thousands of objects may become very large. In order to limit this growth, we have decided to use abbreviations instead of full names wherever possible. The abbreviations are explicitly specified at the start of the document. An example of this is given below.
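A minimal sketch of the abbreviation mechanism; the element and attribute names are hypothetical, with only the mapping of "ulPlantInitDate" to "A" and its later use taken from the text.

    <snapshot>
      <abbreviations>
        <!-- full attribute names are declared once and replaced by short codes -->
        <abbr code="A" name="ulPlantInitDate"/>
      </abbreviations>
      <plants>
        <!-- later in the file the short code is used instead of the full name -->
        <plant id="1" A="0"/>
      </plants>
    </snapshot>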

Here the attribute "ulPlantInitDate" is replaced by the abbreviation "A". Later in the file, for a specific plant, the attribute "A" is then assigned a certain value, in this case zero.

During the simulation, the server periodically produces such a system snapshot, which may then be transported to the client, for instance via HTTP. The file may be used for visualization at the client, but also as a starting point for additional simulations.

9. CONCLUSION

In this paper, the evolution game Darwin's Dream has been presented. Darwin's Dream is meant for creating a world of plants, and it simulates artificial plant life and its evolution. The main focus of Darwin's Dream lies on an accurate simulation of a planet's climate, the evolution of plants growing in different climatic zones, and a realistic visualization of the simulation result.

A major achievement is the complex interdependency of the system components: on the one hand, a detailed climate simulation influences the plant life of the planet, and on the other hand the plant life influences the planet's climate. The outcome of such a dynamic system is unpredictable. An important focus lies on the accurate visualization of the planet's climate and its plant life. Both components require the dynamic composition of scenery objects. For instance, clouds are dynamically created as a result of the climate simulation rather than being taken from predefined static clouds. The water bodies can rise or fall, and water areas may swallow up land dynamically, thus changing the planet surface, at least visually. Plants are dynamically created according to their genome. Furthermore, the goal is indeed the visualization of a whole planet and its plant life, although the planet has a predefined size and topology. This requires the use of efficient methods for rendering thousands of simple scene objects.

The system is currently in a very early beta state and still requires a large effort to reach a stable version. However, all important parts, including the climate simulation, plant evolution and visualization, already work and can be tested.

Do Robots Dream of Virtual Sheep: Rediscovering the "Karel the Robot" Paradigm for the "Plug&Play Generation"

Eike Falk Anderson
[email protected]

Leigh McLoughlin
[email protected]

The National Centre for Computer Animation
Bournemouth University, Talbot Campus
Fern Barrow, Poole, Dorset BH12 5BB, UK

Figure 1: "The Meadow" virtual environment.

ABSTRACT

We introduce "C-Sheep", an educational system designed to teach students the fundamentals of computer programming in a novel and exciting way. Recent studies suggest that computer science education is fast approaching a crisis: application numbers for degree courses in the area of computer programming are down, and potential candidates are put off by a subject which they do not fully understand. We address this problem with our system by providing the visually rich virtual environment of "The Meadow", where the user writes programs to control the behaviour of a sheep using our "C-Sheep" programming language. This combination of the "Karel the Robot" paradigm with modern 3D computer graphics techniques, more commonly found in computer games, aims to help students realise that computer programming can be an enjoyable and rewarding experience, and intends to help educators with the teaching of computer science fundamentals. Our mini-language-like system for computer science education uses a state-of-the-art rendering engine offering features more commonly found in entertainment systems. The scope of the mini-language is designed to fit in with the curriculum for the first term of an introductory computer programming course (using the C programming language).

Categories and Subject Descriptors

K.3.2 [Computers and Education]: Computer and Information Science Education—Computer Science Education

General Terms

Human Factors, Languages

Keywords

Pedagogy, Programming, Visualisation, Games

1. INTRODUCTION

In recent years there has been a major shift in computer science curricula with the adoption of the "objects first" approach (emphasizing object-oriented design and programming). This has subsequently led to a redesign of many introductory computer programming courses. New studies suggest that computer science education is fast approaching a crisis, as enrolment numbers for degree courses have fallen to extremely low levels [7]. One of the reasons may be a general misconception of computer science and programming among prospective students. A lack of knowledge of the subject area may well be the underlying reason for this bias against computer science. Computer science itself, as well as related subjects, is wrongly perceived as boring (non-creative), unglamorous and difficult (i.e. actually requiring work), and computer programming especially is often regarded as a monotonous and uninteresting task. We have encountered prospective students


who appear intimidated by the technological aspects of the course we offer, as well as by the maths involved. Other students are overconfident after having taken an information technology course (little more than a computer literacy course) at school, only to quickly lose interest once they have embarked on the computer science related course (or module) and realise their mistake, an observation also made by Beaubouef and Mason [2]. While this misconception might in part explain the drop in student numbers, the fact that educators now also debate "whether or not the objects early approach has failed" [19] might indicate that a return to the older "procedures first" method of computer science education is called for.

It has been suggested that one way to achieve a greater uptake of computer science would be to make programming "more fun" [7]. This is much easier said than done, as the challenge that educators face when teaching computer science in general, and computer programming in particular, is to maintain the interest of students in the subject. A major part of this problem appears to be a lack of patience that we have observed among students. The students from the "Plug&Play generation" expect to see immediate (and spectacular) results, often before they have learned enough to achieve anything remotely spectacular. An analogy would be an illiterate person setting out to write a bestselling novel. A side effect of this clash of realities is that students' programs often have unintended results, causing confusion, and once faced with difficulties (a recent study of which was made by Lahtinen et al. [13]), their motivation suffers and they quickly lose interest in the subject. In our experience the result is that the students fall behind in their studies, which causes frustration and a further loss of motivation, ending in a downward spiral and eventual exam failure.

Motivation is a major factor in the success of students of programming, who need to practice writing programs to improve their skills [10]. The fact that students have taken up a course involving programming is not in itself an indicator of the existence of motivation. This is something we have observed among students of our course (Computer Animation and Visualisation), an arts degree with a strong technical component. Our students come from very diverse backgrounds, often with little prior experience with computers. As a result we especially face motivational problems among those students who joined the course out of interest in its artistic elements and who consider the computing aspects of the course a necessary evil. We intend to address these problems with our C-Sheep system (see figure 2), a re-imagination of the "Karel the Robot" paradigm using the modern 3D computer game graphics that today's students are familiar with, our aim being to motivate students to take up programming and to provide them with an enjoyable experience at the same time.

Figure 2: "The Meadow" and the C-Sheep UI (running on a low-spec graphics card)

2. TEACHING WITH ROBOTS IN VIRTUAL WORLDS

It is generally understood that programming cannot be learned from reading books on the subject alone, but only by practicing it, by actually writing programs on a computer. The question that therefore needs to be answered is: how can students be motivated to practice programming?

A suitable answer to this question is provided by the "Karel the Robot" paradigm, which has been reported to have proven itself highly successful [12, 20].

2.1 The "Karel the Robot" Paradigm

The "Karel the Robot" paradigm consists of the use of a mini-language [5] that provides a small number of instructions and allows users to take control of virtual entities acting within a micro world. It is named after the very successful "Karel the Robot" program [17], one of the most widely known computer science teaching tools, which uses a program structure based on the syntax of the Pascal programming language. Untch describes Karel as "essentially a programmable cursor that can move across the flat world" of a 2D grid with obstacles (walls) that cannot be passed and objects (beepers) that can be placed in or removed from the micro world [21]. The instructions found in mini-languages are usually a set of actions to be taken by the virtual entity in the virtual environment, as well as a set of (sensor) queries providing information about the immediate surroundings of the virtual entity in the micro world it inhabits. The success of the "Karel the Robot" paradigm is based on a number of factors:

• By providing a game-like setting for the task of computer programming, the students' imagination is captured and their interest is maintained. Students are motivated to spend more time programming and are rewarded with an enjoyable experience. In our experience, this heightened motivation is likely to have the side effect that fewer students question the relevance of the tasks they have been set, a problem often faced by educators when students are confronted with toy problems in programming exercises.

• The graphical representation of the micro world provides instant visualisation of the algorithms used in the programs controlling the virtual entities: their position and orientation within the virtual world show the current state of the program. This is especially useful as many problems faced by novice programmers can possibly be traced back to an inadequate understanding of program state [9]. The visual feedback is invaluable to the understanding of how a given algorithm works, where in the program potential errors occur and, consequently, how these errors can be debugged.

• Data required by programs is less complex than it would be in a real-world programming language. This is achieved by scaling down the data representation to the visual program state: the mini-language uses minimal syntax and is kept variable-free to provide an environment with minimal complexity, making this program state "more real to ... students than collections of alphanumeric values stored in aliased memory locations" [6].

2.2 Shortcomings of 'Traditional' Mini-Languages

Despite the pedagogical benefits described above, the traditional mini-languages have several shortcomings. While existing systems may still be relevant, we believe that they are now severely outdated. Whereas two decades ago students would be intrigued by the 2D top-down representation of the micro world (often restricted to ASCII characters in text mode), we are convinced that it is extremely difficult - not to say impossible - to maintain the interest of students from today's "Plug&Play generation" by using a text-based graphical representation. This assumption is reinforced by the negative student reaction to the 2D top-down representation of the Robocode system [14], reported by Bierre et al. [4]. To prevent mini-language based teaching from becoming obsolete, existing systems cannot just be recycled but must be updated. While traditional mini-language systems have employed a strictly 2D top-down representation, more recent offerings also use pseudo-3D graphics (using isometric projection). Surprisingly few examples, such as Alice or the MUPPETS system [8, 18], use true 3D graphics, which are "attractive and highly motivating to today's generation of media-conscious students" [16], for their micro world representation. It is this use of the visual gimmickry of modern computer games for representing the virtual world, however, which is most likely to help with meeting the high expectations of the "Plug&Play generation".

A different problem is often the choice of programming language on which the mini-language is based. Real-world programming languages often require a lot of work and understanding before meaningful results are obtained, which tends to frustrate impatient students. On the other hand, very abstract 'toy languages' that could be used as an alternative, and which offer immediate results while requiring only little understanding, are often too far removed from real-world systems to appear relevant. We believe that the solution in this case must be a sort of 'toy language' which is not purely a learning instrument, but both simple and close to real-world systems.

We aim to overcome these problems with our C-Sheep system: the C-Sheep programming language, "The Meadow" virtual environment and the C-Sheep library.

3. C-SHEEP

C-Sheep replaces robots with sheep and places them into a 3D computer-game-like virtual environment instead of a simple 2D grid. As the course that C-Sheep was designed for uses the C programming language [11], the C-Sheep language has been designed as a subset of ANSI C. Within the confines of this C subset, C-Sheep implements the control structures that are required for teaching the basic computer science principles encountered in structured programming [3], these being the (unconditional) sequence, conditional statements and loops. In terms of C, the control structures available in C-Sheep are the block, the if and if-else alternatives, as well as while and do-while loops. C-Sheep supports the declaration and use of variables. While this might be considered a potential problem by some, since it adds a further level of complexity to the mini-language, others have observed that a lack of variables will lead to problems in the transition to real-world languages [21]. C-Sheep allows the use of variables in arithmetic expressions, which may be useful to track objects' histories in the virtual environment (as non-visual states), something that has been recognised as possibly beneficial [9]. Like C, C-Sheep has mechanisms for the definition of sub-routines, which may be called recursively. Pre-defined actions (instructions for controlling sheep entities in "The Meadow" virtual environment) and (sensor) queries that can be performed by the sheep entity are disguised as library functions: to introduce novice programmers to the concept of code modularisation and libraries, C-Sheep programs must contain an include statement to (supposedly) parse the function prototypes in order to access library functions, while internally these functions are actually intrinsic to the virtual machine. Some of these sheep-specific instructions allow the querying of states in the virtual world (e.g. the current state of the weather - see figure 3). These world states can be altered interactively by the user (while C-Sheep programs are running), adding a separate layer of interactivity to the learning game. By instigating a state change in the virtual environment, the user can cause different sections of a C-Sheep program to be executed, allowing experimentation with different behaviours of the sheep entity from within the same C-Sheep program. Dann et al. state that this use of 3D computer graphics to represent program state is "intrinsic in the natural way to view the data itself" [9].

Figure 3: Weather in "The Meadow"
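To illustrate the flavour of the language, a small C-Sheep style program is sketched below. The include file name, the action and query functions (move, turn_left, weather) and the weather constant are hypothetical placeholders, as the paper does not list the actual library interface; the control structures shown (while, if-style branching, variables, sub-routines) are those named above.

    #include "sheep.h"              /* hypothetical library header */

    /* user-defined sub-routine: walk a given number of steps */
    void walk(int steps) {
        int i = 0;
        while (i < steps) {
            move();                 /* hypothetical pre-defined action */
            i = i + 1;
        }
    }

    int main(void) {
        /* graze in a square until it starts raining (hypothetical query) */
        while (weather() != RAINING) {
            walk(4);
            turn_left();            /* hypothetical pre-defined action */
        }
        return 0;
    }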

The entities controlled by C-Sheep programs exist in the 3D virtual environment of "The Meadow" virtual world. This 3D game-like representation is essential to interest students from the "Plug&Play generation" in computer programming. "The Meadow" is based on our proprietary "Crossbow" game engine [15]. The Crossbow Engine is a compact game engine which is flexible in design and offers a number of features common to more complex engines. Designed specifically for "The Meadow", it incorporates a robust virtual machine for executing C-Sheep programs which is based on the ZBL/0 virtual machine [1].

The C-Sheep system also includes a counterpart library (in the current stage of development only visualising C-Sheep programs in a 2D top-down view) for programs written in the C programming language. The functions provided by this library mirror the C-Sheep instructions for the virtual entities. As suggested by Untch, the purpose of such a library is to simplify the migration from the educational mini-language to real-world systems [21], in our case from C-Sheep to the C programming language. Using the library, C-Sheep programs can be compiled into an executable using a normal off-the-shelf C/C++ compiler, allowing novice programmers to make an easy transition from the C-Sheep system to C. The compiled executable can then be run from within the native working environment of the operating system.

4. SUMMARY AND FUTURE WORK

Computer programming is an essential skill for software developers and as such is always an integral part of every computer science curriculum. However, even if students are pursuing a computer science related degree, it can be very difficult to interest them in the act of computer programming, the writing of software, itself. To address this problem we have presented C-Sheep, an educational system for the teaching of computer science principles and a tool for learning the basics of the C programming language. It contains only a small number of reserved words from the C programming language, acting as a mini-language subset. C-Sheep follows in the tradition of mini-languages, started by "Karel the Robot", but employs a state-of-the-art games rendering engine for algorithm visualisation. C-Sheep consists of a task-specific set of instructions and queries which allow users to control virtual entities (in the case of C-Sheep, sheep) within "The Meadow" 3D virtual environment, the micro world which they inhabit. C-Sheep uses a real-world language (ANSI C) as its basis and also provides a counterpart library for easy migration from the C-Sheep system to real-world compilers. This is where C-Sheep differs from other educational systems, as even those that use a more C/C++- or Java-like system are often severely limited in the syntax they use - mainly because their (often variable-free) design explicitly tries to remove language complexity.

There are several directions that we are planning to explore in future versions of the C-Sheep system. Greater interactivity (including communication among several sheep), additional in-game objects and entities (like a sheep-dog) and an integrated development environment with a JIT (just-in-time) compiler are possible extensions to the system. The current prototype uses colour-coded bitmap images to store game levels, and C-Sheep program source code is written in a normal text editor, so better - possibly integrated - authoring tools for creating different scenarios would be a highly desirable improvement. A further refinement of the system would be the provision of more comprehensive documentation (in addition to the current language specification). Finally, feedback from using C-Sheep in the classroom will hopefully allow us to fine-tune the system. The C-Sheep system itself has not yet been used for teaching. So far students have only been exposed to a problem solving and algorithm design exercise (on the white-board) using the C-Sheep language. This exercise was very well received by the students, which leads us to believe that our system is suitable for the task it was designed for, but conclusive proof will only be available after we have collected data from a trial run of the C-Sheep system. For this we are planning to introduce the current prototype of the C-Sheep system as a teaching tool for the first term of the first-year computer programming unit at the National Centre for Computer Animation (Bournemouth University).

Further information on the C-Sheep system is available at http://ncca.bmth.ac.uk/eanderson/C-Sheep/

5. ACKNOWLEDGEMENTS

First and foremost we would like to express our gratitude towards our supervisor, Prof. Peter Comninos. Without his support this project would not have been possible. We would also like to thank our colleagues, especially Olusola Aina, for their comments and suggestions that have contributed to this project. Finally we need to mention Dominic Halford. It is his "fault" that our programs control sheep instead of other animals.

6. REFERENCES

[1] E. F. Anderson. An NPC behaviour definition system for use by programmers and designers. In Proceedings of CGAIDE 2004, pages 203-207, 2004.
[2] T. Beaubouef and J. Mason. Why the high attrition rate for computer science students: some thoughts and observations. ACM SIGCSE Bulletin, 37(2):103-106, 2005.
[3] C. Böhm and G. Jacopini. Flow diagrams, turing machines and languages with only two formation rules. Communications of the ACM, 9(5):366-371, 1966.
[4] K. Bierre, P. Ventura, A. Phelps, and C. Egert. Motivating OOP by blowing things up: an exercise in cooperation and competition in an introductory java programming course. In SIGCSE '06: Proceedings of the 37th SIGCSE technical symposium on Computer science education, pages 354-358, 2006.
[5] P. Brusilovsky, E. Calabrese, J. Hvorecky, A. Kouchnirenko, and P. Miller. Mini-languages: A way to learn programming principles. Education and Information Technologies, 2(1):65-83, 1997.
[6] D. Buck and D. J. Stucki. JKarelRobot: a case study in supporting levels of cognitive development in the computer science curriculum. In SIGCSE '01: Proceedings of the thirty-second SIGCSE technical symposium on Computer Science Education, pages 16-20, 2001.
[7] L. Carter. Why students with an apparent aptitude for computer science don't choose to major in computer science. ACM SIGCSE Bulletin, 38(1):27-31, 2006.
[8] S. Cooper, W. Dann, and R. Pausch. Alice: A 3-D tool for introductory programming concepts. Journal of Computing Sciences in Colleges, 15(5):107-116, 2000.
[9] W. Dann, S. Cooper, and R. Pausch. Making the connection: Programming with animated small world. In Proceedings of the 5th annual SIGCSE/SIGCUE ITiCSE conference on Innovation and technology in computer science education, pages 41-44, 2000.
[10] T. Jenkins. The motivation of students of programming. In ITiCSE '01: Proceedings of the 6th annual conference on Innovation and technology in computer science education, pages 53-56, 2001.
[11] B. W. Kernighan and D. M. Ritchie. The C Programming Language. Prentice Hall, 1988.
[12] K. L. Krause, R. E. Sampsell, and S. L. Grier. Computer science in the air force academy core curriculum. In SIGCSE '82: Proceedings of the thirteenth SIGCSE technical symposium on Computer science education, pages 144-146, 1982.
[13] E. Lahtinen, K. Ala-Mutka, and H. Järvinen. A study of the difficulties of novice programmers. ACM SIGCSE Bulletin, 37(3):14-18, 2005.
[14] S. Li. Rock 'em, sock 'em Robocode! IBM developerWorks: Java technology - http://www-106.ibm.com/developerworks/library/j-robocode/, 2002.
[15] L. McLoughlin and E. F. Anderson. I see sheep: A practical application of game rendering techniques for computer science education. Poster at Future Play '06 Conference, 2006.
[16] B. Moskal, D. Lurie, and S. Cooper. Evaluating the effectiveness of a new instructional approach. ACM SIGCSE Bulletin, 36(1):75-79, 2004.
[17] R. E. Pattis. Karel the Robot, a Gentle Introduction to the Art of Programming. John Wiley and Sons, 1981.
[18] A. M. Phelps, K. J. Bierre, and D. M. Parks. MUPPETS: multi-user programming pedagogy for enhancing traditional study. In CITC4 '03: Proceedings of the 4th conference on Information technology curriculum, pages 100-105, 2003.
[19] S. Reges. Back to basics in CS1 and CS2. In SIGCSE '06: Proceedings of the 37th SIGCSE technical symposium on Computer science education, pages 293-297, 2006.
[20] E. Roberts. Strategies for encouraging individual achievement in introductory computer science courses. In SIGCSE '00: Proceedings of the thirty-first SIGCSE technical symposium on Computer science education, pages 295-299, 2000.
[21] R. H. Untch. Teaching programming using the Karel the Robot paradigm realized with a conventional language. On-line at: http://www.mtsu.edu/~untch/karel/karel90.pdf, 1990.

Coalition in Auctions Using Shapley Index

M. Oussalah and B. Hayet
University of Birmingham, Electronics, Electrical and Computer Engineering
Edgbaston, Birmingham B15 2TT
[email protected]

ABSTRACT

Gaming has become a huge market for industry as well as for social science and economics, where it is used as a tool to quantify and analyse the rational behaviour of different actors. This paper addresses the use of coalition formation in the context of an auction-based market. In particular, a stochastic prediction model for coalitions is developed which makes use of Shapley values. The design and implementation of a game platform involving four players, with advanced chat functionalities in order to quantify players' preferences regarding various attributes, is also investigated. The developed tool can be compared to advanced auction-based systems like eBay, among others.

Categories and Subject Descriptors

Distributed Artificial Intelligence, Social and Behavioral Sciences, Analysis of Algorithms and Problem Complexity.

General Terms

Algorithms, Performance, Design, Economics.

Keywords

Game theory, Shapley value, coalition, auction.

1. INTRODUCTION

Game theory can be viewed as an approach to the study of human behaviour which aims to enhance social economics by looking at the various players' strategies to defeat an opponent or to reach consensus. In particular, the meaning of rational behaviour and of a game are still important concepts in game theory. Since the pioneering work of John von Neumann and Morgenstern [14], "games" have become a metaphor for a wide range of human interactions in which the outcomes depend on the interactive strategies of two or more people. There are therefore many questions and issues about how people would react in different situations. In this respect, game-theoretic models also help us understand various forms of human interaction in real-life situations. In each game, one needs to identify: i) the set of players (the participants in the game); ii) the possible actions that each player can take (the player strategies); iii) the gains or losses incurred by each player as a result of each combination of strategies (the payoffs); iv) the dependence or independence among players. In N-person games [10], each player has a fixed number of strategies, and each outcome has a payoff and is determined only by the players' choices. Importantly, the players behave rationally and the goal of each player is to maximise payoff. But it is the combination of strategies by each player, and possibly coalitions, that determines the outcome. A large part of N-person game theory involves cooperation between players, since more than two players exist. This gives rise to coalition formation between players in order to maximise their payoff. The issue of coalition formation and of its size are important concepts in this framework.

This paper attempts to use concepts from game theory, especially the Shapley value [12], to monitor the coalition formation and development that may occur in auction-like scenarios. In conjunction with the Shapley value, the paper suggests a stochastic model for predicting the success of a possible coalition based on the elicitation of players' preferences concerning the different attributes of the context of interest. In addition, we investigate the design and implementation of an up-to-four-player platform which incorporates chat functionalities and networking between the different players. Monitoring and prediction of the outcome will be examined using the game-theoretic model by predicting the success/failure probability of the coalition.

2. COALITION FORMATION

Coalition formation is a vital part of the developed game [4, 5]. It is this concept of game theory that the developed game is particularly based upon. If there are n > 2 players in the game, then there might be cooperation between some, but not necessarily all, of them. One may ask, for instance, which of the coalitions are likely to form and what the relative bargaining strengths of these coalitions are. Label the players 1, 2, ..., n. A coalition of players, S, is then a subset of N = {1, 2, ..., n}. In the worst eventuality, all the remaining players unite and form a single opposing coalition N\S. This then becomes a two-person non-cooperative game and we can calculate the maximin criterion.

Let v(S) denote the maximum value that coalition S can guarantee itself by coordinating the strategies of its members, no matter what the other players do. This is called the characteristic function, and it determines the strength of the possible coalitions. However, given the possibility of the formation of a given coalition, how should v(S) be shared between its members? The distribution of individual rewards will obviously affect whether any coalition is likely to form. Each individual will tend to join the coalition that offers him the greatest reward. If a player feels that joining a coalition is unlikely to benefit him, he is unlikely to become part of one. Once again this brings up the issue of players aiming to maximise their payoff [14]. So it is obvious that many aspects of game theory are linked. These will be evaluated at a later stage to see if this is demonstrated through the players' behaviours. However, it is still necessary to investigate how the total gains or losses of the coalition will be distributed. A number of theories exist which try to explain this; the most popular is the Shapley value, which will be detailed later on. Nevertheless, from an experimental psychology perspective, several theories have been put forward to explain, model and monitor the formation of coalitions in game theory:

- The Minimum Resource theory of Gamson [6] is based on the assumption of a "parity norm" which specifies that rewards be divided in direct proportion to the resources of the coalition members. Assuming that individuals are motivated to maximize their share of the reward, the theory predicts the formation of the coalition that minimizes joint resources and is just large enough to win ("the cheapest winning").

- The Minimum Power theory is based on an index of pivotal power proposed by Shapley and Shubik (1954). The pivotal power of a player is determined by dividing the number of times a person's resources (votes) are "pivotal" (in the sense that a losing coalition is converted into a winning one by adding this person to the coalition) by the total number of permutations of the players. If players are motivated to maximize their rewards, and if they believe that rewards should be divided in direct proportion to pivotal power [6], the theory predicts that players will attempt to form a coalition that minimizes the coalition's total pivotal power. This approach is very similar to Minimum Resource theory; the focus, however, is on pivotal power rather than resources.

- The Bargaining theory of Komorita and Chertkoff [8] is based on the assumption that, in a given coalition, those members who are "strong" in resources (above average) will expect and demand a share of the rewards based on the parity norm, while those who are "weak" (below average) will demand equality. For an iterated game over trials, the theory makes differential predictions for the initial trial and at the asymptotic level. For the initial trial, the theory predicts that the rewards allocated will be the average of those prescribed by the parity and equality norms. At the asymptote, the theory predicts that, for a given coalition, rewards will be divided in direct proportion to each member's maximum expectation in alternative coalitions. The theory postulates that the most likely coalition to form is the one that minimizes the coalition members' temptation to defect.

- The Weighted Probability model of Komorita [7] assumes that, because of the logistic problem of communicating offers and counteroffers, large coalitions are more difficult to form than small ones. As the number of potential coalition members increases, the severity of the problem of achieving both reciprocity and unanimous agreement on the terms of the offer also increases. The number of potential defectors from the coalition also increases with its size; hence, a large coalition is not only more difficult to form, but may also be more difficult to maintain. These hypotheses are supported by the inverse relationship between group size and the cohesiveness of a group reported by Cartwright and Zander [3]. In contrast to Bargaining theory, the Weighted Probability model assumes that an individual's share of the prize should be a function of the number of alternative winning coalitions available to him/her, rather than of the quality of these alternatives.

3. SHAPLEY VALUE

Since some players may contribute more to the coalition than others, the question arises of how to fairly distribute the gains among the actors. In other words, how important is each actor to the overall operation, and what payoff can they reasonably expect? This is an important issue within the game. The Shapley value [12] addresses this issue by allotting gains and losses proportionally to the contribution of each player to the coalition. Moreover, the importance of Shapley's index has been strengthened by Aumann's finding [1] regarding its asymptotic properties, in the sense that as the number of participants grows, the Shapley value converges to the competitive equilibrium allocation.

More formally, the Shapley value for an agent i corresponds to the sum of its marginal contributions to all of the coalitions which can be formed to contain that agent, up to and including the 'grand coalition' comprising all agents. In other words, let I be the grand coalition, n the number of agents, and k the cardinality of a given coalition K, K ⊂ I. Then the Shapley value for each agent i is the unique payoff

v_i^{\upsilon} = \sum_{K \subset I,\ i \notin K} \frac{(n-k-1)!\,k!}{n!}\,\bigl(\upsilon(K \cup \{i\}) - \upsilon(K)\bigr)    (1)

In the preceding, the quantity \upsilon(K) - \upsilon(K \setminus \{i\}) represents the increment in the total worth of the coalition K due to the entry of agent i, while [(n-k)!\,(k-1)!]/n! represents the probability that, in a random build-up of the grand coalition I, agent i will be the next agent to join a coalition containing the first k-1 members. The Shapley value thus identifies for each player i its expected marginal contribution to each of the coalitions in which player i is involved. The use of the Shapley value allows us to have the total gains and losses incurred within a coalition distributed proportionally to the contribution made by each player within that coalition.
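A brute-force sketch of equation (1) is given below, representing coalitions as bit masks; the characteristic function shown is the four-player majority-voting example discussed later in this section (51/51/51/47 votes, threshold 101), and the names and structure are illustrative only.

    #include <cstdio>

    // (n - k - 1)! k! / n! weighting of equation (1), via a small factorial helper.
    static double factorial(int x) { double f = 1; for (int j = 2; j <= x; ++j) f *= j; return f; }

    int main() {
        const int n = 4;
        const int votes[n] = {51, 51, 51, 47};        // players A, B, C, D
        auto worth = [&](unsigned mask) {             // winning coalition -> 1, else 0
            int total = 0;
            for (int p = 0; p < n; ++p) if (mask & (1u << p)) total += votes[p];
            return total >= 101 ? 1.0 : 0.0;
        };

        for (int i = 0; i < n; ++i) {
            double phi = 0.0;
            for (unsigned mask = 0; mask < (1u << n); ++mask) {
                if (mask & (1u << i)) continue;       // only coalitions K without i
                int k = 0;
                for (int p = 0; p < n; ++p) if (mask & (1u << p)) ++k;
                double weight = factorial(n - k - 1) * factorial(k) / factorial(n);
                phi += weight * (worth(mask | (1u << i)) - worth(mask));
            }
            std::printf("Shapley value of player %c = %.3f\n", 'A' + i, phi);
        }
    }

For this game the sketch yields 1/3 for each of A, B and C and 0 for D, matching the observation below that D is completely without power.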

From an axiomatic viewpoint, notice that the worth function υ is such that i) υ(∅) = 0; ii) υ(S ∪ T) ≥ υ(S) + υ(T) for all disjoint sets of players S and T. Accordingly, it is easy to check the following properties of the Shapley index v_i:

i) v_i^υ ≥ υ({i}), i.e., each player gets at least the worth he would get if he had not entered any coalition;

ii) Σ_{i∈I} v_i^υ = υ(I), i.e., conservation of the total worth;

iii) v_i^υ is invariant to any permutation of the order of the players;

iv) if players i and j are equivalent under υ, i.e., υ({i} ∪ S) ≥ υ({j} ∪ S) with i ∉ S, j ∉ S, then v_i^υ = v_j^υ;

v) additivity: if two coalition games with worth functions υ1 and υ2 are combined, then v_i^{υ1+υ2} = v_i^{υ1} + v_i^{υ2}.

It has been proven that the Shapley index is unique under the constraints of axioms ii)-v). This provides an axiomatic characterization of Shapley's index, as pointed out by Neyman [9]. For example, depending on the bargaining power of each player, the chances of a coalition winning can be identified. A winning coalition is given a value of 1.0 and a losing coalition is given a value of 0.0. Bargaining power simply reflects how players arrive at an agreement based on their own assets [5, 6, 13].

Initially, since all players have the same capital and no properties, the 'bargaining power' of each is the same. As they progress through the game, the bargaining power will begin to change. The example below shows how this theory may be applied in a practical context. Consider the 4-player game: A has 51 votes, B has 51 votes, C has 51 votes and D has 47 votes. Assume, for instance, that a majority vote (101/200) is needed to pass a proposal. Any winning coalition has to contain two of A, B or C in order to control a majority of votes. Such a coalition will be assigned 1.0 to identify it as a winning coalition. D alone with A (or B, or C) only controls 98 votes, and so is not a winning coalition and is therefore assigned 0.0. But in a coalition with two other players (such as {A, B, D}), D is completely superfluous: the coalition could win just as easily without D. D has nothing to bargain with, and is completely without power in this game, even though it has only slightly fewer votes than its companions.

The above provides us with a method of working out the 'bargaining power' of each player, and therefore the possibility of a coalition occurring. However, it does not indicate what the size of the coalition might be. To deal with this issue, the theory of the minimal winning coalition can be used. The latter is based on Riker's size principle [11], which stipulates that a winning coalition is minimal in the sense that any member's defection will reduce it to a non-winning size. This is based on the common-sense idea that if a coalition is large enough to win, it should avoid taking in new members, because these new members would demand a share of the pay-off. So, the optimal size of the coalition is the one that defeats its opponents in terms of awarded profits. If this occurs, no further rounds need to be undertaken by the players and the game ends by notifying the winning coalition to all other players.

4. PREDICTION OF COALITION OCCURRENCE

As far as the formation of coalitions is concerned, the prediction capability is an important issue; namely, to what extent can the observer predict that a coalition is likely to form, for instance between player 1 and player 2? The approach developed here is somewhat similar to that developed by Axelrod [2]. In the latter, each player describes his preference x_i (ideal point) on a cardinal scale represented in some Euclidean space. Therefore, the smaller the distance between the ideal points of the players on a policy scale, the closer are their preferences, and therefore the more likely they are to form a coalition with each other.

Formally, consider a set N = 1, 2, ..., n of players. The preference, or ideal point, of each player, x_i (with i = 1, ..., n), is assumed to be distributed on a cardinal scale in a Euclidean space, such that 0 ≤ x_1 ≤ x_2 ≤ ... ≤ x_n ≤ 1. The difference in preferences between two players, i and j, can hence be measured by the distance on the cardinal scale separating the preferences of the two players: d_{i,j} = |x_j - x_i|. Note that, as the choice might be subject to several attributes, the quantity x_i is represented as a vector where each component corresponds to the preference intensity for the corresponding attribute. Therefore, each player's preference can be placed on a linear scale and represents a coordinate in the m-dimensional Cartesian space R^m (in the case where m attributes are used). It is common to use a weighted distance structure in case one desires some attributes to have more weight than others; that is,

\|x_i - x_j\| = \sqrt{p_1 (x_i^1 - x_j^1)^2 + \dots + p_m (x_i^m - x_j^m)^2},    (2)

where the weights p_i sum up to one.

Coalitions are therefore more likely to form among players sharing similar preferences. Hence, the closer the preferences (i.e. the smaller the distance) between players, the more likely they are to reach an agreement.

u =1−

Based on this general principle, the probability of a coalition may be formalised in various ways, depending on the underlying assumptions made as regards the mechanisms of coalitionformation. So, assuming the probability depends only on the distance between (extreme) players of the coalition C exclusively. The weight W(i,j), defined by

vin υ =

viυ max vυi j

(4)

and

vn =

1 N

N

∑v

υ i

(5)

i =1

Accordingly u is here inversely linked to the variance of the normalized Shapley values of the players. •

The rational behind representation (3-4) is that the parameter u is quantified as inversely related to the spread of the normalized Shapley values. The normalization is motivated by the desire to ensure a value of u within the unit interval. Consequently, in case of same Shapley values for all players, indicating that each player does have equal contributions to all possible coalitions that he can take part to, then the value of u takes its maximum value 1. Clearly, the latter corresponds to a high rigidity situation in which the agreement among the players is difficult to achieve since there is no substantial gain rewarded to the player if he joins any coalition. Otherwise, if the overall spread among the Shapley values is important in the light of (3-4), then some players would be rewarded substantial gain if they decide to join some coalitions, which provides more flexibility and willingness to form coalitions, which, in turn, lowers the value of u.



It should also be noticed that the suggestion (3-4) assumes a normalized value of u, while in the literature it is taken open within the set of positive real numbers. Clearly, the rational for the normalization is only justified by the readability of Shapley values and commensurability purposes.



Expression (3) arises concern about the preference structure that needs to be used in this framework. Strictly speaking as far as the cardinal scale is concerned, the choice is very limited and excludes most of ordinal and interval type preference developed in decision making theory [14]. In contrast to Shapley value where the calculus requires enumeration of all possible scenarios, which is therefore computationally expensive, the calculus of coalition probability relies on a single scenario of a coalition. Therefore, given the scale of players’ intensity of preferences, the decision-maker can speculate on a possible scenario (coalition), which is likely to maximize expression (3). For this purpose, it is enough to rank the preference intensities within appropriate scales. While the Shapley values used in the calculus of u are related to previous stages. Nevertheless this arises the issue of optimal communication links within the

The probability of agreement between two players is given by

    W(i, j) = exp(−u · d_{i_e j_e}) = exp(−u · max_{i,j∈C} ||x_i − x_j||),   (3)

where i_e and j_e denote the extreme players of the coalition C in terms of preferences. Expression (3) determines the probability that players i and j reach an agreement in a negotiation process, the chance of a positive outcome being greater when their respective positions are close to each other on the preference scale. The parameter u ≥ 0 is a measure of the flexibility in the negotiation between players: the higher the value of u, the less likely agreement among all players becomes, and, generally, players close to each other on the scale are more likely than players distant from each other to reach an agreement. In the boundary case u → ∞, only players with an identical position (i.e. d_{i,j} = 0) will reach an agreement; when u = 0, any two players will agree irrespective of their particular preferences, as if compelled by some external factor. Hence, u → ∞ corresponds to infinite rigidity and u = 0 to infinite flexibility. In general, in the absence of prior information, any strictly positive and finite value is a candidate for u. Allocating a specific value to u is context dependent and strongly influenced by the willingness of the players to reach a common agreement; a suggestion based on the Shapley value is given in (4) below. For fixed u ≥ 0, the probability of reaching agreement depends only on the respective preferences of the players. Hence, two players i and j with identical preferences have an identical ideal point, and will therefore always reach an agreement: W(i, j) = 1 when x_i = x_j.

Notice from representation (3) that the probability of success of the coalition depends mainly on the distance between the extreme players in terms of preferences. Therefore, a fully homogeneous coalition in which all players have equal preferences yields the maximum value of one for expression (3); otherwise, the more the coalition contains players whose preferences are far apart, the more the probability decreases towards zero. The parameter u, which quantifies the flexibility of the negotiation, can be guessed from past history. A potential guess is based on the Shapley value, which provides some hint of the impact of adding a given player to a coalition: the spread of the normalized Shapley values v_1, ..., v_N,

    (1/(N − 1)) Σ_{i=1}^{N} (v_i − v̄)²,   (4)

where v̄ denotes their mean, and u is taken to be inversely related to this spread, as discussed above.
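To make the construction concrete, the following is a small numerical sketch, not the authors' implementation, of the coalition probability in (3) and the Shapley-value spread in (4). It assumes scalar preference positions on a normalized scale and, purely for illustration, takes u as one minus the spread; the class and method names are invented for the example.

// Illustrative sketch (not the authors' code) of the coalition probability in (3)
// and the spread of the normalized Shapley values in (4).
public class CoalitionProbability {

    // W(C) = exp(-u * max_{i,j in C} |x_i - x_j|) for scalar preference positions.
    static double coalitionProbability(double[] preferences, double u) {
        double maxDistance = 0.0;
        for (int i = 0; i < preferences.length; i++)
            for (int j = i + 1; j < preferences.length; j++)
                maxDistance = Math.max(maxDistance, Math.abs(preferences[i] - preferences[j]));
        return Math.exp(-u * maxDistance);
    }

    // Spread of the normalized Shapley values v_1..v_N, as in (4).
    static double shapleySpread(double[] v) {
        double mean = 0.0;
        for (double vi : v) mean += vi;
        mean /= v.length;
        double spread = 0.0;
        for (double vi : v) spread += (vi - mean) * (vi - mean);
        return spread / (v.length - 1);
    }

    public static void main(String[] args) {
        double[] prefs = {0.2, 0.3, 0.8};        // preference positions on a normalized scale
        double[] shapley = {0.25, 0.25, 0.5};    // normalized Shapley values (sum to 1)
        double u = 1.0 - shapleySpread(shapley); // one possible "inverse spread" choice (an assumption)
        System.out.println("u = " + u + ", W = " + coalitionProbability(prefs, u));
    }
}

With the values shown, the three normalized Shapley values are fairly even, so u stays close to 1 and the coalition probability is dominated by the largest preference distance.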






5. GAME DESIGN
5.1 System Architecture
Figure 1 exhibits the implementation of a four-player game architecture in which the players can communicate freely before deciding on possible formation of coalitions.

Figure 1. System overall architecture. (The figure shows the four players, each with a PlayerFrame (P) communicating through a PlayerThread (PT) on the Server using an XML-based communications protocol. Each player frame gets its configuration from a player.xml file; the PlayerFrame class has an instance of the Player class, which sends and receives socket messages. The Server gets its configuration from Server.xml, initializes a port and a socket, and listens for connection requests; game data is supplied by an instance of the Game class, which reads Game.xml. Each player has a PlayerThread class processing messages on a separate thread spawned by the server; this class holds the socket that sends and receives the messages.)



The aim of the Server class is to form connections with players and then perform the appropriate actions, such as processing a bid or adding a player to a coalition. The aim of the PlayerFrame class is to represent the main window for each player; it also connects to the server so that the appropriate actions, such as bidding, can be performed. Beyond defining these two classes, we had to work out how Server would communicate with multiple instances of PlayerFrame, and how PlayerFrame could receive messages while simultaneously allowing the user to interact with the frame. This suggested the use of two thread classes, PlayerThread and Player.

The PlayerThread class represents the connection thread for a player; it handles incoming and outgoing requests for that player and is controlled by the server object. The Player class, in turn, handles communication between a player and the Server. In order for Server to be able to handle multiple players simultaneously, it needs to spawn a thread containing the PlayerThread class for each player. Likewise, each PlayerFrame needs to spawn its own thread containing Player. This leads to a system in which all of the Player instances communicate with all of the PlayerThread instances, each class invoking its parent's methods according to the messages it receives. Next, a flexible communication protocol needed to be implemented. The natural choice for transporting the protocol was XML, because XML messages are human-readable and Java has classes that make them easy to parse. The parsing code was relatively simple to write: it converts strings into XML documents, loads XML files into XML documents, and then retrieves the values of the XML nodes. The XML Utility class grew out of this concept, to avoid duplicating code across classes. Once the controls were in place, we added the logic to send requests via XML to Server whenever a control was used (e.g. sending a bid, chatting with another player, or joining a coalition). These XML messages were exchanged between the PlayerFrame and the Server; when a player selected a specific control, an appropriate method was invoked. During the implementation of the Server, chat functionality was included to allow players to communicate privately. This was necessary so that players could discuss strategies, the amounts to bid, attribute preferences, whether to form coalitions, and so on. As a final part of the design phase, we chose to create a class representing the player history. The StatsFrame class displays whether a player has won properties and, if so, their names and the amounts paid, so that the player can keep track of the investments they have made. It is accessible from each player window and is invoked when the 'stats' button is selected.
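As a rough illustration of the XML Utility class described above, the following sketch, with assumed class and method names rather than the project's actual code, converts a protocol string into a DOM document and retrieves a node value using the standard Java XML classes.

import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Minimal sketch of an XML utility of the kind described above; the class and
// method names are illustrative, not the authors' actual code.
public class XmlUtility {

    // Parse a protocol message such as "<message><type>bid</type><amount>120</amount></message>".
    public static Document parse(String xml) throws Exception {
        DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
        return builder.parse(new InputSource(new StringReader(xml)));
    }

    // Retrieve the text value of the first node with the given tag name ("type" -> "bid").
    public static String nodeValue(Document doc, String tag) {
        NodeList nodes = doc.getElementsByTagName(tag);
        return nodes.getLength() > 0 ? nodes.item(0).getTextContent() : null;
    }
}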

The rules for the game were integrated within the different classes to which they were relevant. The following rules were included:

• The game has four levels, and each player has an initial amount of capital to start with.
• In the first stage, no coalitions can be formed, and you must bid for a property without knowing how much the other players have bid.
• Every time you bid for a property, you lose that money, even if you do not win the property.
• From the second round onwards, players can form coalitions. Forming coalitions is important to defeat stronger opponents and gain mutual benefit.
• You can talk to players privately and decide on the best way to succeed.
• To win the game, you must have the most properties and must not be in debt.

5.2 Design of the coalition prediction


This allows us to formalize a methodology for calculating the probability of a coalition winning. Notice that the implementation of the Shapley value scheme is rather involved, since at each round the algorithm requires a given player i to contact all other players, which reply by transmitting their payoff functions and resource vectors. The calculation then requires the enumeration of all possible coalitions (2^n coalitions in the case of n players), and for each coalition the corresponding payoff is determined. Finally, all agents are contacted and informed of the result. Figure 2 exhibits the different stages in the formation and calculation of the coalition probability.
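The cost the text refers to can be seen in a direct implementation. The sketch below, which is illustrative and not the system's actual code, enumerates all 2^n coalitions (encoded as bitmasks) and computes each player's Shapley value for an assumed characteristic function.

import java.util.function.ToDoubleFunction;

// Exhaustive Shapley value computation, as described above; all names are illustrative.
public class ShapleyValue {

    static double[] shapley(int n, ToDoubleFunction<Integer> payoff) {
        double[] phi = new double[n];
        double[] fact = new double[n + 1];
        fact[0] = 1.0;
        for (int k = 1; k <= n; k++) fact[k] = fact[k - 1] * k;

        for (int i = 0; i < n; i++) {
            // Enumerate every coalition (2^n bitmasks) that does not contain player i.
            for (int mask = 0; mask < (1 << n); mask++) {
                if ((mask & (1 << i)) != 0) continue;
                int s = Integer.bitCount(mask);
                double weight = fact[s] * fact[n - s - 1] / fact[n];
                double marginal = payoff.applyAsDouble(mask | (1 << i))
                                - payoff.applyAsDouble(mask);
                phi[i] += weight * marginal; // weighted marginal contribution of player i
            }
        }
        return phi;
    }

    public static void main(String[] args) {
        // Illustrative characteristic function: a coalition's payoff is the square of its
        // size (purely an example, not the auction game's actual payoff function).
        double[] phi = shapley(4, mask -> Math.pow(Integer.bitCount(mask), 2));
        for (double p : phi) System.out.println(p);
    }
}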

6. TESTING
6.1 System description
The testing was carried out on four groups of 4 people. Each person was given a user manual explaining the rules of the game and its purpose, to make the players somewhat familiar with the game. The players were then instructed to begin the game, and each person was monitored and observed while playing. The following pseudo-code describes the overall functioning of the game.

Pseudo-code of the overall game
1. Initialization
   - Initialize member variables.
   - Load server and player information from the server XML file.
   - Open a socket and listen to players' messages.
2. Player Logon
   - Check whether the player already exists; if so, send an error message.
   - Otherwise, save the player name and associate it with an image.
   - Start a communication thread for the player.
   - Send a message to all players that a new player has logged on.
3. Player Logoff
   - Remove the player from the player list.
   - Send a message to all players to inform them of the logoff.
4. Remove Player from Coalition
   - Find the coalition the player belongs to and remove him.
   - Inform all other players.
5. Add Player to a Coalition
   - Make sure the coalition isn't full and that the player doesn't already belong to it.
   - Make sure the player doesn't belong to the other coalition.
   - If the player can't join, send the player a message.
   - Otherwise, add the player and inform the other players.
6. Start a New Round
   - Reset any holding variables for the round.
   - Send the updated capital to all players.
   - Send the new round information to all players.
7. Process a No-Bid from a Player
   - If the player is in a coalition, set a no-bid for that coalition and send a message to the coalition's members.
   - Otherwise, record a no-bid for this player.
8. Process a Bid from a Player
   - If the player is in a coalition, set a bid for that coalition and send a message to the coalition's members.
   - Otherwise, record a bid for this player.
   - Keep track of the highest bid and bidder.
   - Subtract the money from the player and any coalition members.
9. Process the End of a Round
   - Determine the winning player and send the players a message.
   - Determine the probabilities of winning for the round and send them to their respective players.
10. Get Player Statistics
   - Get the number of properties won.
   - Get the name and price of each property won.
   - Request the probability of success in joining a coalition.
   - Send the information to the player.
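As an illustration of how a single step of this pseudo-code might look in the Java implementation, the following sketch covers the solo-bid branch of step 8; the class, fields and method names are assumptions rather than the actual Server code.

import java.util.HashMap;
import java.util.Map;

// Hedged sketch of step 8 of the pseudo-code above ("Process a Bid from a Player").
public class BidProcessor {
    private final Map<String, Integer> bids = new HashMap<>();      // player -> current bid
    private final Map<String, Integer> capital = new HashMap<>();   // player -> remaining money
    private String highestBidder;
    private int highestBid;

    public void processBid(String player, int amount) {
        bids.put(player, amount);
        // Keep track of the highest bid and bidder, as in the pseudo-code.
        if (amount > highestBid) {
            highestBid = amount;
            highestBidder = player;
        }
        // Money is subtracted immediately: a bid is lost even if the property is not won.
        capital.merge(player, -amount, Integer::sum);
    }

    public String winner() { return highestBidder; }
}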

Figure 2: Calculation of the probability of coalition formation. (The figure shows the pipeline: chat ↔ preference elicitation → ranking of preference intensities → Shapley values → calculation of the coalition probability → update → coalition.)

As can be seen from Figure 2, the initial chat among the different players provides a first elicitation of the preferences with respect to each attribute. This allows the central server to determine a global ranking and, at the same time, the Shapley values associated with each player. Notice that individual players cannot accomplish the full ranking, since a player's view is only local and relies on the players the individual has chatted with, unless a single player has chatted with all the remaining players in the game. The Shapley values allow us to determine the parameter u of expression (2), and therefore the coalition probability. The latter can be updated by choosing another coalition which yields a higher probability W, or by simply going back to the chat and refining the individual players' preferences.


The example shown in Figure 3 describes the PlayerFrame, which corresponds to the window that the player is able to see. It gives each user information and a number of options through which they are able to make decisions.

Figure 4: Chat window

Each time communication takes place, it is logged to a text file called 'server.txt'; a text-mining subroutine is then used to identify key attributes and the player's preferences. This of course assumes that the players clearly mention the attributes in their chat and assign them a quantitative evaluation given as a numerical value. The game leaves the player free to join a coalition directly or to visualise the probability of success of a coalition first.
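Under the stated assumption that players name attributes and give numerical ratings in their chat, a text-mining subroutine of the kind described could be as simple as the following sketch; the attribute list, pattern and class name are illustrative, not the system's actual code.

import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Minimal sketch of the preference extraction described above: it scans chat lines
// for an attribute name followed by a number (e.g. "location 4").
public class PreferenceMiner {
    private static final String[] ATTRIBUTES = {"location", "price", "size"};

    public static Map<String, Integer> mine(Iterable<String> chatLines) {
        Map<String, Integer> preferences = new HashMap<>();
        for (String line : chatLines) {
            for (String attribute : ATTRIBUTES) {
                Matcher m = Pattern.compile(attribute + "\\D*(\\d+)",
                        Pattern.CASE_INSENSITIVE).matcher(line);
                if (m.find()) {
                    preferences.put(attribute, Integer.parseInt(m.group(1)));
                }
            }
        }
        return preferences;
    }
}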

Figure 3. Player window (with private chat boxes)

Figure 5. Figure showing "Joining a coalition"

However, players are unable to join as teams in the first round. If they attempt to do this, a 'Server Message' informs them of this, as part of the Help functionality implemented in the system.

Before a player can take part in the game, they have to log onto the system using a unique user name; if a chosen name is already in use, the player is informed. Once the player has logged onto the system, the player window is loaded.

After the game, short interviews were held with players to understand the process of thinking involved while they were playing. The testing generally showed that each player wanted to maximize his or her own payoff; when players did cooperate, it was mostly for their own benefit. This observation is also suggested by game theory and, in particular, n-person game theory, on which the game is based.

Figure 6. Bidding on a property

Each player has the ability to talk to the other players at any stage of the game. Through this they can discuss possible strategies and the amounts to bid on properties, and decide, based on the information they gather about each player, whether it would be beneficial to join as a team. For example, when Player 1 sends a message to Player 2, the message appears in the appropriate box, as shown in Figure 4.

However, players cannot bid the same amount; if a player attempts to bid the same as another player, a warning message is displayed. Each time a player or a coalition bids for a property, the probability of them winning the property is displayed in the server message box. This gives each player or coalition an idea of how the other players are performing and how much they may be bidding, and allows each player to change their strategy according to how they want to play the game. During the testing of the game, the probability produced by the system provided a realistic prediction of the chances of success of that individual or coalition within the game.


During the testing phase, it was revealed that in 90% of the cases the probability estimate of success in that round turned out to be correct.

- As suggested by the theory of maximum expected utility in rational game theory, each player entered the game with the goal of winning and maximizing his or her own payoff. Even when coalitions were formed, players only did so because they saw some benefit for themselves in terms of increasing their chance of winning and, therefore, their expected profit.

- An interesting revelation was that as soon as players began to win properties, they were reluctant to form coalitions, since they felt it would be more beneficial to bid for properties on their own. However, in many situations this led to their downfall, since opponents formed coalitions and were able to bid higher as a result.

- Some players adopted another interesting strategy: bidding a low amount for the first property. Through this they hoped to see who the stronger opponent was and then try to form a coalition with them in the next round, while communicating with other players about forming coalitions with them in the third round. This way, in the final round they knew that they had more capital than some players and were able to bid higher. However, this strategy was not very successful, since the stronger opponent was not always willing to form coalitions.

- Another observation was that some players who were in a coalition revealed to other players how much their coalition was going to bid, in order to defeat them at a later stage by learning how strong the different players in the game were.

- The preceding observations highlight the agreement of the players' behavior with the principle of maximizing expected utility or profit that each player is attempting to achieve. One aspect they do not capture as clearly, however, is the 'cheating' and 'traitors' within coalitions, where players would go to further lengths to maximize their payoff.

- From this, it is strikingly apparent how some of these situations exist in the real world today. Game theory uses 'games' as a metaphor to describe real-life situations as well as the underlying process of decision making involved in such scenarios. The developed platform merely highlights the strengths of the game-theoretic model in describing players' behaviors with respect to coalition formation.

Figure 7. Displayed winning probability.

Figure 8. Game winner message

The final version of the system can be compared to other systems available in the marketplace. One such example is eBay [15], an online auction where users can sell and buy products; our system is similar in that it also involves a bidding process for each property. Another example is Age of Mythology [16], produced by Microsoft. This is a much more complex game, but it employs some of the ideas behind our game: coalition formation is a major aspect of it, as players are able to form teams and through these teams defeat other opponents. Although the game does not predict the probability of coalitions winning, it does suggest the strength of each player through a rating system. Each time a player plays against other opponents and wins, points are gained; if the player loses, points are lost. The rating thus determines the strength of each player: the higher the number of points, the stronger the opponent. The game also requires players to think logically and strategically, since players do not know how their opponents will play. In a sense, the system created here is a simple version of Age of Mythology in terms of the theories used to create and play the game.

6.2 User Testing

As part of the system evaluation, user testing was carried out to determine the extent to which players behave in agreement with the game-theoretic model. As already indicated in the previous sections, tests were carried out on 4 groups of 4 players to evaluate the process of thinking involved when playing the game and the strategy each player employs; the main observations are those listed above.

7. CONCLUSION
The inclusion of game theory and coalition formation was successfully accomplished and implemented within an auction-based system where players (buyers) attempt to form coalitions to bid for houses. Players were able to form coalitions according to the strategy they employ and, to aid this, were given probabilities of how likely they were to win. The Shapley value was used in conjunction with a stochastic model based on preference intensities as a method of predicting coalition success and, therefore, of distinguishing the winning coalition. The gains or losses incurred by a coalition were also distributed in proportion to how much each member contributed within the coalition.

Through the testing procedures, it became clear that the players did perform as suggested by the game-theoretic model, in that a player's behaviour in joining or leaving a given coalition is driven only by the desire to maximize the total payoff.

Although it is difficult to portray the overall user-friendliness of the system, the results highlight the simplicity of the developed system. The system functions are linked in a logical order so that players did not become 'lost' when using the system. Moreover, the creation of a networked system allowed the players to communicate and play the game over a range of computers. The system also incorporates a chat system between the players and logs every communication between them to a text file. To streamline the implementation stage, careful thought was given to the graphical view of the system through the use of cascading style sheets.

ACKNOWLEDGEMENTS
The authors are grateful to the Nuffield Foundation for a grant which partly supported this work.

REFERENCES
[1] R. J. Aumann, Value of markets with a continuum of traders. Econometrica 43(4), 1975, 611-646.
[2] R. M. Axelrod, Conflict of Interest: A Theory of Divergent Goals with Applications to Politics. Chicago: Markham, 1970.
[3] D. Cartwright and A. Zander, Group Dynamics. New York: Harper and Row, 1968.
[4] J. S. Coleman, Control of collectivities and the power of a collectivity to act, in: Lieberman, ed., Social Choice, Gordon and Breach, 1971, 277-287; reprinted in J. S. Coleman (1986), Individual Interests and Collective Actions, Cambridge University Press.
[5] S. Guiasu and M. Malitza, Coalition and Connection in Games. Pergamon Press, 1980.
[6] W. Gamson, A theory of coalition formation. American Sociological Review, 26, 1961, 373-382.
[7] S. S. Komorita, A weighted probability model of coalition behavior. Psychological Review, 81, 1974, 242-256.
[8] S. S. Komorita and J. M. Chertkoff, A bargaining theory of coalition formation. Psychological Review, 80, 1973, 149-162.
[9] A. Neyman, Uniqueness of the Shapley Value. Games and Economic Behavior, 1, 1989, 116-118.
[10] A. Rapoport, N-Person Games: Concepts and Applications. Dover Publications, 2001.
[11] W. H. Riker, The Theory of Political Coalitions. New Haven, CT: Yale University Press, 1962.
[12] L. S. Shapley, A Value for n-person Games, in A. W. Tucker and H. W. Kuhn, eds., Contributions to the Theory of Games. Annals of Mathematical Studies, 28, Princeton: Princeton University Press, 1953, 307-317; reprinted in A. E. Roth, ed. (1988), The Shapley Value: Essays in Honor of Lloyd S. Shapley, Cambridge: Cambridge University Press, ch. 2, 31-39.
[13] M. A. Van Deemen, Coalition Formation and Social Choice. Boston/Dordrecht: Kluwer, 1997.
[14] J. Von Neumann and O. Morgenstern, Theory of Games and Economic Behavior, 2nd ed. Princeton, NJ: Princeton University Press, 1953.
[15] http://www.ebay.co.uk
[16] www.microsoft.com/games/ageofmythology/


Games Technology and its role in Social Work
Farath Arshad

Lynette Partington

Head, Centre for Health and Social Care Informatics (CHaSCI), Liverpool John Moores University Byrom Street, L3 3AF UK

LP Consultancy, P.O. Box 373, Southport, PR8 4X, UK

[email protected]

Gerry Kelleher Abdennour El-Rhalibi School of Computing and Mathematical Sciences Liverpool John Moores University Byrom Street, L3 3AF, UK

technology has become part of the fabric of work and society. The launch of the British Government's ten-year NHS plan in 2000 saw the foundations being laid with the launch of the £6 billion National Programme for IT (NPfIT). Along with this came the drivers that greater partnership working should be encouraged to ensure integrated care pathways, enabling information and data to be shared between different agencies based in Local Authorities, the NHS, the voluntary sector, the police, etc.

ABSTRACT
The authors explore the potential of Games Technology in social care settings. The paper gives an overview of related computing research, which has developed techniques for social care environments. Drawing on these developments, the authors present a model of the way in which current developments in Games Technology can potentially alter the way vulnerable children are able to affect the assessment process which endeavors to tailor care to their specific needs.

An example of how such authorities may need to work together could include vulnerable children, particularly those in care. Numerous distinct agencies may become involved depending on the range and complexity of care required, for example a child who may have been abused, with NHS needs and/or known to the police for one reason or another. Often, agencies involved in the care of these children will include a social worker, educational psychologist, health visitor or school health advisor. Invariably there is the requirement for assistance from at least one of these agencies, often at a time that is unpredictable. All the professions and agencies involved need to share specific data about the child. Furthermore, the social worker in charge of the case would need to review the case from time to time. This would include an interview/discussion with the child. However, it is now becoming apparent that this process of nominally including the child in the service planning does not fully engage them in the process. Often, the child will give replies they feel fit what the adult wants to hear. Consequently, care plans are drawn up which do not fulfil the needs of the vulnerable client. They are being let down by the very services which are supposed to look after them.

Categories and Subject Descriptors
D.3.3 [Programming Languages]: Language Constructs and Features – abstract data types, polymorphism, control structures.

General Terms Algorithms, Performance, Design, Experimentation, Human Factors.

Keywords Games Technology, Artificial Intelligence, Social Work, Assessment Process, Modeling

1. INTRODUCTION

When dealing with vulnerable people, especially children, organisations that care for these people need to look at different ways of working and ensuring that these clients have a say in what and how care is planned for them.

The interface between the evolving technological advances in computing and its application within entertainment, education, business (administration/clerical, automation) and decision support, is pervasive in many areas of work. The games industry has mushroomed and, indeed, is big business. Whereas once it was the preserve of geeks with ‘know how’ of machine code, it is now a formal discipline in education with promise of generic skill acquisition in computing software and hardware. Times are exciting, we have developments in many areas of computing research which offer tremendous power and quality in both software and hardware developments. Many subject areas have benefited from the widespread utilisation of computing. Games technology has seen huge interest from disciplines as diverse as the film industry to architecture and medicine. Yet application of computing in less formal disciplines such as social care is not as common. Indeed, applications are restricted to supporting administrative and clerical tasks (such as record keeping, form filling, email, appointments - all for case management). Over the last two decades we have seen huge improvements in the way

This paper looks at developments in social care, particularly social work around vulnerable children. This interest lies in the fact that we are dealing with children who are going through difficult times (in their family environment, socially, in education and even in their health needs). So, how do we engage them in this important process of ensuring that the services tailored for children are appropriate? In trying to answer this fundamental question, we look at children's natural interests. In this day and age, virtually all children watch television (and therefore respond to specific cues), use a computer for one thing or another (usually to play a game), or have access to or own a Game Boy or PlayStation. We want to look at how these tools used by children can be embedded into the processes that are designed to evaluate, and make value judgements about, their needs by professionals such as social workers, educational psychologists and even healthcare personnel.




2. THE NEEDS OF CHILDREN – THE APPLICATION OF TECHNOLOGY
The web source 'Technology and Software: The Complete Social Worker, A Guide to using the Internet in Social Work' [1] makes an interesting point that emphasises how greater use of computing technology can be "key to improving service outcomes for vulnerable populations". As we have already pointed out, the group with which we are interested in working - vulnerable children - commonly demands collaborative care networks spanning multiple care organisations. The essence of the issue facing internet-supported social work also includes the need for complex data sets from multiple sources to be integrated consistently, often while they are changing dynamically over time. In addition, we must recognise the need to maintain a general notion of consistency based on a semantic interpretation of the data rather than on strictly syntactic consistency. This will be required if we are to make real progress with the delivery of effective and resource-efficient services in such complex environments. The authors suggest that work from Artificial Intelligence, particularly that related to planning and scheduling (where data consistency from multiple sources in dynamic environments is also an issue), may well offer a conceptual framework within which these issues may be addressed (see El-Rhalibi and Kelleher 2004) [2].

Perhaps more interestingly, it is possible that work on so-called Reason Maintenance Systems (RMS), which have been applied to the management of dynamic consistency in multiple competing formal belief sets, may offer a highly focused and relevant model for information integration in social work settings, where multiple, mutually inconsistent but internally consistent models may have to be reasoned with (see Spragg and Kelleher 1996) [3]. This will particularly be an issue if the approach that we propose (integrating children's views on care plans alongside professional views) is taken seriously and implemented in a technical solution. Essentially, what we are proposing requires that the integration, within a complex social and information network, of the child's contribution to their care planning must sit well within the information framework that such a technical, internet-based solution provides.

3. THE NEEDS OF CHILDREN – THE SOCIAL CONTEXT
Children are made to feel powerless in the systems and structures that affect them, as there is little meaningful consultation involving them. Although there is recognition of the need to consult with children, this may be hampered by the age, disability, mental state, comprehension and language ability of the child. As pointed out above, the lack of resources and aids available to support consultation with children (which is often time consuming) may also have an effect. Also, there is no universal or formal protocol to involve all children in this process in a way that they can relate to and meaningfully contribute to (see Arshad et al 2006) [4].

4. COMPUTING AND GAMES TECHNOLOGY Techniques for modelling game playing – especially in adversarial setting such as Chess have been of interest for several decades and owe much to the early efforts in Artificial Intelligence which successfully developed techniques (e.g. games trees) for representing possible moves of both the player and the opponent (p25, Barr and Feigenbaum, 1981) [5]. Further, many other representational methodologies have enabled us to couple such developments with abilities to model our ‘users’. A large body of work exists on ‘user modelling’ or ‘student modelling’, which gives us the ability to customise our systems for the needs of specific communities. As a consequence we can embed features that are either more likely to be viewed favourably by our users (i.e colour cues, audio and so on) or allow presentation of information in ways that model the way our users engage with particular concepts, ideas or information (see for example Mandl and Lesgold, 1988) [6]. The importance here is that virtually any system that purports to dynamically capture and adapt to the evolving needs of its users has some elements to enable interaction adaptation to the current user. Furthermore, Artificial Intelligence has provided many solutions which lend to developing adaptive systems, not least the ability to represent domain and world knowledge with which systems can interact at a level of proficiency that can be labelled ‘expert ‘ (cf expert systems, rule-based systems. See Hayes-Roth 1997) [7]. To simplify, we have had the ability for some time now to represent and talk about different kinds of knowledge used, managed and displayed in different ways. Exploitation of this knowledge for problem solving, inference and control are well understood and formalised. Importantly the advances in the last 10-15 years in hardware and Operating System platforms mean that they now

In the current assessment process children's input is minimal, and the view is that they are rarely consulted and, when they are, they are not heard. It is accepted that children perceive the world differently from adults and focus on specific information; therefore, the importance of visual stimulus in their learning and in supporting their communication is vital to consultation. The reality is that they have grown up with, and are familiar with, computing tools - games in particular. They naturally accept media which take into consideration their need for visual triggers in order to understand and express their needs. In our research we recognize the need to encourage children to become more actively involved in the assessment process. Involvement of children will require harnessing evolving technologies in gaming and advanced computing, particularly artificial intelligence, so that tools can be developed which will allow children to help tailor an assessment process that captures their views and understanding about the services being mapped out for them. These tools will be designed with the help of children so that they drive how consultation with them happens, using a technology that they have grown up with. Adopting a multidisciplinary approach, we are working with experts in social work with skills in involving children in these processes and in capturing their views, whilst modeling tools and schemes for capturing information and data will be developed by the AI and Games Technologists. Evaluation and embedding of the whole process within the health and social care setting will involve a range of interested partners and, as we have already discussed above, is an area ripe for appropriate technological innovation.



have the capability to do AI-based systems justice. An excellent example of this is gaming technology where increasing the quality of “the AI” is seen as fundamental to the success of the game and it’s likely acceptability for players. The potential of Games technology outside of the entertainment arena has also been steadily growing. We now have wide application in the areas of medicine, business process modelling and, certainly a large market within military applications. In the 90s, a related technology, virtual reality was much exploited within psychological areas, see for example Riva, G., Bacchetta, M et al, (1998) [8], Smokowski, P.R. & Hartung, K. (2003) [9] and Wendy-Lou & Greenidge (2005)[10 ] . In an area closely related to our area of interest, gaming and AI techniques have seen use within the area of education, for example in the development of tools to aid in language development and role-playing in training (see Jeffrey Rothfeder 2004) [11]. The exploitation of these technologies within Social Sciences is a relatively new development. In 1998, James Nolan [12] developed DISXPERT. This was a rule-based system, which enabled, intelligent referral of clients in receipt of social security support for vocational rehabilitation services. This project recognised the importance of the role of caseworkers in social service areas and, how intelligent technology could enable ‘unbiased and consistent assessment decisions regarding referral of clients’. The emerging developments in games technologies thus provide the framework within which an immersive, child oriented communication technology may be developed. Particularly important in this respect will be the integration of visual and audio based technologies with the emergent agent architectures that will allow the development of a variety of human like software entities (agents) with which the child can interact and, between which the child may elaborate relationships (for example see, Bonnevay et al 2005 [13], for a specific type of game embedded agent entity). In addition to the social work possibilities available from the technology, however, there are the possibilities inherent in the growing pool of human expertise in this area (for example, many universities now have highly technical games development courses) that will make not only the technology, but also the expertise to exploit the technology increasingly pervasive and available for such applications (Neti et al 2006) [14].

5. GAMING ASSESSMENT FOR CHILDREN GAC There are two aspects to our work, (a) Engagement of children in designing the game (and tools to be made available for play) (b) The architecture of the actual game (in play mode) and the data it provides which relates to the assessment process. Our work attempts to support the assessment process, making it more ‘interactive’ through the concept of a game, which the children help to design using objects and tools which represent variables important in their lives. The principle is one of engaging children in helping identify the type of game environment, which they would like to design. Consequently we have a collection of game scenarios that need to be supported. The children will also help specify the type of features (tools) they wish to be able to manipulate in the scenarios. Generic tools will be developed to represent features necessary to interact with the game. These will include, amongst others, buildings (house, school, office), rooms, routes, corridors, furniture, people (male/female). Each of these items can be selected and tailored by the player. Variables such as colour, personality characteristics, roles can be defined from a range of predefined lists with some limited flexibility for text input. A data collection module will record choices of tools and the nature of the interaction selected. These will be matched to a sample of data needs identified in the Common Assessment Framework used currently in Social Work. Because children’s needs are assessed in terms of health, education, family and social settings, the games scenarios will include such features, so that we can capture and interpret the child’s behaviour regarding certain choices. It may also be necessary to record the child’s interactions with the system either through video or audio so that we would then have more than one way of capturing the key information about why certain choices are made.
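As a speculative sketch of the data collection module described above, the following records each tool selection together with the interaction and the Common Assessment Framework category it is matched to; all class names, fields and enum values are illustrative assumptions rather than part of the GAC design.

import java.time.Instant;
import java.util.ArrayList;
import java.util.List;

// Hypothetical data collection module: logs each tool selection with a timestamp
// and the CAF category it is mapped to, for later analysis of the child's choices.
public class DataCollector {
    public enum CafCategory { HEALTH, EDUCATION, FAMILY, SOCIAL }

    public static class Selection {
        final Instant time = Instant.now();
        final String tool;            // e.g. "house", "school", "person:female"
        final String interaction;     // e.g. "placed", "recoloured", "named"
        final CafCategory category;

        Selection(String tool, String interaction, CafCategory category) {
            this.tool = tool;
            this.interaction = interaction;
            this.category = category;
        }
    }

    private final List<Selection> log = new ArrayList<>();

    public void record(String tool, String interaction, CafCategory category) {
        log.add(new Selection(tool, interaction, category));
    }

    public List<Selection> sessionLog() { return log; }
}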

A strong case can thus be made that the technologies of interest are mature enough to be reasonably considered as candidates in the social care context within which we propose to operate. A further development of note, however, is the launch of the largest IT infrastructure project in Europe - National Programme for IT (now Connecting for Health) [15]. This programme is well on its way to being rolled out. This potentially provides an important piece in the jigsaw that we are attempting to construct – a framework for the consistent sharing of information about a major sub-set of the data with which we need to operate. In addition the success of this system, assuming it will eventually be successful, is likely to lead to calls for further such data sharing networks and, if we are to support multi-agency working, their integration with tools for the provision of information from end users – even (or perhaps especially) technically and professionally challenging groups such as vulnerable children.

Earlier, we touched on techniques for representing different types of knowledge, modelling user traits and needs, and employing planning and scheduling technologies; a subset or all of these are applicable here. However, we want to demonstrate how the assessment process can be enriched through greater input from children and young people, and to show that the information pertinent to the assessment process can be captured more discreetly, in a non-threatening and informative way, using technologies that children have grown up with and can immerse themselves in.

Further parts of the data web required are already emerging. The Common Assessment Framework (CAF) [16] in the UK attempts to assess and identify the needs of children and young people through a series of queries based around health, education, family and social settings (with a small section included to allow the child to comment on the assessment). It is, however, entirely adult driven, leading to conclusions, solutions and actions which need to be achieved through a multi-agency approach; hence, the data needs to be shared. Our work on Gaming Assessment for Children (GAC) is exploring the use of computer gaming to move towards the voice of the child being truly reflected in the assessment processes. GAC is about merging technologies, incorporating multimedia and gaming into an assessment process that the child can understand and relate to.



5.1 Scenario Work in learning, teaching and assessment tools is relevant in developing platforms for use as interactive playmates for children. These have explored the use of virtual environments in teaching, focusing on the design of Human-Computer Interfaces. However, to our knowledge, there is no work that proposes an adaptive approach based on this technology that takes into account the child’s needs in designing the structure of professional interventions (as in the case of social work case loads for example). Efforts in AI provide valuable lessons on knowledge representation (modelling of the knowledge of experts), the users profile and the dynamics of supporting interactions between the various modules.

*A Child Profile characterizes a system state (in particular its evolution related to the user behavior) and associated assessments that adapt during the activity execution. Figure 1 gives the general architecture of the system, which consists of three sub-systems for observation and analysis of behavior, decision and action. The first manages the process of observation, which enables observing, analyzing and interpreting the behavior of the child. The second provides adaptive behaviour and monitors execution of games. The third is essentially an executive module responsible for running the games. This system executes the activities provided by the decision system. Another important task of this system is to save the execution sequence of actions. This represents the child-activities data.

Although we do not want to specify what the software will look like at this stage, until we have completed the first of our goals (the engagement of children in designing the game and the tools to be made available for play), we do have a tentative model of the architecture, which is still evolving.

Each child is characterized by particular abilities and preferences, so that a degree of adaptation is possible for the assessment mode. It is impossible to generalize activities without caution, but we favor adaptability of the system to take into account specific abilities observed for each child. It is important to locate and interpret carefully these behaviors, in order to support the assessment.[The system will not only provide a framework for the child to express their thoughts via a game, but also become a tool for the assessor (and the system) to observe and identify the behaviours, and use this information to support the child].

As already explained, our objective is to implement a software system, using computer games technology, knowledge based systems, and player/user models, that could help the children during the assessment process. This will require establishing a multimode and multimedia dialogue between the child and the interactive system, recording their actions, and taking into account their performance to adapt/update the assessment and extract information of relevance to building and structuring effective and appropriate interventions.

The role of such a system would be to provide children with a child friendly rationalization of the agents, activities and assessments in the form of a game. During a session, the system collects, via a range of devices (e.g. camera, touch screen, mouse, keyboard, gamepad…) reactions, in order to interpret behavior and respond to it, in real time, by adequate actions that explore the child’s interpretation of the assessment and intervention process (Arshad et al 2006).

5.2 Observation and Behaviour The approach we propose for the recognition of behaviour is based on a set of observers representing view points covering various levels of analysis, of the user’s behaviour and, a set of forms representing different sequences from events. This allows us to extract, from the flow of events the indices of the various forms in order to calculate a model of these forms. From this model, it is possible to analyse the way in which the reactions of the child evolved in order to obtain an interpretation of the behaviour. The decision system makes available (to the action system) a protocol adapted to the child’s profile and goal needs, to enable goal achievement and/or assess. To support this task, the system contains five modules: goal needs, child’s profile, cases memory, reasoning and exceptions treatment module.

Fig. 1: General architecture

To facilitate such an investigation, we will use the following concepts:
*A Game is characterized by a virtual environment and/or avatar, objects (pictograms, music, pictures, ...) and functioning rules (to interact with the game). Each game has configuration parameters and objectives to be reached.

In each session, certain goal needs are active in the system, and certain information concerning the child's profile is contained in the user profile module. Given this information, the reasoning module makes it possible to generate an adapted protocol. The generated protocol is placed in the action system for its execution.

*An Activity is an instance of a game (with a particular configuration and qualified, quantified objectives).
*An Action Sequence is a sequence of activities, given in order to make it possible for the child to reach complex objectives. The sequence in which the activities in a session are presented is important, because certain activities require a capability in certain actions that the child must acquire in other activities (i.e. prerequisites).



Case-Based Reasoning is used by the decision system to generate protocols adapted to the child's profile. The basic underlying principle of Case-Based Reasoning (see Figure 2) is to solve new problems (called Target Cases) by adapting solutions that were used to solve problems in the past, based on the assessor's expertise and on the assessment of different children's needs in the past. A Case-Based Reasoning module uses a Case Memory that stores descriptions of previously solved problems (called Source Cases) and the solutions derived. The CBR process consists of seeking source cases that are similar to the target case and adapting the solutions identified to fit the current situation. A Case-Based system learns from its experiences by storing them as new cases, drawing on the experience of the expert.

Fig. 2: Case-Based Reasoning for GAC
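The retrieve and retain steps of this CBR cycle could be sketched as follows; the case representation, similarity measure and all names are illustrative assumptions, and the adaptation step, which depends on assessor expertise, is omitted.

import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Speculative sketch of a CBR module: source cases are stored in a case memory and
// retrieved by similarity of the child profile to the target case's profile.
public class CaseBasedReasoner {

    public static class Case {
        final double[] childProfile;   // e.g. age band, ability scores, goal needs
        final String protocol;         // the activity protocol used for this profile
        Case(double[] childProfile, String protocol) {
            this.childProfile = childProfile;
            this.protocol = protocol;
        }
    }

    private final List<Case> caseMemory = new ArrayList<>();

    // Retain: store a solved case for future reuse.
    public void retain(Case solved) { caseMemory.add(solved); }

    // Retrieve: find the source case whose profile is closest to the target profile.
    public Case retrieve(double[] targetProfile) {
        return caseMemory.stream()
                .min(Comparator.comparingDouble((Case c) -> distance(c.childProfile, targetProfile)))
                .orElse(null);
    }

    private static double distance(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) sum += (a[i] - b[i]) * (a[i] - b[i]);
        return Math.sqrt(sum);
    }
}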

7. ACKNOWLEDGMENTS
We would like to thank LJMU for speculative support for this project. Thanks are also due to Dr. Dhiya Al-Jumeily for helpful discussions and comments.

8. REFERENCES
[1] Technology and Software: The Complete Social Worker, A Guide to using the Internet in Social Work. http://hadm.sph.sc.edu/Students/KBelew/import.htm
[2] El-Rhalibi, A. and Kelleher, G. (2004). "An approach to dynamic vehicle routing, rescheduling and disruption metrics", IEEE International Conference on Systems, Man and Cybernetics, Vol. 4, ISSN: 1062-922X.
[3] Spragg, J. and Kelleher, G. "A Discipline for Reactive Rescheduling", AIPS96, AAAI Press, 1996.

[4] F. Arshad, L. Partington and A. El Rhalibi (2006). GAC – Gaming Assessments for Children. International Conference on Multidisciplinary Information Sciences & Technology. Merida, Spain, October 25-28th, 2006

In the context of play, the child is involved in a game, creating an environment that is familiar to them, populating the environment with people and features that they know and want. The information obtained whilst playing the game will be logged and printed out by the system at the end of the game. Unlike similar interactive assessment programs GAC will store and evaluate the responses and information that the child inputs. The program will produce an analysis in assessment form that can be used by the professional (i.e. social worker) in discussions with the child. Hence this process will enhance the overall quality of the assessment, as it brings views and choices that the child has made without ‘pressure’ from anyone else. This process enables the professional conducting the assessment to involve a child or young person in the assessment process in a non-threatening and less stressful way. However, the real efficacy of the approach advocated will require rigorous testing and evaluation with children, a range of professionals and, of course we have still to address the issue of how to embed a system of the nature proposed within stretched resources.

[5] Barr, A and Feigenbaum, E. (eds). 1981 The Handbook of Artificial Intelligence. Vol 1, William Kaufmann Inc. USA [6] Mandl, H and Lesgold, A. (eds). 1988. Learning Issues for Intelligent Tutoring Systems. Springer Verlag. [7] F. Hayes-Roth (1997) Artificial Intelligence – what Works and What doesn’t? AI Magazine, Vol 18, No2.Summer 1997 [8] Riva, G., Bacchetta, M., Baruffi, M., Rinaldi, S., & Molinari, E. (1998). Experiential cognitive therapy: A VR based approach for the assessment and treatment of eating disorders. In G. Riva, B.K. Widerhold, & E. Molinari (Eds.), Virtual environments in clinical psychology and neuroscience: Methods and techniques in advanced patienttherapist interaction (pp. 120-135). Amsterdam: Ios Press. [9] Smokowski, P.R. & Hartung, K. (2003). Computer simulation and virtual reality: Enhancing the practice of school social work. Journal of Technology in Human Services, 21 (1/2), 5-30.

6. CONCLUSION

[10] Wendy-Lou L. Greenidge, APD, (2005 “The Application of Gaming Technology in Counselor

The work described in this paper outlines the application of AI and gaming technology to a key activity within the lifecycle of the complex, multi-agency decision making involved in developing care plans for vulnerable children. It attempts to use these technologies to give a voice to children by supporting them in describing their own needs and preferences in a manner that is (i) consistent with their capacity to externalize and codify those needs and (ii) susceptible to the extraction of information that is coherent and sophisticated enough to support and influence the professional systems engaged in their care. This dual role, of engaging the child in a cognitive environment in which they can express a rich and detailed set of requirements, and of providing compelling and relevant input to a range of different professionals, has the potential to be a significant breakthrough for child support agencies. It moves the child from being an often passive recipient of care to an active and influential agent in its construction.

Training Programs", Journal of Technology in Counseling, Vol. 4, Issue 1, http://jtc.colstate.edu.
[11] Jeffrey Rothfeder (February 2004). Terror Games: Can computer games be devised to model the thinking and predict the actions of allies, enemies and even terrorists? Some in the U.S. government think so. Are they playing God? Popular Science, Computers and Electronics. http://www.popsci.com/popsci/

[12] James R. Nolan (1998). An International System for Case Review and Risk Assessment in Social Services. AI Magazine 19(1). http://www.aaai.org/Library/Magazine/Vol19/19-01/vol1901.html
[13] Stephane Bonnevay, Nadia Kabachi, Michel Lamure, "Agent-based simulation of coalition formation in cooperative games", pp. 136-139, IEEE/WIC/ACM International Conference on Intelligent Agent Technology, 2005.
[14] Chalapathy Neti, Anees Shaikh, Chris Sharp, Randy Moulic, Patty Fry, Dick Anderson, John J. Ritsko, and David I. Seidman, "Online Game Technology – Editorial", IBM Systems Journal, Vol. 4, no. 1, 2006.
[15] National Programme for IT in the NHS - NHS Connecting for Health. http://www.connectingforhealth.nhs.uk/
[16] Common Assessment Framework for Children and Young People - Every Child Matters. www.everychildmatters.gov.uk/deliveringservices/caf


Familiars: Social gaming with PASION
Ben Kirman and Duncan Rowland
Lincoln Social Computing Research Centre, Department of Computing and Informatics, University of Lincoln, Brayford Pool, Lincoln, LN6 7TS, UK

[bkirman|drowland]@lincoln.ac.uk


ABSTRACT

The PASION project is funded by the EC to research social presence technologies and their effect on group behaviour within multiplayer games and collaborative working environments. The first steps were to design a mobile multiplayer social game (called “Familiars”), where the success and rank of a player within the game is directly linked to the quality and quantity of social interactions to which that player has contributed. Originally inspired by the capabilities of the Hitchers framework [6] (and the Gopher game [4]) the game is designed to encourage social play through mediated co-operative behaviour. Players create “Familiars” as their animal helpers. These are small creatures whose goal it is to accomplish the task given to them by the player who made them. They move from device to device completing tasks with the help of the players that they meet. The tasks are generally designed to require the co-operation of several players to complete, and so the Familiars must engage many players to help accomplish their goal. A variety of measures (both objective and subjective) will be taken regarding the various interactions players have during gameplay. From these a rank will be generated for each player based on the social network they have built.

The PASION (Psychologically Augmented Social Interaction Over Networks) project is designed to research social presence technologies and their effect on individual and group behaviour within mediated collaborative environments. A mobile multiplayer social game called Familiars is being designed, where the success and rank of a player within the game is directly linked to the qualities of the player's in-game social network. By examining the structures of the game-wide social network generated through playing the game, the aim is to identify patterns in the interactions which can be used to direct further studies and build future versions of the game that will enhance the game experience, bringing more of the face-to-face social value to technologically mediated games and hence mediated collaborations in general.

ACM Classification Keywords Games K.8, Computer Supported Co-operative Work H.5.3

Keywords Social Games, PASION, Group Emotion, Social Presence.

1. INTRODUCTION

By examining the structures of this game-wide social network, the aim is to identify patterns in the interactions [9] which can be used to direct our studies in PASION's other work packages (including questions relating to basic research and collaborative working), to inform the design of future versions of the game, and to explore the emerging genre of socially mediated networked games. Using sensors and analysis techniques developed by partners in the project, the expectation is that correlates of individual and group emotional and social states will be used to enhance the game experience and bring more face-to-face social value to mediated games.

Games and play allow an abstraction of reality that is simpler and more controllable than the veridical world. The aim of PASION is to create a game world that will allow the scientific scrutiny of mediated communications in a way that is simplified yet still scientifically valid [2]. The quality of social interaction through multiplayer games is currently highly impoverished in comparison to other forms of social play. With the explosion in popularity of multiplayer online and mobile games there is an opportunity to enhance the quality of social interaction and develop a new style of "socially aware" game where the social behaviour of the players affects the game itself.

2. FAMILIARS
"Familiars" is the first iteration of a multi-player social game designed for the PASION project. The essence of the game is such that the social interactions that occur between players of the game are fundamentally linked to their success. The following is the description as it may appear



on the “back-of-the-box”: "Familiars" are small invisible and friendly creatures that are all around us wherever we go. You can talk to a familiar using your computer or mobile phone, chat, swap gossip and even share photographs. Sometimes a familiar has a task that it is trying to complete - you can pick it up and help it out. The task could be anything: "I’m collection a story, could you contribute the next chapter?”, "I want to visit players in 5 different cities" or even "I want to find my way around the world". When you work with other people to complete your task, you gain "Social Credit" which affects your rank in the game. You can also gain Social Credit by chatting with other players using the mobile chat service, and for meeting new friends in the game. If you are feeling creative you can play the game on your PC and create imaginative tasks for familiars to do, and create new familiars with their own pictures that you have created. You can also log in and look at a familiar's blog, to see what they have been up to, where they have been and who else they have met along the way. Familiars are social creatures and like to know how you are feeling - they will ask you from time to time and sometimes will ask you about other people. Answering truthfully will earn you Social Credit, and may even affect the behaviour of the familiar in the future. Familiars is all about making new friends and having fun together, in fact the players who have the most friends and have the most fun are the ones that are ranked the highest!

The game is designed to be played fluidly, with little impact on players’ daily routines [8]. The game is under continual revision based on findings from user tests. As partners within the PASION project provide new technologies to measure the psychophysical states of the players or analyse the social networks, these will be integrated into the game. With continued updates and testing it is hoped to observe some novel social interactions around the game that are unusual for the multiplayer video game format.

2.2 Player Goals From the perspective of the players, the aim of the game is to rise in social status among the community. There is a set of global "High Score Tables" that show the players’ rankings, filtered to show absolute standing within the game and standings within the smaller social groups that emerge from the game. Despite the goal of the game being to rise in social status, the players are not expected to work tirelessly towards that goal - the purpose of any game is enjoyment and of course different players will find enjoyment in different parts of the game. The appeal of Familiars is that it caters to quite a wide range of different play styles. Using Caillois' popular classification of play [3], the freedom offered by task creation in Familiars is very much paidic in nature - in creating inventive and challenging tasks players are expected to find the same joy as they do with 3D modelling in Second Life (secondlife.com) or in interior design in Ultima Online (www.uo.com). In contrast, completing the tasks is mostly ludic, as players try to complete tasks as well as they can to maximise their score.

The game is inspired by the Hitchers framework [6] and the Gophers game [4], which originated this kind of mobile task-driven agent game. The advantages of these games are that they are simple to comprehend, make good use of the capabilities of the underlying infrastructure, are engaging, and promote social interaction between players (since the tasks will usually involve multiple players interacting with the same agent). In these games, players have complete freedom to give the agents any tasks they can think of. Gophers introduced the idea of rewarding players for participation through the scoring mechanism - however, since it is so difficult to procedurally give rewards for completing tasks which have been created with such freedom, players participate in "Jury Service" and give completed tasks scores based on the perceived complexity of the task and the relative value of the contributions of the players that helped in its completion. The player who initially created the completed task gets more points based on the judged difficulty of the task, and the contributing players get points based on the difficulty and their judged input.
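As a rough sketch of how such jury-based scoring could work (the split between creator and helpers, and the function and names below, are our own illustrative assumptions, not the actual Gophers or Familiars rules):

```python
def award_task_points(judged_difficulty, creator, contributions):
    """Split a peer-judged difficulty score between the task creator and helpers.

    judged_difficulty: score the jury assigned to the completed task.
    contributions: dict mapping each helper to the jury's judgement of their
                   relative input; the values are normalised below.
    """
    points = {creator: judged_difficulty}        # creator rewarded for task difficulty
    total_input = sum(contributions.values()) or 1.0
    for player, judged_input in contributions.items():
        share = judged_difficulty * (judged_input / total_input)
        points[player] = points.get(player, 0) + share
    return points

# Example: a task judged at 10 points, with two helpers of unequal judged input.
print(award_task_points(10, "Alice", {"Bob": 0.7, "Charles": 0.3}))
# {'Alice': 10, 'Bob': 7.0, 'Charles': 3.0}
```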

2.3 Social Status Familiars builds on the play mechanics initially investigated in Gophers and significantly extends the social aspect of the game. Instead of success being measured simply by points accrued, status in the game is a reflection of the size and structure of a player's social network within the game.

2.1 Academic Goals The goal in the creation of Familiars is to create an artificial


A player who has completed many complicated tasks individually or with a small group of friends will generally score lower than a player who has completed only a few tasks, but those tasks involved many other players.

Social Capital is that in all systems, including Whuffie, the total value of social credit a player has is viewable at all times by any other person. This makes the system transparent and immune to the dishonesty and misrepresentation exploited by confidence scams in the real world, where money is stolen from ordinary people by others who misrepresent their social status by, for example, pretending to be members of the clergy or high-powered businessmen.

2.5 Earning Social Credit The value of "Whuffie" accrued through the various interactions in Familiars is not constant and will usually be balanced against the complexity of the tasks and other factors. For example, in figure 1, Task A, which involved three players, might have been judged by peer review to be simpler than Task B, which involved only two. So despite the social interactions being greater in number through Task A, the interactions in Task B were judged to be more valuable, and so score higher. Value is also reduced over time and with repetition, so many chat sessions with a single other player will be worth less and less each time. This should encourage players to mix more with others and prevent groups of players from conspiring to cheat by repeating simple interactions with one another.

Figure 1. Example Social Network in Familiars

Figure 1 shows a simplified example social network that might occur during the game. Two tasks, A and B, have been completed by multiple players, and all the players have engaged in text chat with one or many of the others. Both simple socialisation and active participation in the game increase the players' status in the game. In this example, Edward is clearly the lowest socially ranked player since he has not been involved in the completion of a task, he has only been involved in chat sessions, and only with Charles. Either Bob or Charles belongs at the top of the social ladder. Bob has contributed towards completing two tasks and engaged in one chat. In contrast, Charles has contributed towards only one task, yet has had chat sessions with three different players. Who is ranked higher will depend on the relative values assigned to the different kinds of social interactions.
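A minimal sketch of how such a ranking might be derived from weighted interaction counts (the weights and event lists here are illustrative assumptions; as noted above, the paper deliberately leaves the relative values open):

```python
from collections import defaultdict

# Assumed weights: contributing to a task is worth more than a chat session.
WEIGHTS = {"task": 3.0, "chat": 1.0}

def rank_players(interactions):
    """interactions: list of (player, kind) events; returns players by descending credit."""
    credit = defaultdict(float)
    for player, kind in interactions:
        credit[player] += WEIGHTS[kind]
    return sorted(credit.items(), key=lambda item: item[1], reverse=True)

# Roughly the figure 1 example: Bob helped with two tasks and one chat,
# Charles helped with one task and had three chats, Edward only chatted once.
events = [("Bob", "task"), ("Bob", "task"), ("Bob", "chat"),
          ("Charles", "task"), ("Charles", "chat"), ("Charles", "chat"), ("Charles", "chat"),
          ("Edward", "chat")]
print(rank_players(events))
# [('Bob', 7.0), ('Charles', 6.0), ('Edward', 1.0)]  -- with different weights, Charles could lead
```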

Over time a player will participate in a great number of interactions. This may favour players that have been playing longer and makes it difficult for new people to be competitive in the social ranking. To counter this the value of older interactions decays over time - players only retain their status within the game by continued contribution and cannot rest on past achievements for too long.
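One simple way to realise this decay is exponential decay with a fixed half-life; this is a sketch under that assumption, as the paper does not specify the decay function:

```python
import math

def decayed_value(base_value, age_days, half_life_days=30.0):
    """Exponentially decay an interaction's value so that old contributions fade."""
    return base_value * math.exp(-math.log(2) * age_days / half_life_days)

# A task completed today keeps its full value; the same contribution made 60 days
# ago (two half-lives) counts for only a quarter of that towards current standing.
print(round(decayed_value(3.0, 0), 2))    # 3.0
print(round(decayed_value(3.0, 60), 2))   # 0.75
```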

2.4 Social Capital

2.6 Reputation and Wealth

The kind of reputation based ranking system used in Familiars is not new - it is an abstract expression of the players' “Social Capital” [7] within the closed system of the game. While Social Capital is more conceptual in real life and difficult to measure in a quantitative manner, the social values of the players’ in-game interactions are awarded using a peer-review system. This is similar to the notion of a "Whuffie", a word coined by science fiction author Cory Doctorow to describe a highly visible numerical score that represents the absolute value of a person’s reputation and standing within their community [5].

It is very important to distinguish between the collection of social credit or "Whuffie" and any currency or points system that may also be used in the game. The purpose of generating social credit is simply to advance one's social standing within the game. Any other economic system within the game should not interfere with this process at all (i.e. players should not be able to trade or loan social credit between one another, and it must be impossible to increase one's social credit by simply being wealthy in currency). For this reason, every player, regardless of wealth or time in the game, must have the same opportunities to generate social credit (for example through chat or task completion). The only exception to this is in the meta-game. Players who have large groups of friends within and around the game may find it easier to generate social credit by virtue of being popular. This may be considered to fit within the ideals of the game since in order to have gained this advantage they must logically already have a good reputation.

“Whuffie recaptured the true essence of money: in the old days, if you were broke but respected, you wouldn't starve; contrariwise, if you were rich and hated, no sum could buy you security and peace.” Other existing reputation systems are those used on trading websites such as eBay, or the karma system on Slashdot, where users who have made positive contributions to community discussions are rewarded with a louder voice: future comments are then ranked higher and appear more visibly in discussions. The important distinction from real


2.7 Supportive Technology

3. CONCLUSION

Familiars is designed to be cross-platform, initially for use on highly mobile devices such as cellphones, medium mobility devices such as tablet PCs and PDAs, and on low mobility platforms such as PCs. Different devices have different capabilities, and these will be taken advantage of as appropriate. For example cellphones will be able to provide location data based on Cell Mast IDs, but PDAs may be able to offer precise GPS data if in the right conditions.

A central goal of PASION is to produce technical developments in shared, computer-mediated environments that will enable the analysis of individual and group behaviours. Improving the quality of social interactions in a mediated game is not only of interest to the gaming community. Play can be seen as the proving ground for much that is important in human social development. Therefore the results may find more general applicability within a variety of social environments - in business, for example, where workers may have their communications mediated to enable more productive interactions.

Partners within the PASION project are working to provide sensors and other devices for tracking the mood of the players using both high and low mobility devices. This includes facial recognition, eye movement, galvanic skin response and posture. Since the project is at an early stage, the availability of these devices could not be relied on for use in Familiars. The next steps for Familiars are to integrate this technology as it becomes available and to study how the players' mood is linked to the quality of their social interactions. This may eventually allow the partial automation of the judgement process.

In the area of social gaming, PASION’s goal is to explore the design space of mobile games in which the social and behavioural state of group processes can be communicated or manipulated as part of the game. To aid in this exploration, a theory regarding the effects of social presence on player interactions is being developed. Familiars is the initial prototype for this new genre of game. Large-scale user studies of this prototype are planned, through which it is hoped to get a clearer picture of the possibilities of socially aware mobile gaming and, in particular, how it may be useful in improving the quality of mediated communication within groups.

2.8 Mood Reporting While reliable mood sensors are not available yet, a self-reporting mechanism for mood will be included in Familiars. Players involved in direct communication through the fluid person-to-person chat system will be asked to judge their own and each other's mood after the interaction is complete. For each player the system will compare the mood they self-reported and the mood the other participant judged they had. The closer the two selections are, the more social credit they will earn in the game. This reflects the interaction having been more valuable, since they have been able to deduce the other participant's mood correctly. Note that this is irrespective of the mood chosen - e.g. players will not be rewarded more for being happy than for being distressed. This self-reporting system may be open to abuse from conspiratorial players, so the data gathered must be treated with suspicion.
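A hedged sketch of how agreement between the self-reported and partner-judged moods might be turned into credit; the mood scale, the distance measure and the reward values are assumptions for illustration only, not the mechanism specified for Familiars:

```python
# Assumed ordinal mood scale; nearer positions count as closer mood judgements.
MOOD_SCALE = ["distressed", "sad", "neutral", "content", "happy"]

def mood_agreement_credit(self_reported, judged_by_partner, max_credit=5):
    """Award more credit the closer the partner's judgement is to the self-report."""
    distance = abs(MOOD_SCALE.index(self_reported) - MOOD_SCALE.index(judged_by_partner))
    return max(max_credit - distance, 0)

# The reward is independent of the mood itself: a correctly judged "distressed"
# earns as much as a correctly judged "happy".
print(mood_agreement_credit("happy", "happy"))         # 5
print(mood_agreement_credit("distressed", "neutral"))  # 3
```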

4. ACKNOWLEDGEMENTS The PASION project has been funded by the European Union under the Presence II Initiative in Future and Emerging Technologies within the 6th Framework Programme.

5. REFERENCES
1. Bell, M., Chalmers, M., et al. Interweaving mobile games with everyday life, ACM CHI 2006
2. Brugnoli, M. C., Rowland, D., et al. Gaming and Social Interaction in Mediated Environments: the PASION Project, eChallenges 2006
3. Caillois, R. Man, Play and Games, University of Illinois Press, 2001 edition
4. Casey, S. and Rowland, D. Gophers: Socially Oriented Pervasive Gaming, GDTW 2006
5. Doctorow, C. Down and Out in the Magic Kingdom, Tor Books, 2003
6. Drozd, S., Benford, S., et al. Hitchers: Designing for Cellular Positioning, UBICOMP 2006
7. Lin, N. Building a Network Theory of Social Capital, Connections 22(1):28-51, 1999
8. Reeves, S., Benford, S., et al. Designing the Spectator Experience, CHI 2005

9. Wasserman, S. & Faust, K. Social Network Analysis: Methods and Applications, Cambridge Uni. Press, 1994

While mood sensing equipment is being developed by PASION partners, it is clear it will not be possible to equip all devices with every invention (and philosophically there is no guarantee that physiological state can ever truly be said to predict state of consciousness). A good example is the eye tracking software - it will be unreliable on mobile devices since the amount of ambient light may change and alter the shape of the player's pupils, which may be misinterpreted by the software. However, for clients in controlled environments it may be very useful.

Based on user tests, the “mood” of the players during interactions will be integrated into the rewards system in Familiars. Exactly how this integration will work is under discussion - is it better to reward consistency in mood or reward variations? Should people get rewards for changing the mood of others?


“Edutainment” is a form of “learn through doing”?
Kowit Rapeepisarn, Kok Wai Wong, Chun Che Fung, Peter Cole
School of Information Technology, Murdoch University, South Street, Murdoch, Western Australia 6150
{k.rapeepisarn | k.wong | l.fung | p.cole}@murdoch.edu.au

ABSTRACT The idea of combining education with some entertaining activities by engaging learners with hands-on experience has been widely used for formal and informal education. "Learn through doing" and "edutainment" are two of the commonly known areas in this domain. However, "learn through doing" is an area which has been developed over a longer period of time, while edutainment, especially using computer games, is a new and emerging area. The objective of this paper is to examine the relationship between them.

Categories and Subject Descriptors: Computers and Education

General Terms: teaching, learning, education, technology

Keywords: edutainment, learn through doing, experiential education, relationship, learning style

1. INTRODUCTION “Learn through doing” is the process of actively engaging learners with hands-on experience while learning, which has benefits and consequences for achieving the learning objectives. Currently, this method of teaching has been widely integrated into school curricula. At the other end, the idea of combining education with entertainment, commonly referred to as “edutainment”, is also widely used for formal and informal education. Both “learn through doing” and “edutainment” include learning and education as their main objectives, and there are several similarities in general: for example, edutainment provides a form of self-learning through doing. However, designing successful edutainment is not an easy task. This paper analyses the relationship between the two areas and presents it in a format such that designers can apply successful concepts from “learn through doing” when designing edutainment.

2. LEARN THROUGH DOING Many studies have been conducted to understand how people learn. Different learning styles are used to classify how people learn and process information. Three main learning styles have been identified by many studies [17]. They are described as follows. The first is visual learning, where learners learn by reading and watching. Visual learners prefer to see what they are learning and learn best from visual displays. During a lecture or classroom discussion, visual learners often prefer to take detailed notes in order to help them absorb and understand the information. The second class of learning style is auditory learning, where learners learn by hearing. Auditory learners prefer to listen and learn best through verbal lectures, discussions, talking things through and listening to how other learners interpret the underlying meaning of the speech. Written information may have little meaning to this group of learners until it is heard. Next is kinesthetic learning, where learners learn by doing. Kinesthetic learners learn best through a hands-on approach which enables them to actively explore the physical world around them. They may also find it difficult to sit still for a long period of time and may become distracted by their need for activity and exploration. Kinesthetic learners want to touch, imitate, and practice. Sometimes they may catch up with and exceed the lesson plan by working through it on their own. Kinesthetic learners are most successful when totally engaged with learning activities. They acquire information faster when participating in a science lab, drama presentation, skit, field trip, dance, or other active activities. Understanding the learning styles of the learner has a positive impact in knowing their learning preferences. This allows educators the opportunity to use the most effective techniques in teaching. Going to a lecture might not be appropriate for learning how to use a computer system. Learning to ride a bicycle cannot be realized through reading a book or attending lectures. For some learning objectives, kinesthetic learning or learn through doing is likely a better learning style.

2.1 Experiential education and learn through doing

Learn through doing is probably the oldest form of education that allows learners to experience the content. At the fundamental level, it is a method of learning from knowing, doing and being. There are a number of terminologies related to learn through doing: experiential education, active learning, learning by experience, environmental education, natural learning, adventure education, outdoor education, learn through activity, etc. However, the term most closely related to learn through doing is experiential education. Loy [18] points out that learn through doing is a simple form of experiential education. Thus, learn through doing most likely relies on experience. Experience in the past refers to the accumulated products or events of the past, while present experience refers to the subjective nature of one's current existence [21]. Both types of experience influence one's current and future experience. Experience is a broad topic. However, it is the basis of all knowledge, wisdom, understanding, and meaning.

In experiential education, the learner becomes more actively engaged in the learning process and more independent in learning than in traditional didactic education. This type of education is widely implemented across a range of topics and mediums, for instance, outdoor education, service learning, internships, and group-based learning projects. To clearly understand experiential education, consider some definitions of the concept gathered from Google, including: “experiential education is systematic approach to applied learning whereby a student engages in professional & productive.”; “experiential education is a process through which a learner constructs knowledge.”; “experiential education is an educational strategy that connects classroom theory with practice in the real world.” and “experiential education is learning by doing.”

2.3 Significance of learn through doing Research has been done previously on assessing how much people can remember from the learning process. The findings are as follows: people remember about 10 percent of what they hear, 20 percent of what they see, 40 percent of what they discuss and 90 percent of what they do [12]. It can be inferred that learning does not mainly come from reading or listening. Rather, learning mainly comes from doing. In fact, reading creates ideas, and action creates learning. Learn through doing, as a part of informal learning, has become a necessary component of formal instruction in colleges and universities for several reasons. Firstly, teachers are concerned with providing opportunities for their students so as to enable them to enter their chosen professions in the job market more easily. In other words, teachers are concerned about the effectiveness of preparing the future generation for industry [5]. Secondly, more nontraditional learners are choosing college study and demanding more diversified modes of learning. In addition, typical college students are also becoming more demanding with respect to what they can learn from the college. Luckner and Nadler [19] discussed the reasons why learn through doing is effective: equality, developing relationships quickly, disequilibrium, decreased time cycle, ability to handle chaos and crisis in a safe environment, kinesthetic imprint, encouraging risk taking, diversity of strengths, and fun. In summary, learn through doing has become an integral part of education which aims to develop the students’ experiences in the curriculum and to connect classroom theory with practice in the real world.

3. EDUTAINMENT Edutainment, similar to infotainment, technotainment and educational electronic games, is a newly coined term. The term was first used in the computer industry to describe CD-ROM programs that are used to teach with an element of entertainment. “The concept of edutainment is not new, although the term is a neologism. Entertainment facilities have largely used the education aspects while adding entertainment or amusement” [30]. The term edutainment is defined in several ways. The Hutchison Encyclopedia, for example, defines edutainment as a multimedia-related term used to describe computer software that is both educational and entertaining. According to Buckingham and Scanlon [2], edutainment is “a hybrid genre that relies heavily on visual material, on narrative or game-like formats, and on more informal, less didactic styles of address.” In short, edutainment is an engaged learning process through one or multiple media such as television programs, video games, films, music, multimedia, websites and computer software. In other words, entertainment is the media and education is the content [30]. In addition, the development of an edutainment environment also includes the implementation of technological innovations in education [13].

3.1 Types of Edutainment Edutainment is an evolving alternative to traditional education methods. It can be organized in different ways [27], [30]:
- Location-based edutainment, which can be divided into two categories: interactive & participatory, where children can play and participate in a game, and non-interactive & spectator, where children can just be seated and explore. Examples are movies, science shows, museums and zoos.
- Edutainment by purpose and content, which includes informal education, aimed at improving learners' life control (usually presented in discussion and narrative forms), and skills education, aimed at giving realistic experiences such as simulations.
- Edutainment by target group, which consists of motivation-oriented (learners who share the same interest) and age-oriented (learners of the same age) edutainment.
- Edutainment by type of media: edutainment on TV includes comedic drama, historical drama and sketch comedy; computer edutainment includes game types such as adventure, quiz, role-play, strategy, simulation and experimental drama; edutainment on the Internet includes tele-teaching and tele-learning systems and web-based educational systems; interactive television uses advanced digital television to provide interactivity via software, hardware and connection through the available telecommunication systems.

3.2 The essential characteristics of edutainment

Edutainment uses new technologies that stimulate all the senses of the individual and allows the recreation of the content of the message, both in terms of education and entertainment. Its objective is different from any other form of media consumption. A message, with both educational and entertainment content, has replaced the object in the general interactive scheme. This message's contribution and the consumer’s contribution (his subjective response) together give rise to the individual’s edutainment experience [1]. Hence, two essential characteristics of edutainment are interactivity and delivery of message content. The first characteristic, interactivity, is the ability to respond to a user’s input. In other words, the user can choose the topics of interest and the way they find them, and can obtain almost immediate responses to his queries. The second characteristic is the ability to deliver the message content in a virtual environment [24]. Edutainment users have now become responsible for what they choose to learn. These young-generation users approach this learning experience in a way similar to adulthood learning, which is characterized by independent learning [20].

3.3 Edutainment: Commercial Product With the concept that learning can be fun, and fun can promote learning, more and more entertainment products are marketing themselves using this concept. Edutainment is as much a marketing concept as it is content [30]. There are more and more for-profit types of location-based edutainment businesses around, such as theatres, museums, zoos, aquariums, planetariums, historical sites, as well as children’s edutainment centers. Many different media used for edutainment are also considered commercial commodities. Television programs such as Sesame Street and Barney in the USA, as well as the BBC’s series Teletubbies, are considered commercial products. Books, magazines, audio, video games, and computer games are sold to parents as a kind of commodity [4]. Parents are likely to invest more in their children’s education by purchasing additional educational resources at home. Statistics show that the more educated parents are, the more likely they are to agree with their children’s participation in extracurricular lessons, and the more edutainment products they will purchase. Data collected between 1996 and 2000 from the Census Bureau’s Survey of Income and Program Participation showed that in families where parents had a high school education or less, only 22% of the children participated in extracurricular lessons. On the other hand, children’s participation in these activities increased to 50% in families where parents had a bachelor’s degree [30]. Educational institutes seem to be another big target for edutainment marketing. However, Harvey's study [10] of the market for educational software found that the edutainment market in educational institutes is weak due to limited budgets. The educational institutes are heavily dependent on the use of traditional texts rather than educational software, and they are normally unsure of the technologies that will help to develop educational content. Parents’ interest in edutainment software, however, may decline because they need teachers to help them select the appropriate software for home use; the understanding of the products among teachers, however, is not much different from that of the parents.

4. RELATIONSHIP BETWEEN “LEARN THROUGH DOING” AND “EDUTAINMENT” When computers were first introduced into classrooms in the 1970s and early 1980s, students mostly used the computers for drill and practice. Soon, schools began to network their computers and turned to “integrated learning systems” [3], whose purpose is to direct and coach the student through the learning experience. Since the 1990s, the Internet has become the main research tool. Computers have become popular and widely available to parents and the school system. Educators propose that computer learning in school should be more fun, more productive and more interesting. In some places, educational computer games are included as an additional part of formal learning. Education through the use of computer games is commonly referred to as “edutainment”. The idea behind games is that they make learning fun. Many games offer simulation as a key part of their program. Edutainment provides self-learning through doing and through simulation, as stated by Dunn [6]:


“All of a sudden, learning by doing became the rule rather than exception. Since computer simulation of just about anything is now possible, one need not learn about a frog by dissecting it. Instead, children can be asked to design frogs, to build an animal with frog-like behavior, to modify that behavior, to simulate the muscles, to play with frog.” Both “learn through doing” and “edutainment” include learning and education as their activities. Edutainment relies heavily on a variety of media but mostly refers to computer games. In addition, it is also based on amusement activities. Learn through doing, on the other hand, is a rather broad term. It refers to experiential learning and a hands-on approach which learners have to work through. To understand more about learn through doing, Table 1 shows the related terms and what learn through doing is based on [21].

Table 1: Related terms of “learn through doing”
Related terms: active learning, learn by experience, environmental education, adventure education, outdoor learning, natural learning, learn through activity, challenge education, leisure education, service learning.
Bases on:
- Schools and workplaces at the same time, to allow students to apply classroom learning in the community and workplaces.
- Learning by direct experience and using all the senses.
- Connecting classroom theory with practice in the real world.
- Engaging students in an experience that will have real consequences.

The comparison of “learn through doing” and “edutainment” in terms of general concept, activity, fundamental skill, technology, commercial concept, visual material, working environment, interactivity and delivery is shown in Table 2 [9], [16], [22], [28], [31]. The effectiveness of learn through doing and edutainment [9], [16], [22], [28], [31] is shown in Table 3.

Table 2: Comparison of “learn through doing” and “edutainment”
- General concept. Learn through doing: process of activity engaging learners in real experience. Edutainment: amusing activities and learning at the same time.
- Activity. Learn through doing: hands-on approach, actively exploring, touch, imitate, practice, experience, working through. Edutainment: explore, plan, imagine, problem-solving, create, dramatize, use logic, think, visualize, discover.
- Fundamental skill. Learn through doing: memory, construct knowledge, thinking, reflection, understanding. Edutainment: memory, self-regulation, reflective thinking & metacognition, abstract thinking & imagination.
- Technology (digital environment). Learn through doing: with or without. Edutainment: mostly required.
- Commercial concept. Learn through doing: with or without. Edutainment: mostly with.
- Visual materials. Learn through doing: with or without. Edutainment: always required.
- Working environment. Learn through doing: independent/group. Edutainment: independent/group.
- Places. Learn through doing: indoor or outdoor. Edutainment: mostly indoor.
- Interactivity. Learn through doing: with or without. Edutainment: always required.
- Delivery. Learn through doing: not necessary. Edutainment: mostly.

Table 3: Effectiveness of “learn through doing” and “edutainment”
- Social behavior: self-control, more positive social interactions and companionship, more altruistic behavior, less stereotyped views of others, cooperating, helping, sharing, solving social problems, understanding their life experiences, ability to take turns, negotiate, compromise, work out conflict, develop skills in leadership & management.
- Cognitive development: memory, creativity & divergent thinking, constructing knowledge, extending skills of mathematical reasoning, basic skills such as counting, reading, and writing.
- Intellectual development: resolving problems, understanding how things work, devising strategies.
- Emotional development: love, caring, empathy, curiosity, focusing attention on a task, lower anxiety.
- Physical development: developing gross muscle control, eye-hand coordination, coordination of movement & speed, a critical precursor to reading and writing skills.
- Therapeutic effects: health care (learning good eating habits from a computer game), hyperactivity (active doing may reduce impulsivity), brain development (increased neural structure).
- Educational development: provides for the education and career development of the learners, supporting professional development, a holistic approach which incorporates physical activity together with social & emotional challenges, student-centered teaching and learning, connecting classroom theory with practice in the community, engaging students in an experience that will have real consequences.

5. DISCUSSION In traditional education, teachers set out the analysed and synthesised knowledge to be learnt before the students. They act as suppliers of knowledge and transfer it to the students as passive recipients. Education, however, is not an affair of “telling” and being told, but an active constructive process. Researchers [8], [26] have made many efforts at progressive education reform. Since learn through doing, or experiential education, is one of the educational strategies that connects classroom theory with practice in the real world, it is suggested that it should be integrated into the school curriculum. Learning through doing uses various tools like games, simulations, role plays, stories and other activities in classrooms. It changes the way the teachers and students view knowledge. Knowledge becomes an active part and no longer just depends on text and pictures on a page where students have to learn through reading. Students become knowledge creators (for themselves) as well as knowledge gatherers. When students are active learners, their attempts often take them outside the classroom. The teacher becomes an active learner too, experimenting together with the students, reflecting on and responding to the students’ reactions to the activities. Both teacher and students become active learners. However, the idea of integrating learn through doing into formal education is opposed by several professionals. There are three reasons why learn through doing is not accepted as a form of formal education [23]. Firstly, it is rather troublesome to utilize without “doing devices”. History or literature, for example, is difficult to teach by doing. Even where “doing devices” are available for particular subjects, the necessary equipment is too expensive or there is no equipment at all. Secondly, educators do not really understand why learn through doing works, and thus are averse to relying on it. Thirdly, learning by doing is natural learning. People learn based on their needs. Thus motivation is not a problem, but school has no natural motivation associated with it: students go there because they have no choice. In addition, in the workplace, if a new employee wants to learn about his/her job, the best way is simply to let him/her learn by doing. However, sometimes it is hard, complex, sturdy or dangerous to learn from direct experience. Employers then have to offer intensive and comprehensive training courses, and the investment in training employees to allow learning by doing is costly. One likely answer is the use of simulation for training, which could also be employed in education environments such as schools and colleges. Can students enhance learning by doing through computer or video games? This is another unceasing controversy. On one hand, a number of research studies show a positive impact of education technology on student achievement. For example, Kulik [15] used a research technique called meta-analysis to aggregate the findings from more than 500 individual research studies of computer-based instruction. The result shows that, on average, students who used computer-based instruction scored at the 64th percentile while students in conditions without computers scored at the 50th percentile. Sivin-Kachala [25], who reviewed 219 research studies from 1990 to 1997 to assess the effect of technology on learning, found that students in technology-rich environments experienced positive effects on achievement in all major subject areas. The effects of simulation and higher-order thinking technologies were assessed by Wenglinsky [29] on a national sample of eighth graders' mathematics achievement; students who used simulation and higher-order thinking software showed gains in math scores of up to 15 weeks above grade level as measured by NAEP. On the other hand, computer games also have their disadvantages. Basically, they affect players’ health. Common complaints found among children obsessed with games are eye strain, wrist, neck and back pain, headaches, hallucinations, nerve and muscle damage, obesity, etc. Moreover, they could also cause social problems, resulting in players becoming shy and introverted. Many studies have been done on computers and education. These studies contradict each other, some saying that computers have raised test scores and others saying that they have lowered them. Everything has two sides, advantages and disadvantages. Computer games are no exception. Edutainment software is a great idea, if it is used wisely in the correct manner.

6. CONCLUSION “Learn through doing” and “edutainment” are essential areas that both include learning and education as their activities. While learn through doing is a rather broad term, mostly referring to experiential learning and a hands-on approach, edutainment relies heavily on multimedia and mostly refers to computer games. Both are widely accepted for integration into formal and informal education environments. However, some educators strongly disagree with this idea, and the question of whether learning by doing through computer games can enhance student behaviour and study continues to be argued. This paper has presented an analysis of two areas, “learn through doing” and “edutainment”; their distinctions and common ground have been presented. It is hoped that the general guidelines have provided some suggestions to developers of edutainment systems. Our research is ongoing in determining an empirical model governing the factors of success for such systems.

REFERENCES
[1] Adam, M. and Moussouri, T. The interactive experience: Linking research and practice. In Proceedings of International Conference on Interactive Learning in Museums of Art and Design (London, 2002). Victoria and Albert Museum.
[2] Addis, M. New technologies and cultural consumption - edutainment is born, European Journal of Marketing, Bradford, 39, 7/8 (2005), 729-736.
[3] Brush, T. and others. Design and delivery of integrated learning systems: Their impact on student achievement and attitudes, Journal of Educational Computing Research, 21, 4 (1999), 475-486.
[4] Buckingham, D. and Scanlon, M. Selling learning: Towards a political economy of edutainment media, Media, Culture and Society, 27, 1 (2005), 41-58.
[5] Cantor, J. Experiential learning in higher education: Linking classroom and community, ASHE-ERIC Higher Education Report series 95-7, 24-7 (2003).
[6] Dunn, R. and Dunn, K. Teaching elementary students through their individual styles: Practical approaches for grades 7-12. Boston, Allyn & Bacon, 1993.
[7] Farne', R. Pedagogy of play. Topoi, 24 (2005).
[8] Goodlad, J. A place called school: Prospects for the future. NY, McGraw-Hill, 1984.
[9] Gros, B. The impact of digital games in education. First Monday, 8, 7 (July 2003).
[10] Harvey, J. The market for educational software. Rand Corp., Santa Monica, CA, 1995.
[11] Henry, J. Meaning and practice in experiential learning. In Weil, S. and McGill, I. (Eds.) Making sense of experiential learning, SUHE & OU Press, Milton Keynes, 1989.
[12] Hornback, B. Brain's blog. From http://brainhornback.blogspot.com/2005_06_1_brainhornback_archive.html
[13] Hussain, H. and Eshaq, A. The design framework for edutainment environment. In Behesthi, R. Advances in Design Sciences & Technology. Europia Publication, France, 2001, 81-90.
[14] Jeff, T. and Smith, M. Informal education: Conversation, democracy and learning, Tichnall, Education Now, 1996.
[15] Kulik, J. Meta-analytic studies of findings on computer-based instruction. In Baker, E. and O'Neil, H. (Eds.) Technology assessment in education and training. Hillsdale, N.J., Lawrence Erlbaum, 1994.
[16] Lindon, J. What is play. National Children's Bureau, London, 2002.
[17] Longstreet, C. Why learning how you learn is important. Society for Hospitality Management, 2005. From http://www.4hoteliers.com/4hots_fshw.php?mwi=812
[18] Loy, B. Experiential education - What is it? Colleges of Distinction. From http://www.collegesofdistinction.com/subpagetemplates/contributorpage.asp?articleid
[19] Luckner, J. and Nadler, R. Processing the experience: Strategies to enhance and generalize learning. Dubuque, IA, Kendall/Hunt, 1997.
[20] Merriam, S. and Caffarella, R. Learning in adulthood: A comprehensive guide. San Francisco, CA, Jossey-Bass, 1991.
[21] Neill, J. Experiential learning & experiential education: Philosophy, theory, practice & resources, 2006. From http://www.wilderdom.com/experiential
[22] Rauterberg, M. Positive effects of entertainment technology on human behaviour. In Jacquart, R. (Ed.) Building the Information Society, IFIP, Kluwer Academic Press, 2004.
[23] Schank, R. What we learn when we learn by doing. (Technical Report no. 60) Northwestern University, Institute for Learning Sciences, 1995.
[24] Shih, C. Conceptualizing consumer experiences in cyberspace, European Journal of Marketing, 32, 7/8 (1998), 655-663.
[25] Sivin-Kachala, J. Report on the effectiveness of technology in schools, 1990-1997, Software Publisher's Association, 1998.
[26] Sizer, T. Horace's compromise. Boston, Houghton Mifflin Company, 1984.
[27] Wallde'n, S. and Saronen, A. Edutainment: From television and computers to digital television. University of Tampere, Hypermedia Laboratory. From http://www.uta.fi/hyper/julkaisut/b/fit03b.pdf
[28] Wasserman, S. Serious players in the primary classroom. Teacher College Press, New York, NY, 1990.
[29] Wenglinsky, H. Does it compute? The relationship between educational technology and student achievement in mathematics. Educational Testing Service Policy Information Center, 1998.
[30] White, R. That's edutainment. Kansas City, MO: White Hutchinson Leisure & Learning Group, 2003.
[31] Zoney, J. The misunderstood learning activity: Play. Teacher Magazine, 17(5): 1-2, March 2005.

Modelling Dynamic Virtual Communities within Computer Games: A Viable System Modelling (VSM) Approach

Stephen Tang
Kolej Tunku Abdul Rahman, Jalan Genting Kelang, Setapak, 53300 Kuala Lumpur, Malaysia
+6016293569
[email protected]

Dr. Martin Hanneghan
Liverpool John Moores University, School of Computing and Mathematical Sciences, Byrom Street, Liverpool, UK
+441512312577
[email protected]

Dr. Abdennour El Rhalibi
Liverpool John Moores University, School of Computing and Mathematical Sciences, Byrom Street, Liverpool, UK
+441512312106
[email protected]

this context to mean providing a variety of interactions within the game world. Though it may sound simple, the task of creating a dynamic game world is usually tedious and costly.

ABSTRACT Designing an immersive and believable dynamic game world is the ultimate goal for game designers. With continuous advancement in hardware technology, there is now an expectation of very high quality visuals within games and the aesthetic aspect of games is no longer a major issue from a technical standpoint. The dynamic, behavioural aspect of games is now presenting a real challenge. As in-game virtual communities are made up of mostly Non Player Characters (NPCs), a high level of dynamism within the game world is important to foster replayability of the game. It is therefore the interactions of, and economies within, the virtual societies in the game world that help instill an immersive and believable experience. In the study of systems theory, Stafford Beer’s Viable Systems Model (VSM) provides an interesting framework which can be used to address the design challenges stated. This paper presents ongoing research carried out by the Serious Games Research Group at the School of Computing and Mathematical Sciences, Liverpool John Moores University, in an attempt to provide a technology framework which can assist in designing virtual communities within games from a cybernetics standpoint.

Dynamism within a game world exists not only in MMOG games but also in many other game genres such as Real Time Strategy (RTS), Simulation, Sport, Adventure, Action, and others. Therefore providing a conceptual tool that would assist game designers (especially inexperienced designers) in creating such an aspect would be a great advantage and would contribute to the success of the game. NPCs constitute a huge part of the game world, forming a virtual ‘community’ that often dramatically affects the dynamism of the game world with which the game players interact. Thus it is valid to state that modelling a virtual community within the game world promotes dynamism only if the NPCs are viable systems and the virtual society maintains the viability of its existence. Such viable characteristics within the community can be driven by various operational and management roles within the game world to fit the gameplay purpose. Cybernetics provides a set of concepts which can be used to model a functional society through the philosophical study of design, purpose, directive principle, or finality in nature or human creations [6, 7]. Within this discipline, the Viable System Model (VSM), a model of organizational structure created by Stafford Beer, complements our research needs. In this paper, we address the issue highlighted and present our approach to game design, specifically the modelling of virtual societies, with the aim of introducing and possibly increasing dynamics within the game world. In Section 2, we briefly introduce the core concepts of the VSM before extending the discussion to the existence of virtual communities and an application of the VSM to designing virtual communities in games in Section 3. Conclusions and future work for this research are then presented in Section 4 and Section 5 respectively.

Categories and Subject Descriptors K.8.0 [Personal Computing]: General - Games

General Terms Design, Theory.

Keywords Game Design, Viable System Model, Cybernetics, Massively Multiplayer Games.

1. INTRODUCTION Many computer games are designed based upon the history and fantasy of a living community, represented in iconic figures to mark its existence within the game world. With the rise of the Internet and 3D graphics, game designers have begun to exploit such technology for a new dimension of gameplay, and thus the Massively Multiplayer Online Game (MMOG) genre was brought to market. Games such as the Neverwinter Nights series [1], Ultima Online series [2], Everquest series [3], GuildWars [4] and World of Warcraft (WoW) [5] are examples of successful MMOG titles.

2. THE VIABLE SYSTEMS MODEL (VSM) The VSM was created by Stafford Beer, the world's most successful VSM practitioner, as a conceptual tool for understanding organizational structure. It does this through the study of interactions which produce shared communication with a particular structure and hence realises the potential of individuals within such an organisation to act autonomously in response to a changing environment for the survivability of the whole, thus contributing to the overall effectiveness and efficiency of the organization [8, 9].

The games mentioned above share a common design challenge: dynamism in the game world. We use the term “dynamism” in


2.1 Viable Systems Viable systems are systems that operate on their own and are capable of their own survivability within the environment they reside in [8, 9]. Viable systems are capable of problem solving and are autonomous within their changing environment. A human on their own is a viable system, and the definition also applies to groups of humans working collectively to achieve shared visions, and even to a larger context such as an organisation.

2.2 Environment, Metasystem and Operation
The VSM represents the overall balanced system as three elements [8]:

- Environment – the external environment with which the viable system is interacting.

- Metasystem – the control mechanism of the viable system, which is capable of providing identity, managing the operation elements, and adapting to change in the environment.

- Operation – the action mechanism of the viable system, which has direct interaction with the environment. Such operation can be organised in smaller units to meet the demands of a specific environment.

These elements can be organized into a basic VSM diagram as shown in figure 1 (below). The environment element is represented by the free-form amoeba-like shape, whereas metasystem and operation are represented by a diamond and a circle, adhering to Beer’s convention.

Figure 1: A basic VSM diagram

2.3 The Five Systems within a Viable System
The metasystem and operation elements within such a system view then form a viable system serving an environment. The viable system can further be decomposed into System I to System V [8, 9], each with different functionality:

- System I (Implementation) – responsible for delivering products or services to the environment. System I can consist of subunits which act as viable systems individually.

- System II (Coordination) – represents the information channels within System I. It is responsible for the coordination of overall System I activity, ensuring stability.

- System III (Control) – represents the structures and controls responsible for monitoring and management with the aim of optimizing the ongoing activities in System I.

- System IV (Intelligence) – responsible for obtaining direct feedback from the environment to adapt to changes and promote long-term viability within the whole.

- System V (Policy) – responsible for policy-related decision-making governing the overall context, thus providing governance and closure to the viable system.

As illustrated in figure 1, System II to System V represent the functions of the metasystem element of a viable system, while System I forms the operation element as described in section 2.2. System I can further be represented by smaller units of viable systems, as illustrated in figure 2. Such decomposition can also be applied to System II, System III, System IV and System V wherever necessary and applicable.

Figure 2: Viable systems within System I

2.4 Classical Cybernetics Revisited
Cybernetics was made known by Norbert Wiener in the late 40’s in his study of teleological mechanisms [10]; the philosophical study of design, purpose, directive principle, or finality in nature or human creations [6]. Such study was then used as a foundation for the creation of the VSM. Cybernetics uses mathematical models to represent the adaptive process of a living organism within a changing environment, explainable in three parts: mechanism, variety and regulation. We shall present brief concepts of cybernetics based on Ashby’s writing “An Introduction to Cybernetics” [7], which will aid in explaining our approach to modelling virtual societies using the VSM.

The concept of change is fundamentally essential for cyberneticians as it leads them to further investigate the operations within objects that cause change to take place. Change experienced by an object is caused by the operator, which transforms the initial state of the object (operand) to a new state of the object (transform). Properties of the object may have changed to form a new state for the object during transition. Such a new state usually exists within the set of states associated with that object under the transition defined. A set of transitions under the same operator can then be defined as a transformation. Though change is continuous in its natural form, it is important that the value of differences is discretely recordable. To remain closed, a transformation shall not produce a new state that does not exist within the set of available states. A set of interrelated transformations can be grouped to form a system serving a specific purpose. A state of equilibrium is achieved when a system is left to run for a certain duration and ends either at a final state or a cycle of states. Such closure can be defined as stability, which can be observed within any system. Thus change can be expressed mathematically in algebraic equations, matrices, kinematic graphs or finite state machines. Every aspect of change can also be further decomposed into smaller units for more elaborate representation; transformations can be grouped into sets of transformations, and states can be represented as vectors instead of single values.
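As a rough illustration of these ideas (a sketch of our own, not anything prescribed by Beer or Ashby), the snippet below treats an operator as a closed mapping from operands to transforms and iterates it until the system settles into an equilibrium state or cycle; the example NPC states and the transformation itself are illustrative assumptions:

```python
# A transformation is a closed mapping from each state (operand) to its
# successor state (transform); iterating it models change under one operator.
def iterate(transformation, state, steps=20):
    """Apply the operator repeatedly and record the trajectory of states."""
    trajectory = [state]
    for _ in range(steps):
        state = transformation[state]          # operand -> transform
        trajectory.append(state)
    return trajectory

# Illustrative transformation over a small, closed set of NPC states.
npc_transformation = {
    "idle": "patrol",
    "patrol": "alert",
    "alert": "patrol",   # patrol and alert form a cycle: an equilibrium cycle of states
}

print(iterate(npc_transformation, "idle", steps=6))
# ['idle', 'patrol', 'alert', 'patrol', 'alert', 'patrol', 'alert']
```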

3.2 Dynamism in Virtual Community Dynamism within a community is driven by economic activities from a financial standpoint. From a social science perspective, dynamism can also be created through social interaction. During the process of interaction, an exchange of values between objects causes a change in the objects involved. Thus, interactions between smaller units within the community can have a direct impact on the dynamism of the community. It is important that such change is recorded and presented in suitable ways to signify transformation, adhering to the concept of change in cybernetics. This is very relevant to computer games, as game players should be notified of changes in an object or character either visually or aurally (or both).

3.3 Meta-properties of a Viable System within a Virtual Community The cause of dynamism is understood through the notion of change via in-game interactions, which alter the properties of a character and give it a unique identity. In searching for aspects with which to model a virtual community and represent a character (we consider a character to be a viable system) based on the idea of uniqueness, we represent “community” in a VSM diagram as shown in figure 3. Scientific knowledge areas relating to the context were studied and, based upon the Occam's Razor Principle of Systems and Cybernetics – “one should not increase, beyond what is necessary, the number of entities required to explain anything” [17] – we have identified 8 aspects that present the identity of a viable system (refer to Table 1 for descriptions).

The concept of variety provides a means of presenting information representing a state that is communicated within the mechanism of a system. Providing variety thus implies providing choices of communication with the system. Variety, however, can be constrained to have fewer degrees of freedom, thus limiting the accepted values. Constraints may exist naturally or even exist within the object itself. Variety can be represented as a finite set of values or as probabilities associated with a set of values. The existence of constraints allows prediction and thus promotes learning due to the reduction of the set. An important remark on the concept of variety is Ashby's Law of Requisite Variety, which states: “The larger the variety of actions available to a control system, the larger the variety of perturbations it is able to compensate” [7, 11]. Such a principle serves as a regulation component within a system.
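A toy illustration of requisite variety follows; it is an assumption-laden sketch rather than part of the paper's design, and it assumes the simplest reading of the law: a regulator can only hold outcomes steady if it has at least one distinct response for every distinct disturbance it must counter.

```python
# Toy check of Ashby's Law of Requisite Variety: a regulator's variety
# (number of distinct responses) must be at least the variety of the
# disturbances it has to absorb to keep the system stable.
def has_requisite_variety(disturbances, responses):
    """Return True if the regulator's variety covers the disturbance variety."""
    return len(set(responses)) >= len(set(disturbances))

disturbances = {"price_spike", "npc_shortage", "player_raid"}
responses = {"adjust_prices", "spawn_npcs"}            # only two distinct actions

print(has_requisite_variety(disturbances, responses))  # False: the regulator's variety is too low
```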

3. MODELING VIRTUAL COMMUNITIES USING THE VSM The VSM has been used as a conceptual tool for diagnosing organizational problems [9, 12] and modelling autonomous software systems [13, 14]. Modelled on the human nervous system and classical cybernetics theory to represent an organization, the VSM provides an interesting proposition for game designers and developers to delve into the discipline for greater understanding and to extend its application to designing computer games.

3.1 Virtual Community in Games

Figure 3: Model representing community in VSM context

The term ‘community’ can be described in general as a group of living organisms in a living space. A community consists of societies with different views, laws and structures, and this can easily be identified in the modern computer game genres mentioned above. NPCs within a game world are indeed a community consisting of societies serving different purposes. In an RTS game such as Command and Conquer Generals: Zero Hour [15], troops of soldiers with a variety of weaponry operate as a society with its own views, laws and structures to battle against different societies, thus creating interesting gameplay. Teams of players in a soccer game such as Winning Eleven 9 [16] can also be observed to have this characteristic, but in a slightly different context.

In the biological sciences, the uniqueness of a living organism is determined by its Deoxyribonucleic acid (DNA), which carries the genetic information (genome) of the living organism [18]. Such information decides the appearance, health and action mechanisms that distinctively represent the identity of a living organism [19]. In psychology, a measure of uniqueness can further be derived from personality as a collection of emotions, thoughts and behavioural patterns unique to animals and humans [20]. While sociology provides other measures of uniqueness



derived from societal beliefs, we have generally categorised those measures into ownership of objects and belief. The "8 Aspects of Identity in Virtual Communities" identified here can be used as a template for representing in-game characters. Each aspect denotes a list of characteristics representing a character in detail, and values can be assigned to these characteristics to uniquely represent a character within the community.

Table 1: Description of the "8 Aspects of Identity in Virtual Communities".

Aspect                 Description
Appearance             Visual and aural representation of a character, such as facial appearance and skeletal structure.
Health                 Statistical information which represents the existence of a character, such as strength and heart rate.
Action Mechanism       Set of movements associated with the character.
Emotion                Set of triggers that activate actions associated with the character.
Thought                Intelligence of the character.
Behaviour              A set of reactions to situations encoded in the character, based on historical facts or a belief system.
Ownership of objects   Objects or values which belong to the character, such as money and vehicles.
Belief                 Set of cultural laws and regulations that limit the variety of behaviour of the character.
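As an illustration of how this template might be carried into an implementation, the sketch below encodes the eight aspects as a simple data structure in standard C++. The field names, types and the VSM system tags in the comments are our own assumptions for illustration; they are not part of the original model.

#include <string>
#include <vector>

// Illustrative sketch: one possible encoding of the "8 Aspects of Identity"
// as a character template. Field names and types are assumptions; comments
// note the VSM system each aspect maps to in Table 1.
struct CharacterIdentity {
    // S1 (Implementation)
    std::string appearance;                   // Appearance: visual/aural representation
    std::vector<double> healthStats;          // Health: strength, heart rate, ...
    std::vector<std::string> actionSet;       // Action Mechanism: movements available

    // S2 (Coordination)
    std::vector<std::string> behaviours;      // Behaviour: encoded reactions to situations

    // S3 (Control)
    double intelligence = 0.0;                // Thought: decision-making capability

    // S4 (Intelligence)
    std::vector<std::string> emotionTriggers; // Emotion: triggers that activate actions

    // S5 (Policy)
    std::vector<std::string> beliefs;         // Belief: cultural laws constraining behaviour

    // Environment (outside the viable system)
    std::vector<std::string> ownedObjects;    // Ownership of objects: money, vehicles, ...
};

int main() { CharacterIdentity orc; orc.appearance = "green-skinned grunt"; return 0; }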

Figure 4: "8 Aspects of Identity in Virtual Communities" in VSM context (the metasystem, operation and environment elements of the VSM annotated with the eight aspects)

The VSM can be used at a higher level to model a collection of viable systems of various scales serving specific purposes within a society and community. Roles, rules and goals can be defined in a game design context and mapped onto the operation and metasystem elements of such a complex viable system.

In VSM terms, Behaviour, Thought, Emotion and Belief are S2 (Coordination), S3 (Control), S4 (Intelligence) and S5 (Policy) respectively, while Health, Appearance and Action Mechanism are represented as S1 (Implementation). Items do not belong within the viable system, as an item can be regarded as a viable system or object owned by another viable system within the environment to distinguish itself. It is therefore possible to have viable systems that own items of the same type within a larger context. Similarity in the characteristics possessed by a collection of viable systems can then be identified as a type under the Identity of the Indistinguishable Principle of Systems and Cybernetics – "Two entities that do not have any properties to distinguish them should be seen as a single entity" [21]. The aspects presented merely promote variety within a virtual community in a game, but do not by themselves introduce any dynamism to gameplay. Viability only exists when the regulations within the metasystem are clearly defined, since dynamism within a community requires interactions to be invoked either via beliefs or via needs.

4. CONCLUSION

In this paper we have briefly introduced a cybernetic modelling approach, the Viable System Model, for developing an architectural representation of effectively self-managing complex systems within a changing environment. We extend the use of the VSM from operational management to the modelling of a virtual community for use in computer games, in order to introduce dynamic gameplay and thus increase the entertainment value of such software. The "8 Aspects of Identity in Virtual Communities" presented in Table 1 are a useful guideline for game designers when drafting the characteristics of virtual characters in games, thus promoting variety within the virtual community. The VSM can then be used to aid in defining functional components such as the game AI, game physics and game input routines which affect the properties of the virtual character statistically (game data), which in turn are presented via visual and aural representation (game content), thus introducing dynamics to gameplay. We do not deny that game dynamics can be achieved through the implementation of AI agents coupled to characters; however, we see an opportunity in exploiting the VSM to provide a cohesive game structure leading to greater dynamism. We share the concern of Pangaro [23] that cybernetics and AI are different, and we emphasize that the VSM can be effectively employed for the definition of AI functionalities in computer games (since the origin of AI is cybernetics). The VSM provides strong conceptual guidelines to game designers for designing computer games which involve the modelling of communities, and the recursive application of the VSM at various levels further eases the modelling process and aids understanding of the game 'engine' at both a macro and a micro level, by utilising a common design language for all aspects of the game design.

The metasystem within the viable system is the core for decisions based upon the beliefs of the viable system. Beliefs are rules introduced to a character that form the connection between knowledge and belief. Belief, mapped onto System V, can be a set of finite state machines coupled with a neural network governing the Emotion in System IV and the Thought in System III, based upon a classical psychology theory – Maslow's Hierarchy of Needs – as seen in the work of El Rhalibi et al. [22]. Statistical information gathered can then be represented as probability expressions which decide upon the encoding of such actions as permanent actions within the Behaviour in System II.





5. FUTURE WORK

The work highlighted in this paper serves as a preliminary effort by the Serious Games Research Group to apply the VSM to various aspects of computer game creation. The recursive characteristic of the VSM provides a great opportunity for modelling anything from a tiny portion of a game to a complete engine, recursively, all running from a common set of rules on a single engine. We are currently working toward extending the concepts presented in this paper into a notation for the creation of a modelling tool that would aid game designers in architecting their games. Such a tool could have content and functionality integrated easily into a development environment for the production of computer games, thus providing a technology platform for the easy creation of games. Application of the VSM could then also be extended into the workflow of a game studio, to ease the coordination of designers and developers using such a technology platform.


6. ACKNOWLEDGEMENT


This research is supported by the School of Computing and Mathematical Sciences, Liverpool John Moores University.

7. REFERENCES

[1] Bioware, Neverwinter Nights, From http://nwn.bioware.com/ (Accessed 19/09/2006).

[2] Origin, Ultima Online, From http://www.uo.com/ (Accessed 19/09/2006).

[3] Everquest, From http://eqplayers.station.sony.com/index.vm (Accessed 19/09/2006).

[4] Guild Wars, From http://www.guildwars.com/ (Accessed 19/09/2006).

[5] World of Warcraft, From http://www.worldofwarcraft.com (Accessed 19/09/2006).

[6] Wikipedia, 2006, Cybernetics, From http://en.wikipedia.org/wiki/Cybernetics (Accessed 19/09/2006).

[7] Ashby, W.R., 1956, An Introduction to Cybernetics, Chapman & Hall, London. Electronic edition (1999), From http://pcp.vub.ac.be/books/IntroCyb.pdf.

[8] Espejo, R., 2003, The Viable System Model – A Briefing on Organizational Structure, From http://www.syncho.com/pages/pdf/INTRODUCTION%20TO%20THE%20VIABLE%20SYSTEM%20MODEL3.pdf.

[9] Walker, J., 2006, The VSM Guide: An introduction to the Viable System Model as a diagnostic & design tool for cooperatives & federations, From http://www.esrad.org.uk/resources/vsmg_3/screen.php?page=preface (Accessed 24/09/2006).

[10] Wikipedia, 2006, Teleology, From http://en.wikipedia.org/wiki/Teleology.

[11] Heylighen, F. & Joslyn, C., 2001, The Law of Requisite Variety, From http://134.184.131.111/REQVAR.html (Accessed 25/09/2005).

[12] Leonard, A., 1999, A Viable System Model: Consideration of Knowledge Management, Journal of Management Practice, From http://www.tlainc.com/articl12.htm (Accessed 22/09/2006).

[13] Laws, A.G., Taleb-Bendiab, A., and Wade, S.J., 2001, Towards a Viable Reference Architecture for Multi-Agent Supported Holonic Manufacturing Systems, Journal of Applied Systems Studies, vol. 2, no. 1.

[14] Laws, A.G., Allen, M. & Taleb-Bendiab, A., 2002, Normative Services For Self-Adaptive Software To Support Dependable Enterprise Information Systems, in Proceedings of the 4th International Conference on Enterprise Information Systems (ICEIS 2002), Ciudad Real, Spain, ICEIS Press.

[15] Command And Conquer Generals: Zero Hour, From http://www.ea.com/official/cc/firstdecade/us/index.jsp (Accessed 19/09/2006).

[16] Winning Eleven 9, From http://www.konami.com/gs/gameinfo.php?id=164 (Accessed 19/09/2006).

[17] Heylighen, F., 1997, Occam's Razor, From http://134.184.131.111/OCCAMRAZ.html (Accessed 25/09/2005).

[18] Wikipedia, 2006, DNA, From http://en.wikipedia.org/wiki/Dna (Accessed 23/09/2006).

[19] Haft, D.H., Selengut, J.D., Brinkac, L.M., Zafar, N., and White, O., 2005, Genome Properties: a system for the investigation of prokaryotic genetic content for microbiology, genome annotation and comparative genomics, Bioinformatics 21: 293-306, doi:10.1093/bioinformatics/bti015, From http://bioinformatics.oxfordjournals.org/cgi/content/full/21/3/293 (Accessed 23/09/2006).

[20] Nairne, J., 2003, Psychology: The Adaptive Mind, Wadsworth.

[21] Heylighen, F., 1995, The Identity of the Indistinguishable, From http://134.184.131.111/IDENINDI.html (Accessed 26/09/2005).

[22] El Rhalibi, A., Baker, N., & Merabti, M., 2005, Emotional Agent Model and Architecture for NPCs Group Control and Interaction to Facilitate Leadership Roles in Computer Entertainment, in Proceedings of ACM Advances in Computer Entertainment Technology (ACE 2005), Valencia, Spain, 15th-17th June 2005.

[23] Pangaro, P., 1991, Cybernetics – A Definition, Entry in Macmillan Encyclopedia of Computers, From http://pangaro.com/published/cyber-macmillan.html.


Tuning the board evaluation function using coevolutionary neural networks for a perfect information game Munir H Naveed

Peter I Cowling

University of Bradford MOSAIC Research Centre, Horton Building, Department of Computing, Bradford, UK. +44 (01274) 233948

[email protected]

University of Bradford MOSAIC Research Centre, Horton Building, Department of Computing, Bradford, UK. +44 (01274) 234005

[email protected]

The “Virus Game” [4][5][18] is a two-person perfect information board game of skill. The game is played on a square board. The player who always starts the game is the Black Player and the other player is the White Player.

ABSTRACT

This paper presents an investigation of board evaluation functions using two different coevolutionary models. We use two approaches to coevolve Artificial Neural Networks (ANNs) which evaluate board positions of a two-player zero-sum game (the Virus Game). The first approach starts from an initial population of random ANNs whose members evolve by playing against each other, while in the second approach the random population of ANNs is evolved against a pool of strong hand-crafted (deterministic) AI players. Our experiments show that using strong deterministic AI players to measure fitness is considerably more effective at creating strong game-playing strategies than coevolution under the first approach.

In the Virus Game there are two kinds of moves available on each turn. The first is the grow move, or one-step move, in which a player moves a piece of his colour to an empty position adjacent to its current position; positions are adjacent if their borders or corners touch. This move reproduces the moving piece, which then occupies both the new position and the old position. The second is the jump move, or two-step move, in which a player moves a piece to an empty position two squares away from its current position via an empty square; the piece leaves the old position empty and occupies the new position. In either case, all opposing pieces adjacent to the newly occupied square change colour. Players alternate, moving only one piece per turn. The game ends when neither player can move. The player with the greatest number of pieces is the winner; the game is declared a draw if both players have the same number of pieces at the end.
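As a concrete illustration of these rules, the following sketch applies a single move to a board held as an 8x8 array. The board size, cell encoding and function name are our assumptions (the paper does not give an implementation), and the requirement that a jump pass via an empty intermediate square is omitted for brevity.

#include <algorithm>
#include <cstdlib>
#include <array>

// Illustrative sketch of the Virus Game move rules described above.
enum Cell { Empty = 0, Black = 1, White = -1 };
using Board = std::array<std::array<int, 8>, 8>;

bool applyMove(Board& b, int fr, int fc, int tr, int tc, int player) {
    if (tr < 0 || tr >= 8 || tc < 0 || tc >= 8) return false;
    if (b[fr][fc] != player || b[tr][tc] != Empty) return false;

    int dist = std::max(std::abs(tr - fr), std::abs(tc - fc));
    if (dist == 1) {
        b[tr][tc] = player;          // grow move: piece reproduces, old square kept
    } else if (dist == 2) {
        b[fr][fc] = Empty;           // jump move: piece relocates, old square emptied
        b[tr][tc] = player;          // (the intermediate-square check is omitted here)
    } else {
        return false;                // destination not reachable in one move
    }

    // In either case, all opposing pieces adjacent to the new position flip colour.
    for (int dr = -1; dr <= 1; ++dr)
        for (int dc = -1; dc <= 1; ++dc) {
            int r = tr + dr, c = tc + dc;
            if (r >= 0 && r < 8 && c >= 0 && c < 8 && b[r][c] == -player)
                b[r][c] = player;
        }
    return true;
}

int main() {
    Board b{};                        // all squares empty
    b[3][3] = Black;
    applyMove(b, 3, 3, 4, 4, Black);  // example grow move
    return 0;
}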

Categories and Subject Descriptors

I.2.6 [Artificial Intelligence]: Learning – connectionism and neural nets, knowledge acquisition. I.5.1 [Pattern Recognition]: Models – neural nets.

General Terms

Competitive learning was initially explored by Samuel [13] to adjust the parameters of a deterministic evaluation function in a checkers-playing computer program. Tesauro [14] used the temporal-difference learning approach to evolve a backgammon-playing neural network; his TD-Gammon yielded a backgammon program of world-champion strength. Coevolutionary competitive learning has been explored for the Repeated Prisoner's Dilemma (RPD) by Axelrod [2] and Miller [8]. Axelrod evolved RPD-playing strategies using a fixed environment (i.e. eight fixed opponents), while Miller coevolved RPD strategies by playing each strategy against every other strategy, and itself, within a population. According to Miller's results, the best evolved RPD strategies in his work performed well against strong strategies (such as Tit-for-Tat) taken from Axelrod's work. Both Axelrod and Miller used Genetic Algorithms to evolve the RPD strategies. Angeline and Pollack [1] used competitive coevolution with Tic Tac Toe as a testbed and introduced a tournament-based competitive fitness function. Smith and Gray [12] introduced a competitive function with n/2 competitions per generation for a population of size n, and applied coevolution to Othello, evolving the weights of a deterministic evaluation function using a co-adapted GA with explicit fitness sharing. The coevolved evaluation function may not be very strong, but the approach is notable for the formation of stable niches (i.e. stable groups) during the evolutionary process: the individuals of each group in a generation have similar characteristics, and the results show that the individuals within a group continuously evolve during the coevolutionary process. Anaconda [6] is a checkers-playing neural network evolved using competitive coevolution. An initial population of random neural networks coevolves through competition between the current members of the population, where the ANNs that perform best in the current generation survive to reproduce the next generation while the remaining members are discarded. The authors used Evolutionary Programming to coevolve the neural networks in a competitive environment, and the strongest resulting neural network, Anaconda, was rated at expert level according to a tournament conducted at the website www.zone.com.

Algorithms, Performance, Design, and Experimentation.

Keywords Coevolution, fitness function, Artificial Neural Networks, Virus Game.

1. INTRODUCTION

In biology, coevolution occurs if the traits of one species A have evolved due to the presence of a second species B, and vice versa. This natural phenomenon has motivated AI researchers to apply coevolution to solving different types of problems in which two or more entities interact with each other. Coevolution is an unsupervised learning method that requires only a relative measurement of phenotype performance, making it well suited to the game-playing domain. Games continue to be important domains for investigating problem-solving techniques [9]. Games offer tremendous complexity in a computer-manageable form and need sophisticated AI methods to play at expert level. Board games like Chess [3], checkers [6], Othello [15] and backgammon [14] have been used to explore new ideas in AI. In this paper, we use the "Virus Game" as a testbed to explore coevolutionary ideas.




2.2.2 Co-Evo-Det

In this coevolutionary model, neural networks are evolved against a pool of fixed deterministic AI players; evolution continues until at least one neural network in a given population beats all of the fixed opponents, or there is no improvement in the score of the best neural network for 10 consecutive generations. All games in this coevolutionary approach are played at 1-ply by both the neural networks and the fixed opponents. The deterministic AI players are all very different from each other in playing style and strength.

This paper investigates the effectiveness of the fitness functions of two coevolutionary approaches, called Co-Evo-Self and Co-Evo-Det. In Co-Evo-Self, initial populations containing random neural networks are coevolved by playing against each other; the weights of the neural networks are evolved using a Genetic Algorithm. In the Co-Evo-Det approach, a population of random neural networks is coevolved by playing against 10 different deterministic AI players. We are thus able to investigate how effective a strong fitness function is for coevolving a population of game-playing neural networks.

Each neural network in a population of 15 plays as white and as black against the 10 fixed AI players in each generation. The scoring method for calculating the fitness value of each neural network is the same as that used in Co-Evo-Self. At the end of the tournament in a generation, all neural networks are ordered in descending order according to their final score against the 10 fixed AI players.

2. EXPERIMENTAL DESIGN

2.1 Neural Network Design

The neural network design is based on the work of [6][18]. In the initial coevolution experiments, a population of 15 artificial neural networks is created whose weights are generated uniformly at random over [-0.2, 0.2]. Each neural network has two hidden layers: the first contains 40 hidden neurons and the second contains 10. Every hidden neuron has a sigmoid activation function [11], while every ANN has 64 input units and one output unit with a linear activation function. A location-based input encoding scheme [14] is used, in which each input unit is associated with a square of the board: the input value is '1' if the corresponding square holds a black piece, '-1' if it holds a white piece, and '0' if it is empty. When a board is presented to a neural network for evaluation, the output node gives a scalar value which is taken as the value of that board for the player on move (whose pieces are represented by positive values); the higher the value an ANN assigns to a board, the better the board position for the black player. Each neural network therefore represents a Virus-playing strategy.
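The board evaluation described above amounts to a simple feed-forward pass. The sketch below is written in standard C++ (the paper's experiments used C#.Net), and the weight layout and bias handling are our assumptions: 64 location-encoded inputs, hidden layers of 40 and 10 sigmoid units, and one linear output giving the board value.

#include <array>
#include <cmath>
#include <vector>

// Minimal sketch of the board evaluation network: 64 inputs (+1/-1/0 per
// square), hidden layers of 40 and 10 sigmoid units, one linear output.
struct Layer {
    std::vector<std::vector<double>> w;  // w[j][i]: weight from input i to unit j
    std::vector<double> bias;
};

static double sigmoid(double x) { return 1.0 / (1.0 + std::exp(-x)); }

std::vector<double> forward(const Layer& L, const std::vector<double>& in, bool squash) {
    std::vector<double> out(L.w.size());
    for (std::size_t j = 0; j < L.w.size(); ++j) {
        double s = L.bias[j];
        for (std::size_t i = 0; i < in.size(); ++i) s += L.w[j][i] * in[i];
        out[j] = squash ? sigmoid(s) : s;   // hidden layers squash, output stays linear
    }
    return out;
}

// board[i] is +1 for a black piece, -1 for a white piece, 0 for an empty square.
double evaluateBoard(const std::array<int, 64>& board,
                     const Layer& h1, const Layer& h2, const Layer& out) {
    std::vector<double> x(board.begin(), board.end());
    x = forward(h1, x, true);            // 64 -> 40
    x = forward(h2, x, true);            // 40 -> 10
    return forward(out, x, false)[0];    // 10 -> 1, higher = better for black
}

In the coevolutionary setting, each chromosome would then simply be the concatenation of all these weights and biases.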

The top 5 scoring neural networks are selected as parents. Two-point crossover is applied to the selected parents to produce 10 offspring, and the standard Gaussian mutation operator [16] is applied to these offspring. The 10 offspring replace the 10 non-surviving neural networks. The number of generations, crossover rates and mutation parameters are the same as those used in Co-Evo-Self.

3. RESULTS

The Co-Evo-Self experiments were run on a Pentium III 880 MHz machine using C#.Net under Windows XP. The best neural network (i.e. the highest-scoring ANN in the 300th generation) from each coevolutionary experiment is selected to play as black and as white against the 10 fixed hand-crafted deterministic AI players used in Co-Evo-Det. The results of this tournament are shown in Table 3.1. These 10 AI players were unseen by the ANNs during their training. The scores of the coevolved neural networks are calculated using the same scoring scheme used to determine fitness during the coevolutionary process. The results show that the best coevolved neural networks from the different coevolutionary experiments could not beat many of the fixed players. However, the strategies coevolved with higher crossover rates performed better than those evolved with lower crossover rates, and the coevolutionary process with a mutation parameter of 0.05 (the value used in [6][18]) produced stronger neural networks than coevolution with a mutation parameter of 0.1. Furthermore, none of these neural networks could beat the simple piece counter when playing as black or as white against it, and only one coevolved neural network (with crossover rate 2.4% and mutation parameter 0.05) achieved a draw when playing as black against the piece counter.

2.2 Coevolutionary Models

2.2.1 Co-Evo-Self

In the Co-Evo-Self model, a population of artificial neural networks is coevolved by playing successive games between different neural networks of a given population. A Genetic Algorithm is used to evolve the connection weights of the coevolving neural networks; each chromosome is a list of all the connection weights and biases of an ANN. During the coevolutionary process each ANN plays as black and as white against 10 other neural networks selected randomly from the 14 other networks. At the end of each game an ANN scores +1 for a win, 0 for a draw and -2 for a loss. There are a total of 150 games per generation, and the population is arranged in descending order according to the score of each ANN. The top 5 scoring neural networks are selected as parents. Two-point crossover is applied to the selected parents to produce 10 offspring, and the standard Gaussian mutation operator [16] is applied to these offspring, which then replace the neural networks not selected as parents. The coevolution runs for 300 generations. Two crossover rates (1.2% and 2.4%) are investigated with two initial values of the mutation parameter (0.05 and 0.1).
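The selection-and-variation cycle just described can be pictured as a single generation loop. The sketch below is illustrative standard C++ rather than the authors' C#.Net implementation: playGame() is an assumed external function returning +1/0/-1 from the first (black) player's point of view, and the opponent pairing, crossover-rate handling and mutation details are simplified assumptions.

#include <algorithm>
#include <random>
#include <vector>

// One Co-Evo-Self generation: games against random opponents are scored
// (+1 win, 0 draw, -2 loss), the top 5 networks become parents, and 10
// offspring produced by two-point crossover and Gaussian mutation replace
// the rest of the population.
struct Individual { std::vector<double> genes; double score = 0.0; };

int playGame(const Individual& black, const Individual& white); // assumed, defined elsewhere

void oneGeneration(std::vector<Individual>& pop, std::mt19937& rng, double sigma) {
    std::uniform_int_distribution<int> pick(0, static_cast<int>(pop.size()) - 1);
    for (auto& ind : pop) ind.score = 0.0;

    for (std::size_t i = 0; i < pop.size(); ++i)
        for (int g = 0; g < 10; ++g) {
            int j;
            do { j = pick(rng); } while (j == static_cast<int>(i));
            int r = playGame(pop[i], pop[j]);                     // pop[i] plays black
            pop[i].score += (r > 0 ? 1.0 : (r < 0 ? -2.0 : 0.0));
            pop[j].score += (r < 0 ? 1.0 : (r > 0 ? -2.0 : 0.0));
        }

    // Truncation selection: top 5 survive as parents, 10 offspring replace the rest.
    std::sort(pop.begin(), pop.end(),
              [](const Individual& a, const Individual& b) { return a.score > b.score; });
    std::normal_distribution<double> gauss(0.0, sigma);
    std::uniform_int_distribution<int> cut(0, static_cast<int>(pop[0].genes.size()) - 1);
    for (std::size_t k = 5; k < pop.size(); ++k) {
        const Individual& p1 = pop[k % 5];
        const Individual& p2 = pop[(k + 1) % 5];
        int c1 = cut(rng), c2 = cut(rng);
        if (c1 > c2) std::swap(c1, c2);
        Individual child = p1;
        for (int g = c1; g <= c2; ++g) child.genes[g] = p2.genes[g];  // two-point crossover
        for (double& w : child.genes) w += gauss(rng);                 // Gaussian mutation
        pop[k] = child;
    }
}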



Table 3.1. Summary of results of tournament among four coevolved neural networks from Co-Evo-Self and 10 fixed hand-crafted AI players.

Crossover Rate   Mutation Parameter   Score
1.2%             0.05                 -3
2.4%             0.05                 -1
1.2%             0.1                  -16
2.4%             0.1                  -12

The results obtained from Co-Evo-Det are summarized in Table 3.2, which shows the performance of the best of the coevolved neural networks against the 10 hand-crafted AI players. All experiments were run on a Pentium IV 1.2 GHz machine using C#.Net under Windows XP. The Co-Evo-Det results show that the evolving strategies (i.e. the ANNs) converged very quickly to a point in the problem search space where they can beat all of the opponents both as white and as black. Figure 1 shows the fitness profile of an evolving ANN that learns to beat all opponents within 51 generations.

4. CONCLUSION

A comparison of two different coevolutionary fitness functions has been presented, where the first fitness function depends on the strength of the member strategies of the population while the second depends on the strength of a group of deterministic AI players. Our results demonstrate that by making the fitness function more competitive, better generalization can be achieved, and that a more competitive fitness function will help in building stronger computer game players.

Fig 1: Profile of the fitness of a best performing neural network with Co-Evo-Det (fitness plotted against generation).

In future it would be interesting to explore further methods for making stronger fitness functions in coevolution. There is also a need to make the fitness function change according to the changing strength of the evolving strategies. The effectiveness of the genetic operators, along with the fitness function, also needs more investigation.


5. REFERENCES

[1] P.J. Angeline and J.B. Pollack, "Competitive Environments Evolve Better Solutions for Complex Tasks", in Proceedings of the 5th International Conference on Genetic Algorithms (GAs-93), 1993, pp. 264-270.

Table 3.2. Summary of results of tournament among four coevolved neural networks from Co-Evo-Det and 10 fixed hand-crafted AI players.

Crossover Rate   Mutation Parameter   Score
1.2%             0.05                 20
2.4%             0.05                 20
1.2%             0.1                  20
2.4%             0.1                  20

[2] R. Axelrod, "The Evolution of Strategies in the Iterated Prisoner's Dilemma", in Genetic Algorithms and Simulated Annealing, Lawrence Davis (ed.), Morgan Kaufmann, 1997, pp. 32-41.

[3] M. Campbell, A.J. Hoane Jr., F-h. Hsu, "Deep Blue", Artificial Intelligence, Vol. 134, 2002, pp. 57-83.
[4] P.I. Cowling, M.H. Naveed, M.A. Hossain, "A Coevolutionary Model for the Virus Game", in Proceedings of the IEEE Symposium on Computational Intelligence and Games 2006, Reno/Lake Tahoe, University of Nevada, USA, 22-24 May 2006.

Table 3.2 reveals that the best evolved neural networks in Co-Evo-Det always beat all ten of the opponents that were used in the fitness measure during coevolution. Since the ten opponent AI players are very different from each other, a coevolved neural network that beats all of them as both black and white has, in effect, generalised over all of the strategies those opponents employ. To see whether these coevolved neural networks had merely memorised the moves needed to beat the 10 opponents, or had also learned to beat unseen opponents, we arranged another tournament between them and previously unseen fixed AI players. The coevolved neural networks always beat the unseen players, which the best coevolved neural network from Co-Evo-Self could not do in any of its 10 games against them. These results indicate that the networks coevolved against 10 fixed strategies not only learn the moves to beat their trainers but also learn to beat opponents they did not see during the coevolutionary process. We also conducted a tournament between the two best coevolved neural networks from Co-Evo-Self and the two best from Co-Evo-Det, in which the networks from the second approach beat all of the networks from the first approach as both black and white.

[5] P.I. Cowling, "Board Evaluation for the Virus Game", in Proceedings of the IEEE 2005 Symposium on Computational Intelligence and Games (CIG'05), Graham Kendall and Simon Lucas (editors), 2005, pp. 59-65.


[6] D.B. Fogel and K. Chellapilla, "Verifying Anaconda's expert rating by competing against Chinook: experiments in co-evolving a neural checkers player", Neurocomputing, Vol. 42, 2002, pp. 69-86.
[7] C. Igel and M. Husken, "Empirical Evaluation of the Improved RPROP Learning Algorithms", Neurocomputing, Vol. 50C, 2003, pp. 105-123.
[8] J.H. Miller, "The Coevolution of Automata in the Repeated Prisoner's Dilemma", Journal of Economic Behavior and Organization, Vol. 29, 1996, pp. 87-112.
[9] D.E. Moriarty and R. Miikkulainen, "Discovering Complex Othello Strategies Through Evolutionary Neural Networks", Connection Science, Vol. 7, No. 3, 1995, pp. 195-209.
[10] M. Riedmiller and H. Braun, "A Direct Adaptive Method for Faster Backpropagation Learning: The RPROP Algorithm", in Proceedings of the IEEE International Conference on Neural Networks, 1993, pp. 586-591.
[11] D.E. Rumelhart, J.L. McClelland and the PDP Research Group, "Parallel Distributed Processing: Explorations in the Microstructure of Cognition", Vol. 1, MIT Press, 1986.
[12] R.E. Smith and B. Gray, "Co-Adaptive Genetic Algorithms: An Example in Othello Strategy", in Proceedings of the Florida Artificial Intelligence Research Symposium, 1994.
[13] A.L. Samuel, "Some Studies in Machine Learning Using the Game of Checkers", IBM Journal of Research and Development, 1959, pp. 211-229.
[14] G.J. Tesauro, "Temporal Difference Learning and TD-Gammon", Communications of the ACM, Vol. 38, No. 3, 1995, pp. 56-68.
[15] M. Buro, "The Othello Match of the Year: Takeshi Murakami vs. Logistello", ICCA Journal, Vol. 20, No. 3, 1997, pp. 189-193.
[16] X. Yao and Y. Liu, "Fast Evolutionary Programming", in Proceedings of the 5th Annual Conference on Evolutionary Programming, 1996, pp. 451-460.
[17] M. Mitchell, "An Introduction to Genetic Algorithms", MIT Press.
[18] Cowling, P.I., Naveed, M.H., and Hossain, M.A., "A Coevolutionary Model for the Virus Game", in Proceedings of the IEEE Symposium on Computational Intelligence and Games 2006, University of Nevada, Reno, Nevada, USA.


A Serendipitous Mobile Game Helen Clemson

Paul Coulton

Reuben Edwards

Infolab21 Lancaster University, Lancaster LA1 4WA, UK

Infolab21 Lancaster University, Lancaster LA1 4WA, UK

Infolab21 Lancaster University, Lancaster LA1 4WA

[email protected]

[email protected]

[email protected]


ABSTRACT

Whilst a number of location-sensing games are emerging for mobile phones, there are few examples of social proximity based games that are effectively position independent and whose game play is invoked by serendipitous encounters. In this research we have extended the existing mobile phone phenomenon of Bluejacking, which allows spontaneous social encounters, into the design of a novel mobile game that instigates an exciting ludic scenario. The game is called 'mobslinger' and is based on the premise of a Wild West, quick-draw 'shoot-em-up'. In this game the phone continuously scans for other game players and, when one is found, instigates a 'shootout' where players 'draw' their phone and 'shoot' their opponent. This paper presents the design, and subsequent implementation, of the game, which takes account of the limitations of the platform whilst maintaining the social nature of phone use to successfully create serendipitous social encounters.

One particular feature seen to drive innovation is location sensing, and thus location-based games are, despite initial skepticism, increasingly being seen as an important new genre. We have already seen a small number of implementations of location-based games for mobile phones [11] and these numbers are increasing all the time. Some of these games use proximity, in the sense that the user's location is close to either another player or a real or virtual artifact within the game. However, none rely solely upon the proximity between either a player and another player, or the player and a game artifact, irrespective of the physical location. The most notable example of this type of game play, although not mobile phone based, is Pirates [3], an adventure game using Personal Digital Assistants and RF proximity detection. Players completed piratical missions by interacting with other players and game artifacts using RF proximity detection, and one of the interesting aspects that emerged was the stimulated and spontaneous social interaction between the players. One of the research motivators for this project was to produce a game that provides similar interaction, but of a more serendipitous nature, for mobile phone users. We believe this serendipity can best be achieved by removing the requirement for a central game server, as utilized in Pirates, and instead using a proximity detection scheme that initiates a dynamic peer-to-peer connection between mobile users when they enter into the close vicinity of each other, requiring no prearrangement for the meeting.

Categories and Subject Descriptors: J. Computer Applications, J.7 Computers in Other Systems-Consumer General Terms: Design, Experimentation, Human Factors. Keywords: Mobile, Bluetooth, Social Software, Mobile Games.

1. INTRODUCTION

With mobile phone subscriptions growing from just over 11 million in 1990 to an expected 3 billion by 2008 [10], there can be little doubt that mobile phones are becoming an integral part of the fabric of our daily lives. However, the majority of phone use is still voice centric, and in many countries, although data services are growing [8], many manufacturers and operators are still trying to identify applications and services that excite the user. Mobile games are already a success story, with revenues exceeding €1.7 billion in 2005, although traditional games publishers have displayed a relative weakness in leveraging their success from the console market [12]. This is no doubt due to the fact that games on mobile phones appeal to a much wider consumer demographic, who have shown little interest in the driving and first-person shoot-em-up titles that dominate the console market, and arguably have more in common with those players utilizing games online or through their TV [2]. This means there are opportunities for researchers and developers to explore new game genres that target the unique features emerging on the mobile phone platform.

In terms of proximity detection the obvious choice is Bluetooth, which despite previous predictions of its demise is in fact increasing its growth, with Nokia predicting a year-on-year increase of 65% in 2006. There are already a small number of mobile Bluetooth proximity applications, often described as Mobile Social Software (MoSoSo), which can be viewed as evolutions of Bluejacking. This is a phenomenon where people exploit the contacts feature on their mobile phone to send messages to other Bluetooth enabled devices in their proximity [9]. Bluejacking evolved into dedicated software applications such as Mobiluck (www.mobiluck.com) and Nokia Sensor (www.nokia.com/sensor) which provided a simpler interface and, in the case of Nokia Sensor, individual profiles that could be used to initiate a social introduction. More complex systems



such as Serendipity [7] have tried to introduce a greater degree of selection into the process, in that a server tries to match users before making the introduction. There are relatively few games that have exploited Bluetooth in this way, but one such is You Who (www.age0.com/you-who/) from Age0+, which provides a simple game premise to help initiate a meeting. After scanning for other users running the application and 'inviting' a person to play the game, the first player acts as a 'mystery person' who provides clues about their appearance to the second player, who builds up a picture on their mobile phone screen. After a set number of clues have been given, the players' phones alert, revealing both players' locations and identities. Obviously the game play is quite limited and effectively non-competitive, which is unlikely to result in repeated play; it is therefore closer to Nokia Sensor than to a game.

The game described in this paper provides significant opportunities for serendipitous social interaction combined with addictive and competitive game play. The game draws from the familiar, which is always a good way of gaining acceptance for a new game, using the concept of a Wild West quick-draw gunfighter, which we have called 'mobslinger' [5], as shown in Figure 1.

Figure 1. mobslinger splash screen on a Nokia 3230

In the forthcoming sections, we shall explore the design and implementation of the game, provide a rationale for the choices made and highlight the design challenges overcome, particularly in regard to the Bluetooth implementation. Finally, we shall define future enhancements that will allow large scale user experiences to be evaluated, before drawing our overall conclusions.

2. GAME DESIGN

The basic game premise is simple to understand and operate, which is an essential feature in any game [1]. The relatively low cost of mobile phone games means they are often very quickly discarded by users if they cannot quickly engage with game-play [6]. Mobslinger runs as a background application on Symbian S60 smartphones which periodically scans for other users in the vicinity who are also running the mobslinger application. Once another player is detected, a countdown timer is initiated on both phones, which alert the user by sounding an alarm and vibrating. The user then has to 'draw' their mobile and enter, as quickly as possible, the randomly generated number which has appeared on the screen. The person with the fastest time is the winner, and the loser is 'killed', which means their application is locked out from game-play for a set period of time. The overall game logic is shown in Figure 2, although the game is playable in a number of different modes, which we discuss in the following paragraphs in terms of both their operation and their relative merits.

Figure 2. mobslinger state diagram

2.1 Quick Draw

This is the basic mode of the game for two mobslingers at a time. Once the phones have detected each other's presence, the first mobslinger to 'draw' their mobile and enter the correct code wins, as shown in Figure 3.

This mode is intended to promote serendipitous social interaction between two players, who would generally be unknown to each other, in the form of a ludic greeting. The loser is locked out of the game for a set period and unable to interact with other players. This may be a disadvantage for larger social groupings and has therefore been addressed in a subsequent mode. The optimum length of the lock-out is something that can only truly be ascertained from large scale user feedback; at present we have fixed it at 2 hours.

2.2 Blood Bath

This is the large scale battle mode (commonly referred to by the design team as playground mode), in which two or more mobslingers score points by beating randomly selected targets to the draw over a set time period (note: in this mode there is no lockout period). After the time period has expired the game ends and the mobslinger's high score is displayed on the phone. The mobslingers can then compare scores to ascertain the winner



and allows them an element of schadenfreude (amusement at the misfortune of others). Because of the intensity of the game play, the constant noise may be irritating in some social situations, and this mode is probably best played outside or in a regimented social setting. This mode does not have the spontaneity of Quick Draw and needs to be initiated by a preformed social grouping.

2.5 Top Gun

We have also included a ranking system in the game to satisfy the desire of more experienced players to differentiate themselves from novices. The basic player starts out with a one-star rating and this is increased in stages until they reach a 5-star rating, as shown in Figure 3. The levels are based upon games played, kills made and the player's kill/die ratio; each advancement level requires a defined minimum for each of these criteria.

Figure 3. mobslinger 'Draw' screen in duel and outlaw modes
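A ranking rule of this kind reduces to checking a player's statistics against per-level thresholds. The sketch below is purely illustrative: the threshold values and the exact combination of criteria are invented for the example, since the paper does not publish the ones used in mobslinger.

// Illustrative star-rating rule based on games played, kills and kill/die
// ratio. The threshold values are invented for illustration only.
struct PlayerStats { int gamesPlayed; int kills; int deaths; };

int starRating(const PlayerStats& s) {
    // Minimum games, kills and kill/die ratio required for each star level (assumed values).
    const int    minGames[5] = {0, 10, 25, 50, 100};
    const int    minKills[5] = {0,  5, 20, 50, 120};
    const double minRatio[5] = {0.0, 0.5, 0.8, 1.0, 1.5};

    double ratio = s.deaths > 0 ? static_cast<double>(s.kills) / s.deaths
                                : static_cast<double>(s.kills);
    int stars = 1;
    for (int level = 1; level < 5; ++level)
        if (s.gamesPlayed >= minGames[level] && s.kills >= minKills[level] &&
            ratio >= minRatio[level])
            stars = level + 1;
    return stars;   // 1 (novice) to 5 (Top Gun)
}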

3. GAME IMPLEMENTATION

Having defined the design and required operation of the game, we shall now discuss the implementation, which presents a number of challenges despite the apparent simplicity of the game premise. The basic requirement for the game is that there are two or more devices equipped with Bluetooth receivers, and the implementation issues are:

- how are they to be connected?
- how are they to be detected?
- and how does one device ensure it only communicates with another device that is running the mobslinger application?

In order to overcome these issues in Symbian, and to gain an understanding of the hardware and software requirements for supporting Bluetooth, it is important to understand the Bluetooth stack. Therefore, in the following section we analyze the Bluetooth stack with reference to the creation of the necessary implementation framework for the mobslinger application.

2.3 Last Man Standing

This is a less intensive battle mode than Blood Bath and is for two or more mobslingers. The aim of the game is to be the 'last man standing' after a series of quick-draw encounters within a group. Duels are triggered at randomly timed intervals, so are likely to prove less irritating for the general public, and create an air of anticipation amongst the mobslingers. The random timing also means that the group can be engaged in the general social proceedings of the evening, with the game effectively providing a humorous aside. The game play may be lengthy, and players knocked out in the early stages may lose interest, although players in search of a more intense experience may opt for Blood Bath mode. As with Blood Bath, this mode requires initiation by a preformed social grouping.

3.1 Bluetooth Stack

The Link Manager Protocol (LMP), Baseband and Radio layers form the hardware component of the Bluetooth stack [4]. The radio layer is the physical Bluetooth radio transceiver, while the baseband layer establishes connections and controls the transmission of data over the radio layer. A number of protocols are required to support Bluetooth communications across the stack, of which the Link Manager Protocol manages the behavior of the wireless link and provides authentication and security services.

2.4 Outlaws

The Host Controller Interface (HCI) is the connection between the hardware and software components of the Bluetooth stack. The Logical Link Control and Adaptation protocol (L2CAP) supports the data connection between upper and lower stack protocols [4]. It receives the data from the application layer, in our example the mobslinger application, and segments the data into a Bluetooth format. It also reassembles the incoming data thus allowing the upper layers to use the received Bluetooth data.
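The segmentation and reassembly role of L2CAP can be pictured with a toy example. The following standard C++ sketch is not the Symbian API and not the real L2CAP packet format; it simply splits an application payload into fixed-size chunks and reassembles them, which is the essence of what the layer does for the mobslinger data.

#include <algorithm>
#include <cstddef>
#include <vector>

// Toy illustration of segmentation and reassembly of an application payload.
std::vector<std::vector<unsigned char>> segment(const std::vector<unsigned char>& payload,
                                                std::size_t mtu) {
    std::vector<std::vector<unsigned char>> chunks;
    for (std::size_t off = 0; off < payload.size(); off += mtu) {
        std::size_t len = std::min(mtu, payload.size() - off);
        auto first = payload.begin() + static_cast<std::ptrdiff_t>(off);
        chunks.emplace_back(first, first + static_cast<std::ptrdiff_t>(len));
    }
    return chunks;
}

std::vector<unsigned char> reassemble(const std::vector<std::vector<unsigned char>>& chunks) {
    std::vector<unsigned char> payload;
    for (const auto& c : chunks) payload.insert(payload.end(), c.begin(), c.end());
    return payload;
}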

This is the team mode of the game where groups of mobslingers join forces to create ‘Outlaw’ gangs who then go out in the hope of combating other teams. The Outlaw gangs are generally comprised of two or more mobslingers and offer a mode of play where there is a greater opportunity for tactics to develop. The game play is similar to Last Man Standing, although the random interval between duels is shorter, and only mobslingers of an opposing set of Outlaws are selected to play against each other. In this mode, challenges are made and accepted between gangs. In Figure 3 we show the ‘Draw’ screen for Outlaw mode where the player rating is replaced by the gang names. In essence this mode can be viewed as preformed social groups engaging in the spontaneous encounters of the Quick Draw mode.


Radio Frequency Communications (RFCOMM) is a transport protocol that emulates the serial ports over the L2CAP protocol. The Service Discovery protocol (SDP) allows the discovery of services, such as applications, which are offered by other Bluetooth devices. Finally the application we develop utilizes these protocols to provide Bluetooth communication between devices.



Service discovery can now be implemented as we have a means of advertising a service and a list of devices that use Bluetooth. The service discovery is implemented by searching for patterns in each device, such as the availability of the service, the port it is advertised on and a description of the service. Once a match is found a connection to the socket server can be made and a new socket opened to allow connection to the device using its address. If the connection is successful data can be sent and received over the socket.

The Symbian OS Bluetooth API provides support for the RFCOMM and L2CAP layers in the Bluetooth stack. It is based on the sockets API which allows a device to act as a server and have remote client devices connected to it. Once connected the two devices may send and receive data then disconnect when they have finished. Before a connection can be made, the Bluetooth device must locate another Bluetooth device with the same service. The Service Discovery Protocol (SDP) will implement this process and in the following section we shall discuss its operation.

3.2.2 Client/Server Architecture

Each device has its own role in the Bluetooth connection process, and we need to determine which functionality each will provide.

3.2 Bluetooth Service Discovery Protocol

The service discovery protocol has two purposes: to discover other devices and their services, and to advertise its own service. Symbian S60 provides two APIs allowing developers to manipulate the SDP; these are:

- the Service Discovery Database,
- and the Service Discovery Agent.

The server is responsible for setting up the security for the connection and advertising a service. A connection to the socket server is made and a ‘listening’ socket is opened. This socket can be used to return a port to listen on, then sit and wait for incoming requests. It is advisable to use some form of security for sending the data over the Bluetooth connection. The S60 API allows the user to set the security in terms of authorization (using a passkey), encryption and authentication.

In the following sections we shall look at how to discover other devices, how to advertise a service and detect a service.

The client device is responsible for searching for the devices and services available. It also initiates the connection between the devices. Data can be sent and received by streaming the data to the socket, which can be stored in a buffer and used accordingly.

3.2.1 Device Discovery

By using the sockets API we can utilize its protocol searching facilities to search for information on the BTLinkManager protocol. This provides information such as the address of the protocol family, the socket type, security information and the message size. A host resolver can then use this information to provide an interface to the host name resolution service, which obtains the name of each device from an address. In essence, the API will search for a device and return the information in the form of either the address or the name of the device it has found.

Now that we have all the means of implementing a Bluetooth application, we can look into the logic behind the mobslinger application.

3.3 mobslinger Game Logic

The game's premise works with two or more players, the mobslingers, who have to draw their phones when a Bluetooth connection is made and 'shoot' the other person. As much of the game logic is similar across the modes, this section will look principally at the logic behind the basic two-player version.

The device searcher needs to find all the available Bluetooth devices in the local vicinity, so the search must keep running until it has found all the available devices. This can be done using Active Objects. An active object is a Symbian OS concept that allows the programmer to handle asynchronous requests in the same thread [13]. We can therefore start the search for a device and then set our thread 'active'. Once the device search completes, it automatically calls a function named RunL() which handles the result. Information on the device found can then be stored and the search started again. Once all the devices have been found, the active object alters its status parameter so that we know the device discovery process is complete. With a list of all the devices stored, we can traverse this list, searching for a service on each device.

Firstly, the game runs in the background of the phone so the user is not aware of the Bluetooth connection process. Once a connection is established, this triggers an event that brings the game to the foreground. A random number is displayed on screen and the player must quickly type it into their phone. Once the player has input the correct key, a message is sent to the other device. At this point we come across the first major design decision, as there is only one socket that can be used to send/receive data, in half-duplex mode. Therefore only one device can send at a time, and each device must be set into a sending or receiving mode so that it can deal with the incoming request. There are two possible solutions to this problem:

The service discovery protocol database allows the publication of available services. Basically we can add a new record to the database with details of the service, in this situation the mobslinger, and allow other devices to access it. As other devices need to access the database, Symbian OS implements the SDP database as a server to which our application must connect and provide a session to. A connection to the SDP server is made before a connection to the database is established. A new record can be created and attributes, such as the name, description and ID of the service can be added. Each of the devices will need to know these attributes.

- create and open another socket to allow simulated full-duplex communication;
- or allow one device to send and the other to work with a timed system.

It was decided that the timer would provide the simplest solution and would work reliably. Once the two devices have connected, only the client is able to send data. The server sits and waits to receive. Therefore, if the



client presses the correct button first, it sends a message to the server and both phones display the end result screen. However, if the server presses first, a message will be displayed on its screen informing the user it is the winner (if it has not already received a message from the client device). The client does not yet know that the server has won, so it uses a timer. The timer is activated once the connection is made, and after 10 seconds (if a key has not been pressed), will display a ‘loser’ note. However, it is possible that neither user has pressed the keys in time so there is a ‘draw’ state. The client device sends a test message to the server asking for its status (if it has pressed a key). The server replies with its state and the appropriate message can be displayed. The game is then reset and sent to the background.
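The client-side decision logic just described can be summarised in a small state sketch. The code below is illustrative standard C++ rather than the Symbian implementation; the message strings, the callback structure and the use of std::chrono for the 10-second timeout are our assumptions.

#include <chrono>
#include <string>

// Simplified sketch of the client-side duel logic: win on a correct key press,
// lose on a server-win message, and after a 10 s timeout query the server's
// state to distinguish a loss from a draw.
enum class Outcome { Pending, Win, Lose, Draw };

class DuelClient {
public:
    explicit DuelClient(std::chrono::steady_clock::time_point connectedAt)
        : start_(connectedAt) {}

    // Called when the player has typed the displayed code correctly;
    // the real application would also send a message to the server here.
    Outcome onCorrectKey() {
        keyEntered_ = true;
        return Outcome::Win;
    }

    // Called when a message arrives from the server.
    Outcome onServerMessage(const std::string& msg) {
        if (msg == "SERVER_WON")  return Outcome::Lose;
        if (msg == "SERVER_IDLE") return Outcome::Draw;   // reply to the status query
        return Outcome::Pending;
    }

    // Called periodically: after 10 s without a key press the client should
    // send a status query to the server rather than deciding on its own.
    bool timedOut() const {
        return !keyEntered_ &&
               std::chrono::steady_clock::now() - start_ > std::chrono::seconds(10);
    }

private:
    std::chrono::steady_clock::time_point start_;
    bool keyEntered_ = false;
};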

6. ACKNOWLEDGMENTS Our thanks to Nokia for the general provision of software and hardware to the Mobile Radicals research group at Infolab21 at Lancaster University which was used in the implementation of this project.

7. REFERENCES

The next challenge faced is deciding which mobile device should be the server and which should be the client. We cannot hardcode a mobile to be of certain type as this will cause incompatibility between many mobslingers. Although it could be that the game is implemented such that a player chooses ‘a side’ to be on (client or server to the device), however for greater compatibility and an ‘every man for himself’ approach and created an alternative solution which was to randomly alternate the phone between client and server mode. A random number is generated to determine which mode to set the device, then another random number is generated to determine the length of time in which to stay in that mode. The cycle continues until a connection is established.

[1] Bateman, C., and Boon, R., 21st Century Game Design, Charles River Media, September, 2005. [2] BBC News, Games firms face challenges ahead, 9th January 2006, http://news.bbc.co.uk/go/pr/fr//1/hi/technology/4587902.stm. [3] Björk, S., Falk, J., Hansson, R., and Ljungstrand, Pirates! Using the physical world as a game Board, In Proceedings of Interact 2001, IFIP TC.13 Conference on Human-Computer Interaction, Tokyo, 2000. [4] Bray,J., and Sturman,C., Bluetooth: connect without cables, Prentice Hall Inc, Upper Saddle River, NJ 07458, 2001. [5] Clemson, H., Coulton, P., Edwards, R, and Chehimi, F., “Mobslinger: The Fastest Mobile in the West”, 1st World Conference for Fun 'n Games, June 26-28, 2006, Preston, UK, pp 47-54.

3.4 mobslinger UI

Mobslinger provides the user with more thrill and excitement than the average mobile application, although the UI is very minimal. Vibrating alerts and sounds are used to entice the user, such as gunshot sounds when the user clicks fire and vibrating alerts when a duel request is received. The graphics are generally not seen, but care has been taken to ensure that they become more elaborate the further the mobslinger progresses in the game, thus giving a ranking system and bragging rights to players through new shoot screens.

[6] Coulton P., Rashid O., Edwards R. and Thompson R. “Creating Entertainment Applications for Cellular Phones”, ACM Computers in Entertainment, Vol 3, Issue 3, July, 2005. [7] Eagle, N and Pentland, A., Social Serendipity: Mobilizing Social Software, IEEE Pervasive Computing, Special Issue: The Smart Phone. April-June 2005. pp 28-34. [8] Ipsos, Mobile Phones Could Soon Rival the PC As World’s Dominant Internet Platform, 18th April 2006, http://www.ipsos-na.com/news/pressrelease.cfm?id=3049.

4. FUTURE DEVELOPMENT

Whilst many mobile games become uninspiring for their developers once the excitement of the design process is complete, this game offers a whole new set of research opportunities. While the game is popular amongst our research group, there are a number of questions related to each of the modes that can only be answered by trials involving significant numbers of users. We are therefore trying to expand the implementation of mobslinger to other mobile platforms, such as UIQ and Windows Mobile, so that we can deploy the software across as wide a range of users as possible. Although many may suspect that J2ME would be an obvious solution, there are significant differences in the level of Bluetooth support between different phone models, and it has been discounted for now. Once a greater number of implementations have been completed we will be able to comprehensively investigate the social interactions the game presents.

[9] Jamaluddin J., Zotou N., Edwards R., and Coulton P., Mobile Phone Vulnerabilities: A New Generation of Malware, IEEE International Symposium on Consumer Electronics, , Reading, UK,ISBN 0-7803-8527-6, ISCE_04_124 pp 1-4, 2004 [10] Niccolai, J., Nokia ups forecast for worldwide mobile phone sales, IDG News Service, March 30, 2006. [11] Rashid, O., Mullins, I., Coulton, P., and Edwards, R., Extending Cyberspace: Location Based Games Using Cellular Phones, ACM Computers in Entertainment, Vol 4, Issue 1, 2006. [12] Screen Digest press release for "Wireless Gaming Business Models: Opportunities for publishers, developers and network operators." 7th December 2005, http://www.screendigest.com/.

5. CONCLUSIONS

[13] Stichbury, J., Symbian OS Explained: Effective C++ Programming for Smartphones, Wiley, October 2004.

Whilst the mobslinger game concept may appear simple, particularly as it does not require significant user input or display complex or flashy graphics, it does provide a uniquely mobile experience. This mobile experience lies not merely in the mobility of the device itself, but also encompasses the social aspects of community and the ability to engage in very playful serendipitous encounters.


Creating Realistic Collaboration for Action Games Ismo Puustinen and Tomi A. Pasanen Gamics Laboratory Department of Computer Science University of Helsinki Finland http://gamics.cs.helsinki.fi

ABSTRACT

Many popular computer games feature conflict between a human-controlled player character and multiple computer-controlled opponents. Computer games often overlook the fact that the cooperation between the computer-controlled opponents need not be perfect: in fact, they seem more realistic if each of them pursues its own goals and any cooperation emerges only from this fact. Game theory is an established science studying cooperation and conflict between rational agents. We provide a way to use classic game-theoretic methods to create plausible group artificial intelligence (AI) for action games. We also present results concerning the feasibility of calculating the group action choice.

Categories and Subject Descriptors I.2 [Artificial Intelligence]: Distributed Artificial IntelligenceMultiagent systems; J.4 [Social and Behavioral Sciences]: Psychology

General Terms Algorithms, Theory

Keywords Game Theory

1. INTRODUCTION

Most action games have non-player characters (NPCs). They are computer-controlled agents which oppose or help the human-controlled player character (PC). Many popular action games include combat between the PC and a NPC group. Each NPC has an AI, which makes it act in the game setting in a more or less plausible way. The NPCs must appear intelligent, because the human player needs to retain the illusion of game world reality. If the NPCs act in a non-intelligent way, the human player’s suspension of


disbelief might break, which in turn makes the game less enjoyable. The challenge is twofold: how to make the NPCs act individually rationally and still make the NPC group dynamics plausible and efficient? Even though the computer game AI has advanced much over the years, NPC AI is usually scripted [10]. Scripts are sequences of commands that the NPC executes in response to a game event. Scripted NPCs are static, meaning that they can react to dynamic events only in a limited way [6]. Group AI for computer games presents additional problems to scripting, since the NPC group actions are difficult to predict. One way to make the NPCs coordinate their actions is to use roles, which are distributed among the NPCs [12]. However, this might not be the optimal solution, since it means only the distribution of scripted tasks to a set of NPCs. Game theory studies strategic situations, in which agents make decisions that affect other agents [2]. Game theory assumes that each agent is rational and tries to maximize its own utility. A Nash equilibrium [7] is a vector of action selection strategies, in which no agent can unilaterally change its strategy and get more utility. A Nash equilibrium does not mean that the agent group has maximal utility, or that it is functioning with most efficiency. It just means that each agent is acting rationally and is satisfied with the group decision. If a NPC group can find a Nash equilibrium, two important requirements for computer game immersion are fulfilled: each NPC’s actions appear individually reasonable and the group seems to act in a coordinated way. This requires that the NPC has a set of goals, which it tries to attain. The NPC gets most utility from the group actions which take it closer to its goals. The NPC’s goals are encoded in a utility function. Defining the utility function is the creative part of utilizing game theory in computer game design, since Nash equilibria can be found algorithmically. In our research we study the creation of well-working utility functions for a common class of computer action games. We note that similar studies have carried out by Levy and Rosenschein [3] for the predator-prey (NPCs-PC) problem domain. However, their paradigm was classical: how to organize movements of the predator agents optimally and precicely when they got near the prey. In our simulation the focus is different. We allow the NPCs to act individually rationally in a way that can be, if necessary, against other agents and their goals. This permits realistic-looking cooperation, which in turn leads to a greater immersion for the player. It is easy to make a NPC in a computer game to be

more dangerous for the PC: the NPC can be made, for instance, faster or stronger. The problem of making the NPCs appear life-like is much more difficult. In addition to this goal, we also consider the problem of finding a suitable Nash equilibrium in reasonable time.

2.

GAME THEORY

Game theory helps to model situations that involve several interdependent agents, each of which must choose an action from a limited set of possible actions. Choosing the action is called playing a strategy. One round of strategy coordination between agents is called a game. The game can be represented as a set of game states. A game state has an action vector with one action from each agent. Thus a game has k^n game states if there are n players in the game with k possible actions each. Each agent gets a utility from each game state: the game states are often described as utility vectors in a matrix of possible agent actions. If an agent's strategy is to play a single action, it is called a pure strategy. Sometimes an agent gets a better expected utility by selecting an action randomly from a probability distribution over a set of actions; this is called a mixed strategy. The set of actions an agent plays with probability x > 0 is the agent's support. As stated before, a strategy vector is a Nash equilibrium only if no agent can unilaterally change its action decision and get a better utility. Nash [7] proved that every finite game has at least one Nash equilibrium. When a pure strategy Nash equilibrium cannot be found, a mixed strategy equilibrium has to exist. If we find a Nash equilibrium, we have a way for all participating agents to act rationally from both an individual and a group perspective.

Finding a Nash equilibrium in a game state search space is non-trivial. Its exact time complexity is not known [8]. The n-player case is much harder than the 2-player case [4]. The problem is that the players' strategy choices are interdependent: whenever one player's strategy changes, the utilities of the other players also change. The problem of finding all Nash equilibria is also much more difficult than the problem of finding a single Nash equilibrium. Knowing whether the Nash equilibrium we have found is good enough also turns out to be quite difficult: determining the existence of a Pareto-optimal equilibrium is NP-hard [1], and even finding out whether more than one Nash equilibrium exists is NP-hard. However, some places in the search space are more likely to contain a Nash equilibrium, and this heuristic is used in a simple search algorithm to find a single Nash equilibrium [9]. The search algorithm is based on the heuristic that many games have an equilibrium within very small supports. Therefore the search through the search space should be started at support size 1, which means pure strategies. In real-world games a strategy is often dominated by another strategy. A dominated strategy gives worse or at most the same utility as the strategy dominating it. Therefore it never makes sense to play a dominated strategy. The search space is made smaller by using iterated removal of dominated strategies before looking for the Nash equilibrium in the support.
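To make the search concrete, the sketch below (illustrative code, not taken from the paper; all names are invented) enumerates the k^n joint actions and tests each one for a pure-strategy Nash equilibrium, which is the support-size-1 case the simple search algorithm [9] tries first. Iterated removal of dominated strategies would shrink the action sets before such an enumeration.

#include <functional>
#include <vector>

// A joint action assigns one action index to each of the n agents.
using JointAction = std::vector<int>;
// utility(agent, joint) returns that agent's payoff for the joint action.
using UtilityFn = std::function<double(int, const JointAction&)>;

// True if no agent can improve its payoff by unilaterally deviating,
// i.e. the joint action is a pure-strategy Nash equilibrium.
bool isPureNash(const JointAction& joint, int numActions, const UtilityFn& utility) {
    for (int agent = 0; agent < static_cast<int>(joint.size()); ++agent) {
        const double current = utility(agent, joint);
        JointAction deviation = joint;
        for (int a = 0; a < numActions; ++a) {
            if (a == joint[agent]) continue;
            deviation[agent] = a;
            if (utility(agent, deviation) > current) return false;  // profitable deviation
        }
    }
    return true;
}

// Enumerates all k^n joint actions and returns the first pure equilibrium found.
// Returns an empty vector if none exists (a mixed equilibrium must then exist).
JointAction findPureNash(int numAgents, int numActions, const UtilityFn& utility) {
    JointAction joint(numAgents, 0);
    while (true) {
        if (isPureNash(joint, numActions, utility)) return joint;
        int i = 0;  // advance to the next joint action, odometer-style
        while (i < numAgents && ++joint[i] == numActions) { joint[i] = 0; ++i; }
        if (i == numAgents) return {};  // the whole k^n space was exhausted
    }
}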

3.

PROBLEM SETTING

A typical action game can be described as a set of combat encounters between the PC and multiple enemy NPCs. The


NPCs usually try to move close to the PC and then make their attack. The abstract test simulation models one such encounter, omitting the final attack phase. The purpose of the simulation is therefore to observe NPC movement in a game-like situation, in which the NPCs try to get close to the PC while avoiding being exposed to the PC's possible weapons. A starting point for the simulation is described in Figure 1. The game area consists of squares. In the figure the PC is represented by 'P' and the NPCs by numbers from 1 to 3. 'X' represents a wall square and '.' represents open ground.

.............................
.............................
.............XX..............
......1.....X.....P..........
.........XXXX................
...2..XXXXX..................
.......XX....................
.............................
.............................
....X........................
....XXX..............XXXXX...
...........X..........XXXX...
.......................X.....
......................3......

Figure 1: A starting point for the test simulation.

The test simulation has a set of rules. It is divided into turns, and the NPCs decide their actions each turn. The PC doesn't move in this simulation. Each NPC has a set of five possible actions, which can be performed during a single turn: { left, right, up, down, do nothing }. A NPC cannot move through the walls, so whenever a NPC is located next to a wall, its action set is reduced by the action that would move it into the wall. The PC cannot see through walls. Two NPCs can be located in the same square, but a NPC cannot move to the square occupied by the PC.

In the test simulation all NPCs choose their actions simultaneously from their action sets. However, the Nash equilibrium search is not made by the NPCs themselves, but by a central agency. This is justified, since the chosen strategies form a Nash equilibrium: even if the computing were left to the NPCs, their individual strategy choices would still converge to a Nash equilibrium.

All "intelligence" of the NPCs is encoded in the utility function, which means that all targeted behavioral patterns must be present within it. Levy and Rosenschein [3] used a two-part utility function with the predator agents: one part of the function gives the agents payoff for minimizing the distance to the prey agent and the other half encourages encirclement by rewarding blocked prey movement directions. Following the same principle, the utility function that we use is made of terms that describe different aspects of the NPCs' goals. Each term has an additional weight multiplier, which is used to control and balance the importance of the term. We found the following terms to be the most important in guiding the NPCs' actions in the simulation: aggression, safety, balance, ambition, personal space and inertia. The final utility function is the sum of all the terms weighted with their weight multipliers. The terms are detailed below.

Aggression means the NPC's wish to get close to the PC. The term's value is −xDi, where Di is NPC i's distance from the player and x is a constant multiplier representing the growth of aggression as the NPC gets nearer the PC.

Safety represents the NPC's reluctance to take risks. When a NPC is threatened, it gets a fine. The fine amounts to −ri/k, where ri is the severity of the threat that the NPC encounters and k is the number of NPCs who are subject to the same threat. If k is zero, no NPCs are threatened, and the term is not used. In the test simulation a NPC was threatened if it was on the PC's line of sight. The divisor is used to make it safer for one NPC if several NPCs are subjected to the same threat from the PC.

Balance is the NPC's intention to stay at approximately the same distance from the PC as the other NPCs. The value of balance is −di, where di = |Di − (ΣD−i)/k|; Di denotes NPC i's distance from the PC as earlier, and (ΣD−i)/k denotes the average distance of all the other NPCs from the PC. Thus di gets bigger as the NPC's distance from the PC gets further from the average distance. The term is needed to make the NPCs try to maintain a steady and simultaneous advance towards the PC, and to prevent the NPCs from running to the PC one by one. The term is not used if there are no other NPCs in the game.

Ambition means the NPC's desire to be the first attacker towards the PC. If the NPC is not moving towards the PC, the ambition value is 0. If no NPC is going towards the PC, the ambition value is tx, where t is the number of turns in which no NPC has moved towards the PC and x is a constant.

Personal space is the amount of personal space a NPC needs. The term has value xi, which is the NPC's distance to the nearest other NPC, if the distance is below a threshold value xt. If xi > xt, the term has value xt instead of xi. This term is needed to keep the NPCs from packing together and to encourage them to go around obstacles from different sides.

Inertia is the NPC's tendency to keep to a previously selected action. The term makes the NPCs appear consistent in their actions. If the NPC is moving in the same direction as in the previous game turn, it gets a bonus x. If the NPC is moving in an orthogonal direction, it gets the bonus x/2. Inertia helps to prevent the NPCs from reverting their decisions: if ambition drives the NPCs from their hiding places, inertia keeps them from retreating instantly back into safety.

Since the utility function provides the information about which actions the NPC values, all other necessary AI functions must be implemented there. The test simulation required an implementation of the A* algorithm for obstacle avoidance: all measured distances towards the PC or other NPCs are actually shortest-path distances. Game-specific values can also be adjusted in the utility function. For instance, the game designer may want to change the game difficulty level mid-game by tweaking the game difficulty parameters [11]. These adjustments must be made within the utility function, otherwise they have no effect on NPC action selection.
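As a rough illustration of this weighted-sum structure (not the authors' implementation; all field names, weights and thresholds below are invented for the example), a candidate joint action could be scored per NPC as follows.

#include <algorithm>
#include <cmath>

// Hypothetical per-NPC view of one candidate joint action.
struct NpcSituation {
    double distToPC;           // shortest-path distance to the PC (via A*)
    double avgOtherDist;       // average shortest-path distance of the other NPCs
    double threat;             // severity of the PC's threat on this square (0 if safe)
    int    threatenedNpcs;     // how many NPCs share that threat
    double distToNearestNpc;   // shortest-path distance to the closest other NPC
    bool   movingTowardsPC;
    int    idleTurns;          // turns in which no NPC has advanced on the PC
    bool   sameDirAsLastTurn;
    bool   orthogonalToLastTurn;
};

struct Weights {
    double aggression = 1.0, safety = 1.0, balance = 1.0,
           ambition = 1.0, personalSpace = 1.0, inertia = 1.0;
    double aggressionScale = 1.0;  // the constant x of the aggression term
    double ambitionScale   = 1.0;  // the constant x of the ambition term
    double spaceThreshold  = 3.0;  // the threshold xt of the personal-space term
    double inertiaBonus    = 1.0;  // the bonus x of the inertia term
};

double npcUtility(const NpcSituation& s, const Weights& w) {
    double u = 0.0;
    // Aggression: reward for being close to the PC.
    u += w.aggression * (-w.aggressionScale * s.distToPC);
    // Safety: a shared threat is divided among the exposed NPCs.
    if (s.threatenedNpcs > 0)
        u += w.safety * (-s.threat / s.threatenedNpcs);
    // Balance: stay near the average distance of the other NPCs.
    u += w.balance * (-std::fabs(s.distToPC - s.avgOtherDist));
    // Ambition: grows while nobody is advancing on the PC.
    if (s.movingTowardsPC)
        u += w.ambition * (w.ambitionScale * s.idleTurns);
    // Personal space: capped at the threshold so the NPCs spread out.
    u += w.personalSpace * std::min(s.distToNearestNpc, w.spaceThreshold);
    // Inertia: keep to the previous direction of movement.
    if (s.sameDirAsLastTurn)         u += w.inertia * w.inertiaBonus;
    else if (s.orthogonalToLastTurn) u += w.inertia * (w.inertiaBonus / 2.0);
    return u;
}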


The simple search algorithm is deterministic by nature, and therefore the Nash equilibrium found from any given starting setup is always the same. If a mixed strategy is found, randomness follows implicitly, because the action is randomly selected from the probability distribution. However, the algorithm is biased towards small supports for efficiency reasons, and therefore tends to find pure strategies first. Pure strategies are common in the game simulation setting, since the NPCs are rarely competing directly against one another. The game designer may want to implement randomness in the utility function by adding a new term, error, which is a random value from [0, 1]. The weight multiplier can be used to adjust the error range.
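If a mixed strategy is returned, acting on it simply amounts to sampling each NPC's action from the returned probability distribution, for example (illustrative sketch, names invented):

#include <random>
#include <vector>

// Draws one action index from a mixed strategy given as a probability
// distribution over the NPC's action set.
int sampleMixedStrategy(const std::vector<double>& probabilities, std::mt19937& rng) {
    std::discrete_distribution<int> dist(probabilities.begin(), probabilities.end());
    return dist(rng);
}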

4.

RESULTS

The simulation yielded two kinds of results: the time needed to find the Nash equilibrium during a typical game turn and the NPCs' actions using the utility function described in Section 3. The simulation run began from the setting in Figure 1. Because computer game AI is only good if it gives the player a sense of immersion, the evaluation of the utility function's suitability must be done from the viewpoint of game playability. However, no large gameplay tests were organized. The utility function's goodness is approximated by visually inspecting the game setting after every game turn.

Figure 2 shows the game board on turn 3. The '+' signs represent the squares that the NPCs have been in. NPC 2 began the game by moving south, even though its distance to the PC is the same via both the northern and the southern route around the obstacle. This is due to the fact that NPCs 1 and 2 were pushed away from one another by the term personal space in the utility function.

.............................
.............................
.............XX..............
......+++1..X.....P..........
.........XXXX................
...+..XXXXX..................
...+...XX....................
...+2........................
.............................
....X........................
....XXX..............XXXXX...
...........X.........3XXXX...
.....................++X.....
......................+......

Figure 2: Turn 3. NPC 1 has begun to go around the obstacle by North and NPC 2 from South. NPC 3 has moved into cover.

Figure 3 on turn 10 has all NPCs in place to begin the final stage of the assault. None of the NPCs are on the PC's line of sight. NPCs 3 and 2 have reached the edge of the open field before NPC 1, but they have elected to wait until everyone is in position for the attack. The term safety is holding them back from attacking over the open ground.

Game turn 12 is represented in Figure 4. The NPCs have waited one turn in place and then decided to attack. Each NPC has moved simultaneously away from cover towards the PC. In a game-theoretic sense, two things might have happened. The first possibility is that the term ambition makes one NPC's utility from moving towards the PC grow so big that attacking dominates the NPC's other possible actions.

.............................
...........+++1..............
...........+.XX..............
......++++++X.....P..........
.........XXXX................
...+..XXXXX..................
...+...XX....................
...+++++2....................
.............................
....X........................
....XXX..............XXXXX...
...........X.........3XXXX...
.....................++X.....
......................+......


Figure 3: Turn 10. All members of the NPC team have arrived at the edge of the open ground.

Therefore the NPC's best course of action is to attack regardless of the other NPCs' actions. When this happens, the sanction from the term safety diminishes due to the number of visible NPCs, and it suddenly makes sense for the other NPCs to participate in the attack. The other possibility is that the algorithm for finding Nash equilibria has found the equilibrium in which all NPCs move forward before the equilibrium in which the NPCs stay put.

.............................
...........++++1.............
...........+.XX..............
......++++++X.....P..........
.........XXXX................
...+..XXXXX..................
...+...XX....................
...++++++2...................
.............................
....X........................
....XXX..............XXXXX...
...........X........3+XXXX...
.....................++X.....
......................+......

Figure 4: Turn 12. NPC team decides to attack. All NPCs move simultaneously away from cover.

The NPCs succeed in synchronizing their attack because the Nash equilibrium defines the strategies for all NPCs before the real movement. Synchronization leaves the human player with the impression of planning and communicating enemies. If the NPCs had run into the open one by one, stopping the enemies would have been much easier for the human player, and the attack of the last NPC might have seemed foolhardy after the demise of its companions.

Figure 5 shows the game board on turn 21. NPCs 1 and 3 have moved next to the PC. NPC 2 has found new cover and has decided not to move forward again. The situation seems erroneous, and it is true that NPC 2's actions seem to undermine the efficiency of the attack. However, this can be interpreted to show that NPC 2 has reexamined the situation and decided to stay behind for its own safety. One way to resolve the situation is to change the term safety to lessen the threat value if at least one NPC is already in hand-to-hand combat with the PC.


.............................
...........++++++++..........
...........+.XX...1..........
......++++++X.....P..........
.........XXXX.....3..........
...+..XXXXX.......+..........
...+...XX2........+..........
...+++++++........+..........
..................+..........
....X.............+..........
....XXX...........+..XXXXX...
...........X......++++XXXX...
.....................++X.....
......................+......

Figure 5: Turn 21. NPCs 1 and 3 have reached the PC. NPC 2 has gone into hiding once more.

Creating a usable utility function is quite straightforward if the agents' goals can be determined. Balancing the terms to produce the desired behavior patterns in different game settings can be more time-consuming. Each video game needs a specific utility function for its NPCs, since, for instance, the distance measurements are done in different units. Also, the game designer may want to introduce different behavior for different NPC types, which is done by creating a utility function for each NPC class within the video game.

The test simulation used the McKelvey et al. [5] implementation of the previously mentioned simple search algorithm. The algorithm's time complexity limits the simulation's feasibility when the number of agents in the game grows larger. We measured the time needed for finding a single Nash equilibrium on a Macintosh computer equipped with a 2.1 GHz processor. The extra calculations in the utility function (such as the A* algorithm) were left out. The needed time was measured using the three-agent setting described in Figure 1 and the previously detailed utility function. The measurements were also made using six NPCs, whose starting positions are detailed in Figure 6. Both games were run for 25 turns, and the experiment was repeated ten times. The games therefore had 250 data points each. The test results are presented in Table 1.

Agents   min.     max.      avg.     median
3        42 ms    225 ms    71 ms    62 ms
6        97 ms    2760 ms   254 ms   146 ms

Table 1: Experimental results for the time needed to find the first Nash equilibrium in the attack game.

The results show that in a three-player game the Nash equilibrium was found on average within 71 milliseconds. This can still be feasible for real video games. In six-player games the worst calculation took almost three seconds, which is far too long for action games. Still, the median in the six-player game was only 146 milliseconds, which may still be acceptable. In both games a mixed strategy was never needed:

the Nash equilibria were always found in supports of size 1. The worst times were measured in the first turn. When a NPC had several dominated actions or was next to a wall, the search was faster, because the search space was reduced.

.............................
.............................
.............XX..P...........
......1.....X................
.........XXXX................
...2..XXXXX...............5..
.......XX....................
.................4...........
.............................
....X........................
....XXX..............XXXXX...
...........X..........XXXX...
....6..................X.....
......................3......

Figure 6: The starting point for the six-player simulation.

5.

CONCLUSION

Using game theoretic methods in action games seems to be a promising new approach for creating more plausible NPC group AI. If the agents try to find a Nash equilibrium, their individual decisions are rational. If the agents' utility functions are designed to favor cooperative behavior patterns, the group appears to function with a degree of cooperation. Having agents make their decisions based on their internal valuations helps them maintain intelligent-looking behavior in situations where a scripted approach would lead to unsatisfactory results. This might make the game designer's work easier, since all encounters between the PC and a NPC group need not be planned in advance. The problem with this approach is the time complexity of finding Nash equilibria. Finding one equilibrium is difficult, and finding all equilibria is prohibitively expensive. Still, our results indicate that modern computers with a good heuristic might be able to find one Nash equilibrium relatively fast, especially if the NPCs' action sets are limited and the number of NPCs is small. If a Nash equilibrium cannot be found soon enough, it is up to the game designer to decide the fallback mechanism. The scripted approach is one possibility, and another is to use a greedy algorithm based on the utility functions. A greedy algorithm would not waste time on calculating the possible future actions of the other NPCs, but would assume that the NPCs stayed idle or continued with an action similar to that of the previous game round. In the future we wish to incorporate the demonstrated method into a real action game instead of a simulation. Having real human players witness the change of behavior in the NPCs will help us measure how the different NPC AI schemes affect the overall playing experience.

6.

REFERENCES

[1] V. Conitzer and T. Sandholm. Complexity results about Nash equilibria. In Proceedings of the 18th International Joint Conference on Artificial Intelligence (IJCAI-03), pages 765–771, 2003.


[2] P. K. Dutta. Strategies and Games: Theory and Practice. MIT Press, 1999. [3] R. Levy and J. S. Rosenschein. A game theoretic approach to the pursuit problem. In Proceedings of the Eleventh International Workshop on Distributed Artificial Intelligence, pages 195–213, Glen Arbor, Michigan, February 1992. [4] R. D. McKelvey and A. M. McLennan. Computation of equilibria in finite games. In H. Amman, D. Kendrick, and J. Rust, editors, Handbook of Computational Economics, volume 1, pages 87–142. Elsevier, 1996. [5] R. D. McKelvey, A. M. McLennan, and T. L. Turocy. Gambit: Software tools for game theory, version 0.2006.01.20, 2006. [6] A. Nareyek. Review: Intelligent agents for computer games. In T. A. Marsland and I. Frank, editors, Computers and Games, volume 2063 of Lecture Notes in Computer Science, pages 414–422. Springer, 2000. [7] J. Nash. Equilibrium points in n-player games. In Proceedings Of NAS 36, 1950. [8] C. H. Papadimitriou and T. Roughgarden. Computing equilibria in multi-player games. In SODA, pages 82–91. SIAM, 2005. [9] R. Porter, E. Nudelman, and Y. Shoham. Simple search methods for finding a Nash equilibrium. In D. L. McGuinness and G. Ferguson, editors, AAAI, pages 664–669. AAAI Press / The MIT Press, 2004. [10] S. Rabin. Common game AI techniques. In S. Rabin, editor, AI Game Programming Wisdom, volume 2. Charles River Media, 2003. [11] P. Spronck, I. G. Sprinkhuizen-Kuyper, and E. O. Postma. On-line adaptation of game opponent AI with dynamic scripting. Int. J. Intell. Games & Simulation, 3(1):45–53, 2004. [12] M. Tambe. Towards flexible teamwork. J. Artif. Intell. Res. (JAIR), 7:83–124, 1997.

Gophers: Socially Oriented Pervasive Gaming Sean Casey, Duncan Rowland Lincoln Social Computing Research Centre Department of Computing and Informatics University of Lincoln, Brayford Pool, Lincoln, LN6 7TS, UK

[scasey|drowland]@lincoln.ac.uk

ABSTRACT Gophers is an open-ended gaming environment which relies on location data, user generated content and player interactions to shape gameplay. It seeks to investigate social collaboration within localised and distributed gaming communities, the potential of pervasive gaming as a technique to collect useful geospatial data about the physical world and, additionally, the use of novel peer-judging methods to allow self-governing of the game world. In this paper, we introduce the game in its current state and provide an overview of early test results.

The game aims to investigate the collection of useful geospatial data through pervasive gameplay, storing both geographically tagged textual and photographic data. Additionally, the research will assess the possibility of using game agents to complete real-world tasks in a collaborative way and suggests peer-judging methods which can be used to determine the success of these tasks. The study presented does not attempt to reach any definitive conclusions and instead introduces an ongoing project, which ultimately aims to make important contributions to these research areas.

Categories and Subject Descriptors H.5.3 [Information Interfaces and Presentation]: Group and Organisation Interfaces – computer-supported cooperative work. I.2.6 [Artificial Intelligence]: Learning – knowledge acquisition. K.8.0 [Computing Milieux]: Personal Computing – games.

2. PLAYER EXPERIENCE The Gophers experience revolves around gopher agents that have been given a specific task or mission to perform. A gopher is best considered as a physical item which can be found, picked up and dropped. When not residing on a cell-phone, these entities live at an approximate location in the real world, denoted by a cell mast coverage area. They remain at this location until being picked up or summoned.

General Terms Design, Experimentation, Human Factors.

Keywords Pervasive, social, collaborative, mobile gaming, context, location, cell-id.

1. INTRODUCTION Gophers explores the use of the mobile phone for providing a socially-oriented entertainment experience. The game is based on the Hitchers [6] platform, originally written as the basis of a game created at the University of Nottingham that utilised cell-ID location data to provide a digital hitchhiking experience. Hitchers introduced the notion of beings (hitchers) that jump onto a player's phone, prompt the player to answer their question and then leave, using location information based upon cell mast id. Gophers extends this model to incorporate task-based gameplay, where agents (more personable gopher creatures) are created and given real-world tasks to perform. These tasks must be completed through the collaborative help of other players. The system offers players a number of engaging and varied interaction methods, through which they can communicate with the gophers. Furthermore, it adds a scoring/jury system based upon player performance, the ability to scale between large and small gaming groups and the notion of cyclic agent lifecycles.

Figure 1. A player pauses to photograph their situation

The gameplay goal for participating players is to score as many points as possible. There is a publicly accessible Gophers website displaying a live leader-board, through which players can rank their performance against others. An in-game economy, based on points, encourages players to engage in the game. Players need to accumulate enough points to create their own gophers. Points are awarded for the following actions: creating a gopher that goes on to complete its mission, helping someone else's gopher in its mission, playing the Gopher Guessing Game, participating in Jury Service, and moving around the physical world. Two other interactions with Gophers are also


possible, specifically providing gophers with gossip and supplying images of their current surroundings. These result in an engaging dialogue with the gopher (rather than points) that is, in itself a reward. Below, player-gopher interactions are described in further detail:

Besides containing information collected during the current task, blogs from previously assigned tasks are also available. It was observed in testing that players became attached to gophers and wanted them to exist after task completion; the cyclic lifecycle detailed in Figure 2, allows for such gopher longevity.

Gopher Creation: A player creates a Gopher by paying a set number of points and providing it with a name, image (what the agent looks like) and task for it to accomplish. The task is a statement assigned to the gopher, which must be satisfied through the help of player interactions. Although the player decides the nature of the tasks, completion of them will rely on the mediums of photos, text input and location data.

3.1 User Generated Content A strong theme when designing the game was the utilisation of user generated content. Gophers defines the environment and constraints within which the game is played, whilst users supply the narrative and game content – a truly open-ended approach, which is less expensive and restrictive than the use of predefined game content and promotes game ownership by players. Nevertheless, it is important that users do contribute this content, to ensure interesting gameplay; tests indicate that players are more than willing to supply this.

Guessing Game: Gophers can play a location-sensitive word-guessing game with their hosts, where players provide a word describing their current location. If the word has previously been entered near their location, the player and all other players who entered the word will receive points.

3.2 Useful Data from Play

Photo Album: Allows user to take photos using their phone and supply them to the Gopher. Photos are tagged with the user’s current location and held in the gopher’s blog. The gopher responds by showing a photo it has previously collected in a nearby location.

Projects such as The ESP Game [1], Peekaboom [2] and more recently Google’s Image Labeller [7] have sparked mainstream interest in harnessing games for useful data collection. By design, Gophers aims to collect image and tag locations. Throughout a gopher’s lifetime it collects a wealth of potentially useful data, in the form of geospatially and temporally linked photographs, descriptive tags and textual information. A visualisation tool to graphically display this data is under development (see Figure 3). The cyclic gopher lifecycle ensures this collection of knowledge is not lost after a mission is completed and instead, continues to grow with the game. This provides interesting content for the blogs and leads to a more engaging Gossip/Photo and Jury Service dialogue.

Gopher Gossip: A user can also pass a piece of gossip to the gopher (a single sentence). As with photos, the new gossip is location tagged and the gopher replies with a historical piece of gossip acquired nearby. Gopher Assist: A gopher’s task is completed in stages by (ideally several) players interacting with the agent and providing information related to the mission. Each time they can help, players supply the gopher with empirical evidence through an interaction (i.e. Gossip or a Photo). When a player believes the Gopher has fully completed its task, they submit it to be judged by Jury Service. Jury Service: Players are selected for Jury Service to determine if a Gopher has completed its task and to reward participants with points. A player can determine the status of a gopher’s task by browsing the gopher’s online blog. Walkabout: A final method whereby players can gain points is running the game in walkabout mode. In this non-interactive game phase, points are accrued every time the phone moves into a new cell-id coverage area.

3. KEY DESIGN DECISIONS

Figure 3. Visualisation of geospatial data

Throughout its life a gopher keeps track of any information it receives in the form of a blog. This can be interpreted as a storyboard that evolves as the gopher’s narrative progresses.

3.3 Technicalities of Guessing Game The game design is based upon The ESP Game [1]. A player receives points for matching any previously guessed words within 5 hops of their cell-id (in a similar manner to Hitchers' Adaptive Search Method [6]). Points are awarded in a 'dartboard' style, where the number awarded relates to the proximity of the matched word to the current location and the number of times the word has been matched (the more a word is matched, the fewer points are awarded). This diminishing point system encourages both original and precise guessing. The game also employs anti-cheating mechanisms, which firstly prevent players guessing the same word in a similar location and secondly discourage players 'pairing up' to tactically enter the same guesses.
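A minimal sketch of such a dartboard scoring rule (the constants and the function name are invented for illustration, not taken from the game):

#include <algorithm>

// Illustrative dartboard-style score for a matched guess: the closer the
// previous guess (in cell-mast hops) and the fewer times the word has been
// matched before, the more points are awarded.
int guessScore(int hopsAway, int timesMatchedBefore) {
    const int maxHops = 5;    // matches are only considered within 5 hops
    const int bullseye = 50;  // score for a fresh word matched in the same cell
    if (hopsAway > maxHops) return 0;
    int proximity  = bullseye - hopsAway * 10;              // fewer points per hop away
    int diminished = proximity / (1 + timesMatchedBefore);  // diminishing returns
    return std::max(diminished, 1);                         // every valid match scores
}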

Figure 2. The cyclic gopher lifecycle: create gopher and assign task → player-gopher interactions → task complete → task assessment and points awarded → retask.


3.4 Task Assessment through Jury Service Determining the success and perceived difficulty of a gopher’s mission is a highly subjective matter. Resolving this is achieved through Jury Service. Jury service can be seen as a method of letting the gaming community self-orchestrate and moderate gameplay. A number of users (the last 5% to participate) are selected to act as a panel of jurors. They are instructed to visit a webpage, where the trial is held over a 24 hour period. After logging into the page, each juror is presented with the gopher’s blog and must independently decide whether the mission was completed, how difficult they perceived it to be and rank the users who helped the gopher most. Once the trial is finished, the gopher’s creator is awarded credits relating to the mission difficulty and users who helped in the completion of the task are awarded credits relative to their juried ranking. To encourage participation and accuracy, judges are given credit for the consistency of their answers.

Figure 4. Finding a gopher and viewing its task

4. TECHNICAL DETAILS There is an emerging trend in pervasive games which make use of location data to draw the physical and digital worlds yet closer. Location-specific data has been used for a variety of purposes, such as determining relative and absolute locations [4], triggering gameplay events [5] and superimposing gaming environments over the real world [9]. In the case of gophers, location data is harnessed to determine relative world location between game items.

3.5 Retask/Recycle Once assessment is complete, a gopher returns to the player who originally created it, for retasking. This process allows the player to assign a new task to the gopher and re-release it. Retasking can also be initiated prematurely if the player wishes to alter the assigned task (for example, if a gopher reaches a point where no players can complete its task). A gopher retains all knowledge, previous tasks and the results/points of those tasks, all of which are viewable in its blog.

Numerous games, such as Savannah [4], employ GPS to provide location data. GPS is an established system for high accuracy location data, but unusable for use indoors or in dense city environments. Furthermore, purchasing GPS modules for mobile phones is prohibitively expensive and in our situation, unnecessary. We elect to use mobile phone cell mast positioning, since only coarse location accuracy is required and it offers a cost-free solution which is available on mobile phones without modification.

3.6 Retire When a player tires of one of their created gophers and does not wish to retask it, they are able to retire it. The gopher is removed from the game, but its blogs remain for reference.

4.1 Cell Masts for Location Positioning and Distance Estimation

3.7 Gopher Location

Each cell mast has a unique id. The mobile phone (Nokia series 60 in our case) is able to freely read the identifier of the mast to which it is currently connected [6],[8]. This unique identifier can be used to associate gophers, game events and data to a physical location. Additionally, as players move around in the physical world and encounter new masts, they contribute to a dynamically evolving server side graph of interconnected cell masts. Through analysis of this graph, it is possible to approximate the number of hops (and therefore relative distance) between locations.
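A minimal sketch of how hop counts might be estimated from such a graph (illustrative types and names, not the Hitchers implementation):

#include <map>
#include <queue>
#include <set>
#include <string>

// Undirected graph of cell-mast ids, built up as players move between masts.
using CellGraph = std::map<std::string, std::set<std::string>>;

// Breadth-first search over the mast graph; returns the hop count between two
// cells, or -1 if they lie on disconnected parts of the graph.
int hopsBetween(const CellGraph& graph, const std::string& from, const std::string& to) {
    if (from == to) return 0;
    std::map<std::string, int> dist{{from, 0}};
    std::queue<std::string> frontier;
    frontier.push(from);
    while (!frontier.empty()) {
        std::string cell = frontier.front();
        frontier.pop();
        auto it = graph.find(cell);
        if (it == graph.end()) continue;
        for (const std::string& next : it->second) {
            if (dist.count(next)) continue;
            dist[next] = dist[cell] + 1;
            if (next == to) return dist[next];
            frontier.push(next);
        }
    }
    return -1;  // physically isolated gaming communities
}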

As players move, the Hitchers platform keeps track of cell-ids that are visited. Walkabout mode encourages player movement and in turn promotes the building of the node graph containing cellphone mast ids (described in Technical Details).

3.8 Gopher Movement Gophers can be found via a location-aware search program. The search returns a list ordered with respect to distance, measured in terms of the number of network hops to each gopher. This distance is reflected as an element of gameplay; when summoned, a more distant gopher will take longer and be more costly to pick up than a nearby one. The time taken is (10 minutes x number of hops) for connected cells, or 1 day for disconnected.
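Under that rule, the summon delay could be computed roughly as follows (illustrative sketch, reusing a hop-count estimate such as the breadth-first search sketched earlier):

#include <chrono>

// Summon delay: 10 minutes per hop for connected cells, or a full day when
// the cells lie on disconnected parts of the mast graph (hops < 0).
std::chrono::minutes summonDelay(int hops) {
    if (hops < 0) return std::chrono::hours(24);
    return std::chrono::minutes(10 * hops);
}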

5. PRELIMINARY TEST AND DISCUSSION A preliminary proof of concept 4 day technical test was run between the 20th and 23rd June, where an early release of the game was trialled on various models of Nokia Series 60 phone. These were distributed to both technical and non-technical players. The exercise was designed as an initial test to assess usability of interface, effectiveness of gameplay mechanics and reliability of the software. Prior to the trial, players were given a set of instructions and short introduction to playing the game. Each was given 500 points to create gophers and some introductory gophers were artificially created, to stimulate the gameplay. Afterwards, players were interviewed and questionnaires completed according to the responses. Through analysis of questionnaires, a number of points of interest were noted:

This presents the player with a trade-off between searching for nearby gophers, or being prepared to wait for a more desirable gopher to be transported. The key to this technique is that it allows players from two physically isolated gaming communities (on disconnected network graphs) to interact, and it can scale between sparse and dense player distributions whilst retaining a concept of distance. Through this, massively cross-cultural games could become possible. Once acquired, a player can interact with the gopher and help complete its tasks. When finished with the gopher, the player drops it off at the phone's real-world location, where it remains dormant until being summoned by another user.


(i) Technical Issues: Technical bugs existed, which caused crashing and unexpected behaviour at certain points.


(ii) Hoarding Gophers: Some players held large collections of gophers on their phones and would either forget about them, or be reluctant to drop them. This prevents gophers completing missions and leads to less interesting mission logs and a reduced collection of knowledge. Restricting the number of gophers a player can hold and the introduction of a 'boredom threshold' are intended to reduce this.

Mast flipping [6] (the phenomenon of a phone switching between overlapping cells) affected operation of game elements, particularly the guessing game. Originally, the game was designed to match players responses to words guessed previously in the current cell. Flipping meant that two players could guess in the same location but be connected to different masts. This led to some frustrating results in the initial tests and the game was redesigned around this seam, utilising the revised dartboard method (described in Game Design). This introduces an element of luck into the game, with players attempting to overcome the flipping and score a ‘bullseye’.

(iii) Lack of Feedback: If a player received no feedback or reward, they were less interested in utilising certain game elements, in particular the guessing game (guesses were rarely matched as the word database was too small) and gossip (insufficient gossip was collected during a gopher's lifetime to provide the player with feedback). This led to an overhaul of the guessing game, which previously only considered words at the current cell-id, and to the concept of retasking gophers to allow for a larger knowledge base.

6. CONCLUSION Full scale trials of the revised game are now under preparation. The first will take place in and around the city of Lincoln over the coming weeks. This will assess the gameplay mechanics and data collection possibilities on a larger scale. Further tests are planned in which the game will be simultaneously trialled over a 2 week period in various cities around the country and rolled-out to the general public via a web download. This will allow the game to reach a larger, more widely distributed user base, where the gameplay is designed to naturally flourish.

(iv) Gameplay Concepts: Some players did not understand the concept of dropping gophers; responses such as “I don’t want someone taking credit for my gopher…” and “Do I lose a gopher when dropping it?” were noted. (v) Social Collaboration: Playing gophers increased social collaboration outside of the game. One player and her boyfriend played the game together for example, while another discussed game tactics with a group of friends.

6.1 Workshop Outcomes

(vi) Need for Complex Tasks: Tasks assigned during testing varied broadly, from entertaining challenges to more meaningful data-collecting missions. Examples included "Collect photos of dogs before Thursday" and "Take photos from high buildings". Users found the creation of tasks to be one of the most enticing gameplay aspects. Originally, these were restricted to a single-line descriptor, which resulted in some slightly ambiguous descriptions and constrained the variety and complexity of conceivable tasks. This led us to allow for more intricate tasks (possibly requiring some 'detective work') by enabling users to define sets of subtasks or steps.

In participating in the workshop, we hope to ascertain future directions and uses for the research findings. It will also be worthwhile to conduct a critique of the gameplay mechanics. If time and facilities permit, the workshop would also provide the ideal setting for a small-scale trial of the game.

7. ACKNOWLEDGEMENTS This work was made possible through the Hitchers framework developed at the University of Nottingham’s Mixed Reality Lab. We thank Prof. Steve Benford and Dr Adam Drozd for their advice during development and for the loan of the lab's phones.

(vii) Evidence of Cheating: Opportunistic players will use any method to win! There was evidence of players copying guessed words from blogs to succeed in the guessing game.

8. REFERENCES [1] von Ahn, L., Dabbish, L. Labelling Images with a Computer Game. In Proceedings of the ACM CHI 2004, (Vienna, Austria). ACM Press, 2004, 319-326.

These issues are being used to inform the on-going design and development of Gophers and further studies are currently being prepared.

[2] von Ahn, L., Liu, R., Blum, M. Peekaboom: A Game for Locating Objects in Images. In Proceedings of the SIGCHI conference on Human factors in computing systems (CHI ’06) (Montréal, Québec, Canada). ACM Press, 2006, 55-64.

5.1 Reflection on the Seams in Hitchers and Gophers

[3] Bell, M., Chalmers, M., Barkhuus, L., Hall, M., Sherwoord, S., Tennent, P., Brown, B., Rowland, D., Benford, S. Interweaving Mobile Games with Everyday Life. In Proceedings of the SIGCHI conference on Human factors in computing systems (CHI ’06) (Montréal, Québec, Canada). ACM Press, 2006, 417-426.

The use of cell-ids appears to be an unrealistic measure of travel time; rural areas only receive sparse coverage of mobile phone masts, whereas urban areas are densely packed with transmitters [8]. Alternatively, this could be interpreted as an exploitable seam [3],[6], where urban gophers take longer to travel the same physical distance than country-dwelling gophers. This is analogous to the real world, where travelling around congested city streets is a far more time-consuming process than cross-country travel.

[4] Benford, S., Rowland, D., Flintham, M., Drozd, A., Hull, R., Reid, J., Morrison, J., Facer, K. Life on the Edge: Supporting Collaboration in Location-Based Experiences. In Proceedings of the SIGCHI conference on Human factors in

Further technological seams were revealed through testing; it emerged that by turning their phone off, a player is able to capture their gophers in a state of hibernation. Their collection of gophers is retained on their phone, without them becoming bored and escaping.


computing systems (CHI ’05) (Portland, Oregon, USA). ACM Press, 2005, 721-730.

[8] LaMarca, A., Chawathe, Y., Consolvo, S., Hightower, J., Smith, I., Scott, J., Sohn, T., Howard, J., Hughes, J., Potter, F., Tabert, J., Powledge, P., Borriello, G., Schilit, B. Place Lab: Device Positioning Using Radio Beacons in the Wild. In Proceedings of Pervasive 2005, (Munich, Germany). Springer Verlag, 2005.

[5] Bjork S., Falk J., Hansson R., Ljungstrand P. Pirates! Using the Physical World as a Game Board. In Proceedings of Interact 2001 IFIP TC.13 Conference on Human-Computer Interaction, (Tokyo, Japan). IOS Press, 2001. [6] Drozd, A., Benford, S., Tandavanitj, N., Wright, M., Chamberlain, A. Hitchers: Designing for Cellular Positioning. To Appear in Proceedings of Ubicomp, 2006.

[9] Piekarski, W., Thomas, B. ARQuake: the outdoor augmented reality gaming system. In Communications of the ACM 45(1), 2002, 36-38.

[7] Google Image Labeller. http://images.google.com/imagelabeler/.


Creating an AI-Test Platform Benoit Chaperot

Colin Fyfe

School of Computing The University of Paisley Scotland

School of Computing The University of Paisley Scotland

[email protected]

[email protected]

ABSTRACT In this paper we explain changes made to a motocross game to allow other researchers to develop and experiment with their own artificial intelligence techniques in the game. The game is split between an executable and AI DLLs.

Keywords Motocross game, software architecture, Artificial Intelligence, Dynamic Link Library, AI, DLL

1.

INTRODUCTION

Motocross The Force is a motocross game featuring terrain rendering and rigid body simulation applied to bikes and characters. In [2] and [3] we have investigated the use of artificial neural networks (ANNs) to ride simulated motorbikes in this computer game. The game offers a very good environment to experiment with advanced artificial intelligence and data mining techniques; although the control of the bike is assisted by the game engine, turning the bike, accelerating, braking and jumping on the bumps involve complex behaviours which are difficult to express as a set of procedural rules, and this makes the use of advanced AI techniques very appropriate. Different kinds of ANNs have been tested to control the motorbikes: feed-forward multi-layered perceptrons, radial basis function networks and self-organising maps. While experimenting with these different kinds of networks, some difficulties were noticed:

• The AI source code is in the middle of the game engine code; this makes it difficult to read and maintain; many files have to be modified to change the kind of AI in use in the game; files cannot be swapped between researchers due to intellectual property issues.

• Only one kind of AI can be tested at a given time.

Direct comparison between different kinds of AI is not possible. In the research community, it often happens that one researcher presents a new algorithm, AI or data mining technique in the context of one particular problem or application. The applications used are often very different. It appears that there is a need in the research community for common platforms to evaluate and benchmark individual AI techniques. This has been discussed at the CIG06 conference in Reno, USA. For data mining it is common practice to use standard datasets to evaluate and compare different classification techniques (see for example [4]). For video games, there seems to be a need for more common platforms to compare AI techniques. Splitting the motocross game between the game engine on one side and the AI on the other side allows for easier AI code maintenance and implementation, and changes the game into a common open platform for many developers and researchers to test and compare different techniques. In this paper we detail how we have split the game, and how to implement new AI for the game.

2.

CHANGES TO THE ARCHITECTURE

The game has been split between the game engine on one side and the AI on the other side, the changes in architecture are detailed below.

2.1

The Original Architecture


Figure 1: Original game architecture.

1. The game is composed of one executable and some data files.


2. The AI source code is in the middle of the game engine code; this makes it difficult to read and maintain.

3. There are some intellectual property issues in that only the authors can implement an AI for the game.

4. Only one kind of AI can be used at a time in the game to control motorbikes. Each motorbike can use its own data file. It is not possible to directly compare AI techniques.

2.2

The New Architecture

Figure 2: New game architecture.

Figure 3: Screen shot taken from the game; the white crosses on the left hand side represent WayPoints.

1. The game is made of one executable, some DLLs and some data files. DLL stands for Dynamic Link Library. Typically a DLL provides one or more particular functions, and a program accesses the functions by creating either a static or dynamic link to the DLL. In the context of the new architecture for the motocross game, each DLL implements one kind of AI and is linked dynamically.

2. The AI source code is separated from the game engine code, with one small DLL project per AI; this makes it easy to read, implement and maintain.

3. The AI part is separated from the game engine and is open source. Anyone can implement an AI for the game.

4. Many different kinds of AI can be used at a time in the game to control motorbikes. Each motorbike can use its own DLL file and its own data file. It is possible and easy to directly compare AI techniques.

3.3

Game Engine

There is one game engine, it is implemented by the executable. It is everything but the AI, and is responsible for updating the simulation.

3.4

AI

There is one AI per computer controlled bike. Each AI can be written to or read from an AI data file. Each AI makes use of an AI DLL. More than one bike can share the same DLL. An AI is trained to make a decision given a situation.

3.5

Situation


The situation is the general state of the bike, position, orientation and velocity relative to the ground.

4. Many kinds of different AI’s can be used at a time in the game to control motorbikes. Each motorbike can use its own DLL file and its own data file. It is possible and easy to directly compare AI techniques.

These are commands and are the same as the controls for the human player:

3.

GAME CLASSES AND STRUCTURES

Before implementing AI for the game, it is important that the user has a good understanding of the various classes and structures in use in the game.

3.1

WayPoint

WayPoints are markers positioned on the centre of the track, every metre along the track, and are used: • To give course information to computer controlled bikes, i.e. position, direction and width of the track. • To monitor the performance: a bonus can be given to a computer controlled bike for passing a WayPoint.


Decision

• Accelerate, brake. • Turn left, right. • Lean forward, backward.

3.7

Track

A track is a course over which races are run. Typically a track is of variable width along its course. The tracks are marked using WayPoints.

3.2

3.6

SampleData

SampleData is the main structure used for communication between the game engine (executable) and the AI DLL. Typically the game engine fills the situation fields of the structure and pass the structure to the AI DLL; the AI DLL fills the decision fields of the structure, given the situation, and passes it back to the game engine; the game engine updates the state of the corresponding computer controlled bike and the simulation accordingly.

3.8

Training Set

A training set is a structure used for the training of AI. A training set is a collection of SampleData’s made from the recording of a human player playing the game. Each sample contains a situation and the corresponding human player’s

decision. AI’s are trained to make the same decision as the human player, given a situation.

3.9

Training Functions

Terrain

Structure used to give ground height information.

3.10


Weight

A Weight is an AI parameter that is to be optimised using for example Genetic Algorithms. Typically a weight is a connection strength between two neurons in an ANN.

3.11

Genetic Algorithms

Training is considered as an optimisation problem. GAs are used to improve the AI's performance by modifying the AI weights. The GAs are implemented by the executable.
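A sketch of one possible GA step over such weight vectors (illustrative only; selection and fitness evaluation, which depend on race performance, are left to the executable):

#include <cstddef>
#include <random>
#include <vector>

// One child weight vector from two parents: uniform crossover followed by
// occasional Gaussian mutation. The weights are the values exchanged through
// the Put Weights / Get Weights functions of the AI DLL interface.
std::vector<float> crossoverAndMutate(const std::vector<float>& parentA,
                                      const std::vector<float>& parentB,
                                      std::mt19937& rng,
                                      float mutationRate = 0.05f) {
    std::uniform_real_distribution<float> coin(0.0f, 1.0f);
    std::normal_distribution<float> noise(0.0f, 0.1f);
    std::vector<float> child(parentA.size());
    for (std::size_t i = 0; i < child.size(); ++i) {
        child[i] = coin(rng) < 0.5f ? parentA[i] : parentB[i];  // uniform crossover
        if (coin(rng) < mutationRate) child[i] += noise(rng);   // Gaussian mutation
    }
    return child;
}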

4.

4.1.2

These functions are typically used for training the AI using back propagation techniques.

• Generate AI From Training Set: this function is called every time the user wants an AI to learn from a training set. The DLL loads and processes the training set.

• Generate AI From Training Set Update: this function is called once per game update: the DLL updates the training or generation of an AI from a training set; typically there are 25 backpropagation iterations per game update and 100 game updates per second (see the sketch after this list).

• Is Generating AI From Training Set: returns true if the AI is currently training from a training set.
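To illustrate how training can be spread across game updates in this way, the following sketch (invented types and names, not the game's code) performs a fixed number of backpropagation iterations per call so the simulation stays interactive:

#include <cstddef>
#include <vector>

// Minimal stand-ins for the game's structures; field names are invented.
struct SampleData {
    std::vector<float> situation;  // bike state relative to track and ground
    std::vector<float> decision;   // accelerate/brake, turn, lean commands
};
using TrainingSet = std::vector<SampleData>;

// Any network exposing one backpropagation step can be plugged in here.
struct Network {
    virtual void backpropagate(const std::vector<float>& in,
                               const std::vector<float>& target) = 0;
    virtual ~Network() = default;
};

// Called once per game update (about 100 times per second); each call runs a
// small, fixed number of training iterations and then yields to the engine.
class TrainingSession {
public:
    TrainingSession(Network& net, const TrainingSet& data) : net_(net), data_(data) {}
    void updateTraining(int iterationsPerUpdate = 25) {
        if (data_.empty()) return;
        for (int i = 0; i < iterationsPerUpdate; ++i) {
            const SampleData& s = data_[cursor_ % data_.size()];
            net_.backpropagate(s.situation, s.decision);
            ++cursor_;
        }
    }
private:
    Network& net_;
    const TrainingSet& data_;
    std::size_t cursor_ = 0;
};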

4.1.3

IMPLEMENTING NEW AI

Weights Functions

These functions are typically used for training the AI using genetic algorithm techniques.

• Put Weights

• Get Weights

• Set Generation

• Get Generation

• Save Weights

• Get Number Of Weights

Figure 4: Communication between the game engine executable and the AI DLL.

An important feature of the architecture is that the executable calls DLL functions, for example for decision making, but the AI DLLs can also call executable functions, for example to obtain more information about a situation before making a decision. It is a two-way communication process.

4.1

DLL Functions

In order to be recognised by the game executable each AI DLL must be placed in a particular folder and implement a set of functions; these functions can be grouped into categories.

4.1.1

General Operation Functions

The general operation functions are:

• Creation: this function creates an AI and returns a void pointer to the newly created AI to the executable.

• Destruction

• Decision Making: this function returns a decision to the executable, given a situation.

• Render: this function is called every time the game is rendered; this gives the AI DLL the opportunity to display AI information; this is mainly used for debugging purposes.

After AI creation, the executable keeps a void pointer to the AI and passes it as a parameter to all subsequent calls to the AI DLL, as illustrated in the sketch below.
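As an illustration of what such an exported interface might look like (the signatures and names below are guesses for the sketch, not the platform's actual header):

// Hypothetical plain-C interface exported by an AI DLL; the executable keeps
// the void pointer returned by CreateAI and passes it back on every later call.
struct SampleData;  // situation fields filled by the engine, decision fields by the AI

extern "C" {
    void* CreateAI();                                   // allocate an AI instance
    void  DestroyAI(void* ai);                          // free it
    void  MakeDecision(void* ai, SampleData* sample);   // read situation, write decision
    void  Render(void* ai);                             // optional debug drawing
}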



4.1.4

Version Functions

• Get AI Name: returns AI DLL name, one name per AI DLL. • Get AI Version: returns version of AI implemented. • Get Debug: returns true if this is a Debug version of AI DLL.

All these functions were found useful to carry out our experiments. To make the architecture simpler, an AI DLL is required to implement all these functions in order to be recognised by the executable. If some functions are not needed (for example the user does not want to use GA), then the user can simply create empty functions. The executable and AI DLL’s make use of the DirectX library for vector and matrix structures and operations.

4.2

Executable Functions

The executable makes the following functions available to AI DLL’s.

4.2.1

WayPoint Functions

AI DLL’s can make calls to the executable to obtain the following interpolated information about WayPoints: • Transformation matrix • Position

• Direction


• Width

The information is interpolated between WayPoints. The functions take two parameters: a WayPoint index, with one WayPoint every metre along the track, and a distance in metres along the track from this WayPoint (see the sketch below).
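For illustration, an AI DLL could use such queries to look ahead along the track; the signatures and the vector type below are assumptions for the sketch, not the real API:

struct Vector3 { float x, y, z; };

extern "C" Vector3 GetWayPointPosition(int wayPointIndex, float metresAlong);
extern "C" Vector3 GetWayPointDirection(int wayPointIndex, float metresAlong);
extern "C" float   GetWayPointWidth(int wayPointIndex, float metresAlong);

// Dot product of the current and look-ahead track directions: close to 1 on a
// straight, smaller when a turn is coming up.
float upcomingTurnSharpness(int nearestWayPoint, float lookAheadMetres) {
    Vector3 here  = GetWayPointDirection(nearestWayPoint, 0.0f);
    Vector3 ahead = GetWayPointDirection(nearestWayPoint, lookAheadMetres);
    return here.x * ahead.x + here.y * ahead.y + here.z * ahead.z;
}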

4.2.2

Drawing Functions

AI DLL’s can make calls to the executable to draw the following kind of primitives on the screen: • 3D Vertices • 3D Lines

5.

ILLUSTRATIVE RESULTS

It was feared that this new AI system would run slower than the original AI system because of the high number of calls between the executable and the DLL’s. It actually runs faster, despite the high number of calls between the executable and the DLL’s, because the AI code is now tidier and smaller than before. Two different kinds of AI have been implemented using the new architecture, feed forward multi layered perceptrons (MLP), and self organising map (SOM). The source code is inspired by [1]. Source code and a free version of the game will be made available at the following address:

• Text http://cis.paisley.ac.uk/chap-ci1

• Rectangles These functions are used mainly for debug purposes and take a colour as one of their parameters.

4.2.3

Track Functions

AI DLL’s can make calls to the executable to obtain information about tracks and terrain: • Height: a function returns the terrain height at a given position. • Track creation: the DLL can load tracks; this is useful when processing training sets; a training set can contain training data from more than one track. • Track unique identifiers: the DLL can check that a track has not changed since the time the training set was generated.

4.2.4

Other Functions

• Get Version: returns version of the executable. • Forward transform: a function returns a space centred at the origin of the motorbike; the Z axis points up and the Y axis follows the horizontal velocity direction. This space is more convenient than bike space to represent and transform world objects in relation to the bike.

4.3

Operation

The new AI system works as follows:

1. The executable loads all DLLs contained in a given directory; if a DLL implements all the functions described above, and the versions match, it is kept loaded, otherwise it is unloaded.

2. Each computer controlled bike loads its own AI data file; each AI data file is associated with an AI DLL. The association between data files and DLLs is made by matching the name contained in the data file header with the names given by the DLLs.

3. The AI is created using the matching AI DLL's CreateAI function, and all subsequent AI function calls go to the matching AI DLL's functions.

5.

ILLUSTRATIVE RESULTS

It was feared that this new AI system would run slower than the original AI system because of the high number of calls between the executable and the DLLs. It actually runs faster, despite these calls, because the AI code is now tidier and smaller than before. Two different kinds of AI have been implemented using the new architecture: feed-forward multi-layered perceptrons (MLP) and self-organising maps (SOM). The source code is inspired by [1]. Source code and a free version of the game will be made available at the following address: http://cis.paisley.ac.uk/chap-ci1

It is possible to easily compare the two kinds of AI in real time. The two kinds of ANNs used the same number of weights and were trained using the same training data. It appears that the MLP performs better but the SOM is less computationally intensive. The ANNs behave differently and make different decisions and mistakes. Increasing the number of weights for the SOM networks gives better performance, but the performance is always less than that of the MLP networks. The main problem with the SOMs is that their internal operation prevents them from differentiating between important and not so important inputs; the networks fail to make decisive decisions.

6.

CONCLUSION

This new architecture for the game allows us to easily develop and experiment with AI techniques. Source code and a free version of the game will be made available soon. We hope that many researchers will join us in developing new and innovative AI.

Future work may include the creation of a Motocross The Force Cup, in which many different AIs will compete at racing bikes in the game. Other work may involve the use of large-scale distributed processing to optimise existing AIs using genetic algorithms.

7.

REFERENCES

[1] M. Buckland. http://www.ai-junkie.com/. Technical report, 2005.

[2] B. Chaperot and C. Fyfe. Motocross and artificial neural networks. In Game Design And Technology Workshop 2005, 2005.

[3] B. Chaperot and C. Fyfe. Improving artificial intelligence in a motocross game. In CIG06, 2006.

[4] D. Michie, D. J. Spiegelhalter, and C. C. Taylor, editors. Machine Learning, Neural and Statistical Classification. Ellis Horwood, 1994.

A Novel 3D Game API for Symbian OS Smartphones

Fadi Chehimi, Paul Coulton, Reuben Edwards

Infolab21, Lancaster University, Lancaster LA1 4WA, UK
+44 (0)7731395946 / +44 (0)1524 510393 / +44 (0)1524 510392

[email protected], [email protected], [email protected]

ABSTRACT

Mobile phones are becoming one of the major personal entertainment devices amongst the general public. We are currently seeing a paradigm shift from traditional voice-centric applications to those that incorporate music, pictures, video, games, internet browsing, etc. Although mobile games represent a significant portion of the mobile entertainment market, they still have some way to go before they reach the revenues generated from downloads of ringtones, music, and wallpapers. This is due to a number of factors such as the wide-ranging demographic of mobile phone users and the restrictions of mobile platforms. Of these platform restrictions, one of the most problematic issues for many game developers is the lack of adequate support for 3D games development. To alleviate this situation and provide a unified framework, we introduce in this paper a novel 3D mobile games API that simplifies the development of 3D games on Symbian mobile phones and allows the production of more feature-rich titles.

Categories and Subject Descriptors

J. Computer Applications; J.7 Computers in Other Systems - Consumer.

General Terms

Design, Experimentation.

Keywords

Mobile 3D Games, Mobile Phones, Symbian OS, OpenGL ES, 3D Game Engines, 3D Game API, Game Design, Game Structure.

1. INTRODUCTION

The console and PC games industry is expected to reach a worldwide figure of $47 billion by 2010 [1] with an anticipated customer base of over 126 million gamers [2]. If this number is compared to the 3 billion mobile users expected in the same time frame [3], and the $11 billion to be generated from mobile games, an obvious disparity emerges. The proliferation of mobile games depends on a number of factors, including audience demographics and technical challenges. Whilst a number of challenges can be identified, in this paper we are concerned with the technical issues that affect 3D mobile games in particular, and in the following paragraphs we explore some that particularly affect games development.

One factor is the limited resources of mobile phones compared to those of PCs and game consoles. Amongst these restrictions the following are the most significant:

• Limited memory;

• Relatively slow processor speeds;

• Lack of dedicated graphics processors;

• Small screen sizes and resolutions;

• Limited user input/output interfaces;

• Short battery life.

These limitations are directly correlated with why many current mobile games are merely 2D ports of old arcade games. This has led to many traditional gamers being disappointed with the perceived poor quality, limited features, and limited 3D experience available in these games.

A second contributing factor affecting development has been the absence of dedicated 3D graphics APIs tailored to these limited devices. Some highly experienced developers and game houses have developed their own 3D APIs [4], but these have not been open or standardized for the general community of mobile game developers. This has recently and partially been resolved by the introduction of the following APIs [3]:

• OpenGL ES, for Symbian OS, Binary Runtime Environment for Wireless (BREW), Mobile Linux and Windows Mobile;

• Mobile 3D Graphics (M3G), for Java 2 Micro Edition (J2ME).

A final factor to consider is that having more than 20 different mobile platforms and operating systems makes it very difficult for game publishers to port their games and engines/APIs across all platforms, thus limiting the potential revenue for a game. In this paper we introduce a novel 3D gaming API as a solution to this portability problem, providing a common upper-layer interface for the Symbian platforms and OpenGL ES 1.0 (Embedded Systems). The Symbian OS has been chosen for this games API as it is the leading mobile platform in the market [5], with more than 70 million devices shipped by March 2006 [6]. OpenGL ES 1.0 is the graphics API selected as it is native to the operating system and supported by most new Symbian phones, whereas the later versions of this API, 1.1 and 2.0, are currently still limited to laboratory devices. Nokia has recently released its multimedia phone, the N93, which is the first to support the OpenGL ES 1.1 specification [7], and many forthcoming models will undoubtedly utilise this standard.

The API developed, which has been termed Syga-PI3D (Symbian Game API 3D), is still under development, and in this paper we present the first stage of a three-stage development cycle. In the next section we introduce Syga-PI3D and its features before section 3 discusses its design and external supporting tools. In section 4 we present the API's implementation, whilst sections 5 and 6 identify its features, indicating those that are completed and those which still have to be implemented. In section 7 we draw our overall conclusion and discuss the future evolution of Syga-PI3D.

2. SYGA-PI3D IN A NUTSHELL

Syga-PI3D is an application programming interface (API) that enables mobile game developers to develop 3D games for Symbian smartphones without the need to consider:

• the particular coding conventions of the operating system,

• the different user interface frameworks,

• and the syntax of OpenGL ES.

It acts as an upper-layer interface on top of Symbian OS and OpenGL ES, providing API facets and functions that help in creating feature-rich 3D games.

To aid developers in addressing phone limitation issues, several offline supporting tools have been implemented for Syga-PI3D to help in constructing 3D game environments. These tools are specifically aimed at minimising excessive processing on these power-hungry devices. The tools are responsible for tasks such as building optimized data files for game objects and characters, partitioning game world spaces, and enabling selective scene rendering.

Syga-PI3D implements many game-design requirements that would be expected for any 3D game. For instance, the API provides implementations for collision detection algorithms, texturing and texture control, space partitioning procedures, rendering management systems, various 3D effects, model loading and optimization, and networking facilities for multiplayer games. The programmer need only instantiate objects of these systems in his/her game and the API will perform their specified tasks with minimal input required from the programmer. For example, when a game environment is set up, a spaces-tree object of the game world, or level, is instantiated automatically. However, the object still needs to be constructed by the programmer as a lineartree, quadtree or octree and parameterized with its dimensions and representing data file(s).

3. THE DESIGN OF THE API

Syga-PI3D is still evolving and this paper presents the work completed for the first development stage. Two more stages will follow, which are highlighted in subsequent sections. This first stage includes the implementation of the core structure of the API with its main functional features alongside some external help tools. The other stages will include the artefacts and the performance optimisations in addition to an expansion of the building blocks. In this section we discuss the overall design of the API and explain how the external tools cooperate within its framework.

3.1 Design Overview

As mentioned previously, Syga-PI3D is designed to interface Symbian OS and OpenGL ES, hiding their implementation and providing an abstract layer for programmers that is independent from any UI framework, as shown in Figure 1. As we will expand upon in a later section, Syga-PI3D uses the Window Server (WSERV) of Symbian OS directly, without any specific UI dependencies, to build game environments (GameEnv). GameEnv sets the screen size, enables/disables full-screen drawing, selects the screen colour mode (maximum 64K colours including an alpha channel), sets game pop-up menu options, and handles user input and events. It also instantiates, under the hood, the OpenGL ES environment (OGLESEnv) and the game spaces-tree, associates manually added game objects and models, and manages scene rendering in collaboration between the programmer (manually) and GameEnv (automatically).

[Figure 1: layer diagram - the applications/games UI sits on top of Syga-PI3D, which sits on OpenGL ES and WSERV, which in turn sit on Symbian OS and the hardware.]

Figure 1. Integration of Syga-PI3D with Symbian OS Platform and OpenGL ES API

OGLESEnv initializes OpenGL states when the programmer calls GameEnv API functions to set up lighting properties, shading modes, material colours, the perspective frustum, hidden face removal, clipping, culling and other 3D effects. Default settings are used if no values are specified by the programmer. An alternative way to set up the states is by coding OpenGL commands in one of the pure virtual functions that need to be implemented in any Syga-PI3D game.

OGLESEnv always enables a camera, which has a default position at the origin (0,0,0) and an orientation towards the negative z-axis (0,0,-100). Note that a right-hand coordinate system is used, where the positive x axis is the thumb, the positive y axis the index finger and the positive z axis the middle finger pointing out of the page. The camera API has been added to Syga-PI3D since cameras are required features in any 3D game. It is implemented to facilitate different viewing options for various game genres: it enables a panning view in a single plane, the third-person view of Doom-like games, and a character-following view whereby the camera follows the main character in the game, as in driving games.
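As a rough illustration of the default camera described above, the following OpenGL ES 1.x fixed-function sketch (assuming the Common, floating-point profile) places the eye at the origin looking down the negative z-axis and shows one simple way a character-following view could be expressed; it is not the Syga-PI3D camera code.

// Sketch only: default camera and a simple character-following view using
// OpenGL ES 1.x fixed-function calls; not the Syga-PI3D implementation.
#include <GLES/gl.h>

void ApplyDefaultCamera()
{
    glMatrixMode(GL_MODELVIEW);
    glLoadIdentity();             // eye at (0,0,0), looking towards -z
}

void ApplyFollowCamera(float charX, float charY, float charZ)
{
    glMatrixMode(GL_MODELVIEW);
    glLoadIdentity();
    // Move the world so the followed character ends up a fixed distance in
    // front of the camera, which itself stays at the origin looking down -z.
    glTranslatef(-charX, -charY, -charZ - 10.0f);
}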

GameEnv instantiates a spaces-tree structure to manage polygon distribution in a game for use by the rendering and collision detection systems. The game space dimension is specified by the programmer and the partitioning process is performed by the API automatically. Three different types of space partitioning are supported:

• lineartree, where the space is partitioned into linearly linked spaces, as an array of spaces;

• quadtree, where the space is divided into four subspaces recursively until a specific depth is met;

• and octree, where the space is split in the same manner as for the quadtree, but into eight subspaces instead.

Only the lineartree and quadtree have been implemented in stage I. The remaining partitioning method will be made available in stage II. Note that programmers using linear partitioning have to construct each space node individually and add it to the linear tree; the API will only construct its root.
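As a minimal illustration of the quadtree case described above, the sketch below subdivides an axis-aligned region recursively to a fixed depth; the class and member names are invented for this example and do not correspond to CQuadSpacesTree's internals.

// Minimal illustration of recursive quadtree partitioning to a fixed depth.
#include <memory>
#include <array>

struct Box2D { float minX, minY, maxX, maxY; };

struct QuadNode {
    Box2D bounds;
    std::array<std::unique_ptr<QuadNode>, 4> children;  // empty at leaf depth
    // Polygons/objects intersecting this node would be stored here.
};

std::unique_ptr<QuadNode> BuildQuadTree(const Box2D& b, int depth)
{
    auto node = std::make_unique<QuadNode>();
    node->bounds = b;
    if (depth == 0) return node;                 // leaf: stop subdividing
    float midX = 0.5f * (b.minX + b.maxX);
    float midY = 0.5f * (b.minY + b.maxY);
    Box2D quads[4] = {
        { b.minX, b.minY, midX,   midY   },      // lower-left
        { midX,   b.minY, b.maxX, midY   },      // lower-right
        { b.minX, midY,   midX,   b.maxY },      // upper-left
        { midX,   midY,   b.maxX, b.maxY }       // upper-right
    };
    for (int i = 0; i < 4; ++i)
        node->children[i] = BuildQuadTree(quads[i], depth - 1);
    return node;
}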

The data required to partition a space and distribute its polygons, for collision detection and scene rendering, is obtained from the .sgd (Symbian Game Data) file generated by the Space Partitioning Collision Detection (SPCD) tool. This file is parsed in the construction phase of the spaces-tree. The file has to be passed to GameEnv in order to associate it with the global tree.

Another design feature implemented by Syga-PI3D, and managed by GameEnv, is the rendering management system. It includes occlusion and scene division and requires interaction with space partitioning. Feasible rendering is a crucial and intricate process that requires intensive research into possible solutions that are optimal for mobile phones. It has not been implemented in stage I and thorough research is being conducted to select an appropriate model.

Last but not least, any game must have characters and objects to create the premise for the game play. Syga-PI3D's design provides support for importing objects and models. "Objects" refer to static objects in a game, such as elements of a level like bridges, statues, weapons lying on the ground and power-ups scattered throughout the level, whilst "models" refer to animated game characters such as the main game characters, competitive characters for other players, or even non-player characters.

Similar to space partitioning, objects and models are represented with data files and textures generated by the external API tools. Objects use the .sod (Symbian Object Data) data file format and models use .smd (Symbian Model Data). The developer needs only to instantiate an object, or a model, pass its particular data file to its constructor, and then associate it to GameEnv. This allows the API to manage the object/model throughout the game. Loading starts behind the scenes without the programmer needing to initiate parsing and data manipulation. Note that all loading processes are performed asynchronously and protected against failure.

Figure 2 summarizes this general description of the Syga-PI3D API and identifies how its components integrate with each other in any Syga-PI3D-based 3D mobile game.

[Figure 2: block diagram showing a Symbian application's container class and virtual-function implementations at the application level; the Syga-PI3D API level with GameEnv, OGLESEnv, OpenGL states, camera, objects/models, space division, attributes, space data, model data, texture(s), texture loader, rendering, collision, partitioning and occlusion components, each marked as used either by the programmer or by the API; and Symbian OS, WSERV and OpenGL ES at the OS level.]

Figure 2. Syga-PI3D API Underlying Design.

3.2 External Tools

Syga-PI3D facilitates the management of game assets with the support of four offline help tools. The first tool, called Symbian Object Data (SOD), creates static objects' data files in the .sod format. This format is extracted from OBJ files, with optimized contents such as fitting an object's data arrays into byte or short types in order to minimize the memory required. It also adds header information, such as the sizes of these arrays, to eradicate the need to calculate them at runtime. Once created, a .sod file need only be associated with its appropriate object instance in the game, in that object's constructor.

The second tool, termed SPCD (Space Partitioning - Collision Detection), deals with a more critical game structure feature: space partitioning. It divides the game world into spaces in a tree structure with four nodes (quadtree) or eight nodes (octree) and stores the result in a .sgd file. Should this division be performed at runtime it would be computationally expensive, requiring many levels of recursion, which would put a heavy burden on mobile phones' dynamic memory and processor cycles. Thus, off-loading this requirement from the device is good practice, since it lessens the power drain on the battery.

Each space node in the tree contains the objects, or models, that belong to it and a set of polygons of the world design that intersect with it. Adding polygons is performed by the SPCD, which utilises the game world model data file(s) generated by SOD. The SPCD then extracts the world's polygons and spreads them into the tree spaces they optimally belong to. Adding objects and models, however, is done in their construction phase.

The GameEnv instance in the game is passed as a parameter to the constructor of the object or model, which then traverses its spaces-tree to find where it should add itself. This division of the game world into spaces, and the distribution of polygons and objects, helps with collision detection: the player, or the main character in the game, only needs to test collision against polygons and objects/models that exist in its own game subspace. The third and fourth tools have yet to be implemented, but their functionalities have been identified. The Symbian Model Data (SMD) tool will generate the .smd data files for in-game animated models. The particular format has not been finalized, but it will most likely be an optimized version of the Quake III MD3 format. The last tool, OCC, will be used to manage occlusion, occluding objects, scene rendering and scene texturing in games. Both tools will be introduced in stage II of Syga-PI3D.

The external tools are all console-based at the moment and will be given graphical user interfaces in stage II.

4. SYGA-PI3D IMPLEMENTATION

Syga-PI3D has been developed with object orientation in mind, and in order to write efficient and reliable games on the Symbian platform using Syga-PI3D, or any other framework, certain operating-system-dependent considerations have to be retained. In this section we cover the implementation structure of Syga-PI3D and introduce critical Symbian-dependent issues and how these are addressed in the API.

4.1 Implementation Structure

To utilise all the features mentioned in Section 3.1, programmers have to instantiate objects of each feature's API class. Figure 3 shows the structure and relations of all the API classes that have been implemented in stage I. The figure identifies the classes in generic terms without reference to their data types; however, each class, except TVertex3D, has floating-point and fixed-point implementations.

[Figure 3: UML class diagram relating CApplicationContainer, the MS3DGame interface, CGameEnv, CPeriodic, COGLESEnv (and OpenGL ES), CParser, CLinearSpacesTree, CQuadSpacesTree, CBaseSpaces3D, CSpace3D, CLeafSpace, CObject3D, CPolysList, CPolygon3D, TVertex3D and TVector3D.]

Figure 3. UML Model of Syga-PI3D Classes

The game entry point is the CApplicationContainer class - for a particular application called "Application", hence the name CApplicationContainer - where CGameEnv should be instantiated by the programmer. With the aid of functions in CGameEnv the programmer can set the game timer, represented by CPeriodic, which is a timer API of Symbian OS; set OpenGL states via COGLESEnv, which communicates with the OpenGL ES API at the operating system level (Figure 1); and select which tree structure to use, either CQuadSpacesTree or CLinearSpacesTree. The appropriate game space data file created by SPCD is used here and CParser generates its data structure. Note that if CLinearSpacesTree is chosen, the programmer has to instantiate as many CSpace3D objects as are required in the game.

At this point the game world is ready for action, but the particular game objects and models have to be loaded to initiate this action. Each space has a reference to all objects/models that belong to it in a linked list structure. The list is maintained by the spaces base class CBaseSpaces3D, which also includes other space-specific information such as dimension and position, each represented with TVector3D objects. (The maths in TVector3D has been optimized to meet the mobile platform requirements.) From this list, objects in any space can efficiently track which other objects to test collision against. Loading models is not supported in this stage but should behave similarly to loading static objects when addressed in stage II.

In CApplicationContainer, the programmer has to instantiate a CObject3D for each object in the game. The .sod data files are used here by CParser to construct the objects. It has to be noted that objects must adhere to certain complexity limitations to retain the performance of games on mobile phones. The maximum number of faces allowed for any object in a game is 2000, but 200 is recommended to maximise the speed and memory benefits. Due to the small physical screen sizes of mobile phones, this level of detail should be sufficient without sacrificing any visual quality.

A distinctive feature of the CObject3D API is the ability of its instances to detect collisions automatically within games. There are collision detection routines in this class, managed by Syga-PI3D, that prevent objects from colliding with each other in their space, or characters from walking through walls (polygons) in that space. Each space - linear, quad or oct - has a linked list structure (CPolysList) holding all the polygons (CPolygon3D) that belong to it. A CPolygon3D object is composed of three vertices of TVertex3D type. The reason for having a separate class for vertices, rather than using TVector3D, relates to the size of a TVector3D object: TVector3D contains three float or fixed (32-bit: s15.16) members, giving a size of 12 bytes per vertex. This size is not convenient for low-memory mobile phones since it results in large geometry data arrays. Therefore, TVertex3D was introduced with short members, giving a size of 6 bytes per vertex and halving the total size. Note that short will always suffice, as SOD generates files with arrays of bytes or short integers.

As shown in Figure 3, MS3DGame is an interface that must be inherited by any Syga-PI3D game's container class. All event handling, game initialization, and the game loop are manipulated via pure virtual functions of this interface and have to be implemented. The GameInit() function initialises OpenGL states with default values; however, the programmer can amend these values via the helper functions provided by CGameEnv or by hard-coding OpenGL ES commands in GameInit(). The latter method is a flexible design feature of Syga-PI3D which enables experienced developers to set states that are not addressed by the API, and which opens the door for future compatibility with new versions of OpenGL ES.

GameLoop() is another pure virtual function, which controls the execution of game components and its CPeriodic timer. Here the programmer sets the number of frames per second sought in the game and the API synchronizes its operations to meet that frame rate. If the target cannot be met, the default of 25 frames/sec is used.

The last virtual function has not been developed as yet but must be implemented in games for future use. GameEventHandler() handles user events and sends them to the API for processing. As mentioned earlier, this functionality is already implemented elsewhere at the operating system level and the programmer must use that instead for now. Abstracting and directing this function to Syga-PI3D has still to be resolved, and it will require implementing a customised and optimized windowing system that is expected to improve game performance.
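To give a feel for how these pieces fit together, the outline below sketches a hypothetical container class; the class and function names (MS3DGame, CGameEnv, COGLESEnv, CQuadSpacesTree, CObject3D, GameInit, GameLoop, GameEventHandler, CApplicationContainer) come from this paper, but the signatures, factory and helper methods, and file names are assumptions made purely for illustration and are not the API's actual declarations.

// Sketch of a minimal Syga-PI3D container class; signatures, two-phase
// construction details and file names are assumptions, not the real API.
class CMyGameContainer : public CApplicationContainer, public MS3DGame
{
public:
    void ConstructL()
    {
        iGameEnv = CGameEnv::NewL(*this);                        // hypothetical factory
        iGameEnv->SetScreenSizeL(240, 320);                      // hypothetical helper
        iGameEnv->UseQuadSpacesTreeL(_L("\\data\\level1.sgd"));  // SPCD output
        iCrate = CObject3D::NewL(_L("\\data\\crate.sod"), *iGameEnv); // SOD output
    }

    // Pure virtual functions of MS3DGame (names from the paper).
    void GameInit()
    {
        // Default OpenGL ES states are applied by the API; extra states could
        // be set here either through CGameEnv helpers or raw GL calls.
    }

    void GameLoop()
    {
        // Per-frame logic; the API drives this from its CPeriodic timer and
        // tries to honour the requested frame rate (default 25 frames/sec).
    }

    void GameEventHandler()
    {
        // Reserved for stage II; user events are currently handled through
        // the standard Symbian windowing framework instead.
    }

private:
    CGameEnv*  iGameEnv;
    CObject3D* iCrate;
};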

4.2 Symbian-Dependent Considerations

Just like any modern multitasking operating system, Symbian OS's architecture uses many advanced, but classical, constructs including pre-emptive multitasking threads, processes, asynchronous services, and internal servers for serializing access to shared resources [8]. However, Symbian OS has some particular features that have to be considered in order to write effective and robust mobile applications. In this section we highlight some of the major issues and explain how Syga-PI3D makes use of them.

4.2.1 Error Handling

At the time the Symbian OS was developed, conventional exception handling was not part of the C++ standard. When it was introduced, it was found to add substantial overhead to the size of compiled code and to runtime RAM. Thus, Symbian implemented a different handling approach, represented by traps and leaves [8]. This error handling mechanism has been considered in the design of Syga-PI3D and implemented in all its classes. This means that programmers can safely use the API functions without concern about code crashing. However, if programmer-defined, non-API functions cause an error - or "leave" in Symbian terms - programmers will have to handle it themselves.

4.2.2 Memory Management

Memory leakage is a serious problem on mobile phones with their limited memory, as it is not possible to simply reboot the phone to mop up any memory leaks [8]. The Symbian OS is specifically designed to manage its memory resources if a failure or leave occurs [8]. This is done by implementing a Clean-up Stack, which keeps a reference to heap memory in use in order to delete it in the case of a leave and prevent it from being orphaned. As with error handling, memory management is systematically implemented in Syga-PI3D to provide an appropriate and secure foundation for building high-quality mobile games.

4.2.3 Event Handling and Windowing System

One of the most critical learning curves in developing applications for mobile phones is handling a large number of events effectively and accurately. The presence of the radio, shared resources, asynchronous operations, operating system management and third-party applications all create a rich pool of events to handle. This is not limited to Symbian OS; it is a common denominator for all mobile platforms.

In Symbian OS, event handling is part of the windowing system, which Syga-PI3D uses natively. This means the operating system already implements functions to handle user events. Such functions have not been abstracted by our API, meaning that programmers have to use them and know where to place event-handling code. This abstraction will appear in stage II.

4.2.4 Synchronous vs. Asynchronous Operations

Symbian OS allows the usual context switching between threads as well as its distinctive cooperative multitasking technology: Active Objects. Explaining how Active Objects work in detail is beyond the scope of this paper but, in a nutshell, Active Objects provide multi-threading-like behaviour within a single thread of a process instead of across many. Active Objects share the processor time slice assigned to their thread, or process, by cooperating between themselves in performing a task or set of tasks. They cooperate asynchronously in an event-driven manner, where the operating system searches through the outstanding Active Objects until it finds a completed one and runs its handler function, while the others keep running in the background. This reduces the overhead incurred by thread context switching on the kernel scheduler, the memory management unit (MMU) and the hardware caches, overhead which arises because the state of the running thread and the memory in use at the time of pre-emption has to be saved in order to resume execution from that point later. Last but not least, the context switching mechanism drains a lot of power, which should be avoided in battery-powered devices like mobile phones [8].

Active Objects are used in all loading and parsing functions in the API. For instance, the LoadOBJ() function in the CObject3Df class is implemented using asynchronous Active Objects, so the game does not block while waiting for object data to be loaded from .sod data files.
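The outline below shows the general shape a Symbian Active Object for such an asynchronous load could take; the class name and members are invented, the actual file-server request is left as a comment, and this is not the API's LoadOBJ() implementation.

// Illustrative outline of an Active Object for asynchronous loading; the
// class name and members are invented for this sketch.
#include <e32base.h>

class CSodLoader : public CActive
{
public:
    CSodLoader() : CActive(EPriorityStandard)
    {
        CActiveScheduler::Add(this);     // register with the active scheduler
    }

    void StartLoad()
    {
        // Issue the asynchronous read of the next chunk of the .sod file here,
        // passing iStatus as the TRequestStatus, then flag the request:
        SetActive();
    }

private:
    void RunL()
    {
        // Called by the active scheduler when the read completes: parse the
        // chunk, then issue the next read (StartLoad()) until the file ends.
    }

    void DoCancel()
    {
        // Cancel the outstanding file-server request here.
    }
};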

5. FEATURES IMPLEMENTED

As stressed throughout the paper, Syga-PI3D is in its first stage of development. The features of the API are spread across its three phases, allowing each stage's components to be tested and bullet-proofed individually in order to build a solid and robust game API. The features that have been completed in this stage are as follows:

I. Space partitioning, where a game's world is subdivided into smaller spaces recursively to enable more advanced features.

II. Collision detection, which is empowered by space partitioning.

III. Environment set-up, including Symbian-related and OpenGL ES properties.

IV. API core implementation, with integration between it and Symbian OS.

V. Basic 3D effects, like shading, lighting, skybox, camera, transparency, blending, fog, etc.

VI. External support tools, which offload memory and processing overhead from mobile phones.

VII. Materials and texture mapping, limited to 2D textures in JPEG format only. More formats may be supported in later stages.

VIII. Loading objects, with the OBJ format used at the current time.

IX. Integrated abstract interfaces, allowing easier use of Symbian's and OpenGL ES's functionality.

6. FUTURE FEATURES

The next two development phases of Syga-PI3D will focus on the visual artifacts in a game and on performance optimizations of the API. Stage II will implement the following eye-candy effects:

I. Mip-map texturing, but without the ability to control textures' level of detail (LOD), as this is excluded from the specification of OpenGL ES 1.0 [9].

II. Animated model loading, which adds realism and interactivity to games.

III. Lens-flare, blur, and reflection effects; the last two will use stencil buffers.

IV. Rendering APIs for scene management, including occlusion systems, occluder detection, height mapping and scene partitioning.

V. SMD and OCC tools.

VI. Octree structure, the last space partitioning criterion.

VII. GUIs for the external tools.

VIII. 2D and text display, for sprites and game counters.

IX. Audio playback.

X. Networking facilities for multiplayer games, including APIs for Bluetooth, General Packet Radio Service (GPRS), and probably the Session Initiation Protocol (SIP) when it becomes available on Symbian OS phones.

Stage III will concentrate on performance issues, testing and optimizations. At this point the following are to be covered in this stage:

I. Testing APIs, which will be added to help programmers debug their games and monitor the used/abused resources. These APIs will have on-phone and off-phone testing modules to enable testing of games' real-time performance and of asset production/integration.

II. Math optimizations, by implementing look-up tables for the sin, cos and tan operations used in games (see the sketch after this list). Their software-only implementations by the operating system are slow, and look-up tables may enhance performance when used instead. Also, a fixed-point square root operation will be implemented to speed up calculations.

III. LOD for models and objects; this is a very crucial enhancement since it will reduce the number of polygons to render in each frame, resulting in less display flickering.

IV. A customized windowing system; rather than using the robust Symbian WSERV framework, which contains plenty of useful and effective operations that are not needed in Syga-PI3D, we will implement a customized version supporting only the operations used.

V. Texture caching, to limit the load of data transfer through the limited bus bandwidth.
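As an illustration of the look-up-table idea in item II, the sketch below builds a sine table in the s15.16 fixed-point format mentioned earlier for TVector3D; the table size, scaling and angle convention are choices made for this example, not the planned Syga-PI3D implementation.

// Illustrative look-up-table sine in s15.16 fixed point.
#include <cmath>
#include <cstdint>

static const int  kTableSize = 1024;          // entries covering a full circle
static int32_t    gSinTable[kTableSize];      // values in s15.16 fixed point

void InitSinTable()
{
    for (int i = 0; i < kTableSize; ++i) {
        double angle = (2.0 * 3.14159265358979 * i) / kTableSize;
        gSinTable[i] = static_cast<int32_t>(std::sin(angle) * 65536.0);
    }
}

// angle is given in table units (0..kTableSize-1 covers 0..2*pi); returns s15.16.
int32_t FixedSin(int angle)
{
    return gSinTable[angle & (kTableSize - 1)];   // wrap around (power of two)
}

int32_t FixedCos(int angle)
{
    return FixedSin(angle + kTableSize / 4);      // cos(x) = sin(x + pi/2)
}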

"Particle systems", which provide effects for fire, rain and water, are vital graphics systems for any modern 3D game. Nevertheless, they have been excluded from the current design of Syga-PI3D because of their relative limitations and complexity. Supporting such features on mobile phones would drain battery, memory and processing resources quickly, since they would be handled purely in software. Some graphics processors, like the new PowerVR MBX from Imagination Technologies, provide hardware support for these effects [10], but they are not yet widely available on commercial phones. An implementation of a particle system will appear in a future version of Syga-PI3D once it proves feasible and optimal for mobile phones.

7. CONCLUSION

Revenues in the mobile sector are very high in mature markets like Japan and South Korea, where dedicated platforms and sophisticated solutions are widely available and where both mobile service aggregators and mobile manufacturers work together to create the right environment and tools for mobile entertainment services to take off. We have studied the lessons learned from these leading markets and have produced Syga-PI3D as a suitable solution for developing feature-rich mobile 3D games in the European market. It will enable games to be developed more quickly and efficiently, and will facilitate embedding commercial messages and content in games for marketing purposes [11], thus opening new doors not only for game publishers but also for innovative marketers wishing to use a most innovative medium: mobile phones.

8. REFERENCES

[1] Hatfield D., "Game Industry Set to Explode", IGN News, June 21, 2006, http://uk.pc.ign.com/articles/713/713784p1.html, accessed September 2, 2006.

[2] Wegert T., "Gaming 101", ClickZ News, September 22, 2005, http://www.clickz.com/showPage.html?page=3550216, accessed September 1, 2006.

[3] Chehimi F., Coulton P., and Edwards R., Advances in 3D graphics for mobile phones, Proceedings of the 2nd IEEE International Conference on Information & Communication Technologies: from Theory to Application, Damascus, Syria, April 2006.

[4] Chehimi F., Coulton P., and Edwards R., Evolution of 3-D games on mobile phones, Proceedings of the IEEE Fourth International Conference on Mobile Business, Sydney, Australia, 11-13 July 2005.

[5] Evers J., "Symbian Threats Multiply", ZDNet UK, January 23, 2006, http://news.zdnet.co.uk/internet/security/0,39020375,39248514,00.htm, accessed September 3, 2006.

[6] Symbian Ltd, http://www.symbian.com, accessed August 20, 2006.

[7] Nokia, http://www.nokia.com, accessed August 23, 2006.

[8] Stichbury J., Symbian OS Explained, John Wiley & Sons Ltd, West Sussex, 2005, pp. XII, 13, 29, 111-126.

[9] Astle D. and Durnil D., OpenGL ES Game Development, Thomson Course Technology, Boston MA, 2004, pp. 45-49.

[10] PowerVR MBX Demos, Imagination Technologies, http://www.imgtec.com, accessed September 1, 2006.

[11] Chehimi F., Coulton P., and Edwards R., Mobile advertising: practices, technologies and future potential, Proceedings of the IEEE Fifth International Conference on Mobile Business, Copenhagen, Denmark, June 2006.

On Demand TV With Consoles – Prototyping For The PS2

John Sherlock
School of Computing and Mathematical Sciences, Liverpool John Moores University, Liverpool
[email protected]

ABSTRACT

This paper describes the issues and areas of interest encountered during the research and development process of creating a software application for the PS2 to allow the handling and integration of On-Demand TV within consoles. To convey how the development and research were undertaken in order to produce this initial prototype of console-based On-Demand TV, the following subject areas are described within this paper.

The paper describes the development issues around creating an application capable of handling the media and media streaming required using a variety of codecs; the PS2 Linux and PS2 platforms, with regard to the issues and advantages of using them for the initial prototyping of this new medium; the use of a multicast network for stream testing instead of a standard unicast or broadcast network topology; and the use of the open source VLC media player for the production of the player, with regard to the issues experienced and the development process needed to eventually build a prototype within the PS2 Linux environment capable of media streaming and standalone playback.

To conclude, we describe the PS2 infrastructure and explain how this has affected the development of the prototype on PS2 Linux and will affect the production of a standalone PS2 version, with regard to the limited memory resources available.

General Terms

Documentation, Design, Performance, Reliability, Experimentation, Theory.

Keywords

Console Development, Media Streaming, Multicast, PS2 Linux, PS2, VLC, Buffering.

1. INTRODUCTION

The underlying idea of this project was to decide whether consoles as a platform could be used and developed into mediums for On-Demand TV, with the creation of a prototype if possible. This paper details the research process and the areas of development and research undertaken in order to ascertain that the development of consoles as On-Demand TV mediums is possible, and then the creation of a prototype to prove that consoles can be used as mediums for On-Demand TV and, specifically within this research, that the PS2 Linux and PS2 can become such mediums.

This paper touches upon the issues regarding the application in development, the On-Demand TV platform, the media player being developed for the console, the networking, and the VLC application being used as the underlying development tool for the media player. Furthermore, it details the issues regarding the PS2 infrastructure and how this affected the development; the issues regarding the buffering of the media stream being played, with respect to the limited memory available, and the solutions posed to solve this problem; and the issue of producing a standalone PS2 version of the source code from the version currently running within PS2 Linux.

2. The Objectives of Development Regarding On-Demand TV & Consoles

The objectives of the project are designed to investigate the technologies and the development of the infrastructure for On-Demand TV services within a selection of console platforms. Currently within the industry there are several broadcasters worldwide developing mediums and services to begin offering On-Demand TV to their consumers via existing or newly developed hardware, in order to offer a selection of replays of programmes or broadcasts on a multitude of devices, from palms, tablets and many mobile devices to the currently more functional medium of set-top boxes.

This new age of TV viewing mediums comes from the current age in which consumers are demanding more entertainment access during the parts of their day when they are free, instead of just during the evening. PCs and TVs are still being used, as indicated by Amazon's release of their PC-based service and Apple's release of their TV-based service that links to their PC.

The emphasis within this project is on the areas surrounding On-Demand TV, with concentration on the PS2 as the medium for the prototype and research, and with the possibility of evolving the development onto other consoles, such as the PSP, PS3, Xbox and Xbox 360, in the future.

The project has researched the areas of On-Demand TV and media streaming with reference to protocols, standards and codec implementation, so as to support most video formats within the prototype player. The player is currently still within the development stage, but incorporates the ability to host a connection to the internet via a multicast network to support media streaming and the complexities of buffering, along with the ability to play back source files either from a stream source or a standalone source using the appropriate codec, with the playback functionality of pause, record and rewind.

The main issues within the project currently concern the development of a solution to the main problem facing the production of the prototype, how this problem has arisen, and how it can be solved by gaining an understanding of the subject area with regard to media streaming, and then concentrating on the issue regarding the buffering of larger file streams via the FFmpeg codec.

2.1 Prototype Development

The project set out to understand whether a console platform could be used as a development medium for On-Demand TV, with the final goal of creating a medium capable of being used within the consoles already available to the general public within their homes, hopefully using devices they currently have, with little or no additional hardware required. The final aim of the prototype development is a version capable of being played and accessed using a standalone console without any development tools. The project began by producing a prototype using Linux in order to ascertain the approach and procedure necessary for building the source code of the chosen player, VideoLAN Client (VLC). Using Linux first allowed me to ascertain which files were necessary to build the player with the codecs needed for effective playback of the file formats required for this project, and the procedures for installing those codecs. This saved time when developing on PS2 Linux, allowing concentration on the issues and complications the PS2 was causing in the development, knowing what was necessary for building the player and codecs if they were to install properly. This development was completed within a couple of weeks, after different development techniques and other media players had been tested in order to ascertain how effective VLC was at performing the necessary functionality of a media player. I can confirm that VLC is a highly effective and functional player, capable of handling the tasks required of the console development, with the added benefit of being able to add and remove different functional properties in order to produce a player with the operational functions needed to meet the requirements of the prototype development.

The current stage of the development has the player working on the PS2 Linux platform, capable of playing, with playback support, streamed or standalone media files in a number of formats from a selection of sources, up to a capacity of 8 MB for buffering, which is the issue described further within this paper; this is expected to be resolved using a revised version of the ffmpeg codec with changes made only to the memory capacity of the buffer, in respect of the PS2 architecture. The player currently has the capability of playing a selection of file formats, as described within section 2.3, and of streaming video with synchronised audio, using an X11 window for the graphical display of the movie clip being played when required, and the interface information bar if live radio is being played. The graphical user interface of the player is currently working on two of the three levels available within the player.

To begin with, the player can be accessed and controlled using the remote control interface, using a console or terminal window to control the actions and events within the player. The graphical development then included the use of the Ncurses interface, which uses a textual interface and keyboard command controls for the events within the player. This interface gives users a more directly controlled interface without the need for typed commands, simply using keyboard letters for loading playlists and for playback control of the playing media.

The next stage of interface development is the fully graphical interface, which gives the player its true form and user control, using both keyboard and mouse for playback support and for the selection of files to be loaded into the playlist or played. This interface uses the wxWidgets GUI code behind the VLC player to control the loading and the modules necessary for loading the graphics templates, from the images stored within the player's resources, to the screen. wxWidgets is a cross-platform tool for the development of interfaces and applications with a native look and feel, as the development libraries within the toolkit use their own controls, written within the code, instead of emulation or a third party, making it a self-sustaining toolkit for development without any dependencies.

The next stage of the development is to produce a standalone version of the prototype, working solely on a normal PS2, with the possibility of connecting one of several storage devices to enable the buffering and storage of recorded data. This development stage will be the final stage, proving fully that consoles can be used as a medium for On-Demand TV within the next generation of television viewing hardware. Currently the thinking and research from development forums regarding PS2 Linux all say that the exportation of developed software to the PS2 is impossible, but as described further on within this paper, I believe there is hope in several different approaches to solving this problem.

2.2 The PS2 Linux As A Developing Platform

As a development platform for the initial prototyping of the player within a console, the PS2 is more than capable of handling the requirements of media streaming. Even with the constraints of the Linux OS within PS2 Linux, the PS2 can handle the demands placed upon a system by the creation and maintenance of a media player designed to support the streaming of live, real-time, high-quality video streams with bitrates of 350 Kbps.

The advantage of the Linux environment on the PS2 is that, even with the constraints of console programming regarding the different architecture and memory management, this development kit allows development on PS2 consoles in a more usable manner than working directly on a PS2 Dev Kit with its connection to a PC, which would be a nice but costly development approach for this research and prototyping. The Linux kit therefore gives a very usable platform for development; even with the problems of speed and limited resources, once understood it becomes a powerful tool capable of producing some clever, high-specification work using a simple language architecture with a PS2 twist on the code when setting the parameters. Within this development no actual PS2 Linux-specific code has been used or required, as the development has been written directly in C, with the hope that, when attempting to export code from the Linux kit to a normal PS2, this will help in the development of a standalone version, as no dependencies are directly linked to the PS2 Linux Dev Kit, leaving just the dependencies of the OS architecture for accessing the VLC code and executable.

The limitations of PS2 Linux come from several sources. The OS within PS2 Linux is over four years old, and consequently all of the development libraries are four years old. This causes problems when installing and building any code that is not reliant upon SPS2 or PS2GL, the languages supported by PS2 Linux for actual PS2 development. It initially caused problems in the early days of developing the player, as new compilers such as gcc needed upgrading, and tools such as automake and autoconf, used for the configuration of the libraries within the player's makeup, needed installing. Further issues came from the lack of some basic development libraries for C programming, which were present but wrapped up within the PS2 Linux dedicated development; therefore libraries not directly related to the development of the prototype had to be installed. This was the cause of errors arising within the development of the VLC player source code, which would be tracked down to missing headers or to the inability to read or ascertain the effect of code, especially the referencing of external files for status or data when building the code that would initiate the streams.

In addition to the above, further limitations arise when considering the actual resources available to PS2 Linux within the PS2. Reports within the official Linux community, which from my experience I believe to be substantiated, state that the Linux OS only has about a third of the PS2 hardware resources available to it, making the development process and the execution of code slow. The build of the source code for the VLC media player, once configured, takes about four hours to complete before installation.

When compiling code within the Linux OS, the compilation process is slightly held back by the limitations of the OS, solely down to the age of the OS development libraries, which have caused the need to upgrade and re-install key compiling applications in order to continue the development of the player to the stage of today's current version. The file structure of all files downloaded as libraries or codec sources, especially when architecture-dependent, has been controlled by the PS2 Linux OS, which is Red Hat version 6 with i386 architecture.

The main constraints on the development process to date have been the PS2 Linux environment itself and the time taken to build and compile code, as well as to access and load any files above the range of 5-10 MB, which take longer and longer to load. Over the course of this development this has caused many days where the PS2 was solely building one file from morning to evening before any testing and investigation into the file and development could be undertaken. This is the one particular aspect of PS2 Linux which I loathe, but otherwise over the last several months I have enjoyed the development process of this prototype more and more, always being tested and challenged along the way by the PS2 and its architecture when installing or compiling the necessary code to resolve the issues being experienced - especially the personal satisfaction when issues are resolved, and the experience and greater understanding of the PS2 gained during every encounter with issues spanning from libraries to codec irregularities, or the initial missing of necessary headers, to coding languages.

2.3 Components Of VLC Media-Player & Support

The media player created for the prototyping of On-Demand TV with consoles has been created with the following media codecs and streaming codecs incorporated, to create a player capable of supporting as many file formats as possible within the confines of the original open source code from VLC (www.videolan.org). The codecs incorporated into the player are mentioned below, with regard to the libraries incorporated into the build of the player and the player's functionality.

When starting the prototyping of the player, the PS2 Linux architecture and compiling tools needed upgrading or installing, such as new gcc compiling tools along with autoconf and automake, in order to begin the process of building the source code necessary to produce a prototype media player. Further changes along the way included the necessity to install a selection of libraries and codecs, which in some cases required alterations or recoding, or on occasion worked without any need for attention. The libraries installed on PS2 Linux for the development of the player include the following: Gettext, LibMad, LibMP3Lame, LibMpeg2, FFmpeg, LibA52, LiveMedia, XVideo, wxWidgets, Glib 2.12.1, Pango, Atk, Gtk+ 2.8.2, X264 & H.264, MP3Lame, Faad, Flac, Vorbis, Ogg, Theora, Cairo, Gcc, Automake, Autoconf, LibXml, Fontconfig, Pkg-Config, Faac and Freetype. The general method of installation, leaving aside particular files which require the architecture of the target machine, or the specific files and features of the library wanted, to be named in the initial configuration, is as for VLC itself, whose configuration command is:

./configure --enable-x11 --disable-gtk --enable-sdl --enable-ffmpeg --with-ffmpeg-mp3lame --enable-mad --enable-a52 --enable-dts --enable-libmpeg2 --enable-faad --enable-vorbis --enable-ogg --enable-theora --enable-faac --enable-mkv --enable-freetype --enable-flac --enable-livedotcom --with-livedotcom-tree=/usr/lib/live --enable-caca --enable-skins --disable-kde --disable-qt --enable-wxwindows --enable-ncurses --enable-release --enable-x264 --enable-mpeg2dec --with-ffmpeg-tree=/root/vcldeps/deps/F/ffmpeg-20051126/ --disable-wxwidgets --disable-skins2

Following on from the configuration, the standard make, make install and ldconfig (to update the system's library cache) are required.

Even though the player is open source and therefore pre-coded and tested for you, PS2 Linux shows that nothing can be taken for granted in software development, as nothing can be assumed to just work or to be easy to develop. There are many small problems related to the location of sources or libraries, or to missing libraries which are standard on a normal Linux or Windows platform but simply are not present on this PS2 Linux platform. The problems and issues experienced during the development of this prototype, especially regarding the development of a GUI, have been documented within the prototype build guide, giving a full reference to the problems and solutions faced during the build, ready to allow further development or understanding when this development process ceases in December. Generally, problems have arisen from the compilation limitations of the PS2 Linux environment, especially with modern codecs. Luckily, to date these problems have all been overcome and the development of the player has been able to continue to its current stage.

2.4 The Multicast Network & Media Streaming

Multicast networking is the next generation of packet transmission topologies for the internet, for either streaming or the sending of data. Multicasting will allow for the necessary improvement in bandwidth usage in order to enable smoother, TV-quality streaming of video via the internet to people's multiple devices capable of receiving the signal. In terms of hardware, multicasting requires no further technology than current networks and broadband connections; the only differences between unicast, broadcast and multicast relate to the transfer protocols used for sending and receiving packets, along with how the packets are sent. In the older system, as used within today's internet servers, when you search for or request anything from the internet your request is broadcast across the entire internet, and the response is broadcast across the entire internet in order for you to get a response to your request. Even with encryption, this service has been found and proven several times to have security flaws, with this information being intercepted and used for purposes which you did not intend; therefore a new casting medium was required which was naturally more secure and offered more than just encryption to users, whilst at the same time beginning to resolve the problem of wasted bandwidth within the internet, in order to improve the speed and traffic flow within the network.

Multicasting works by using a casting medium which transfers packets across a network, using the server as an entity host and allowing data to be sent to multiple users using the same packet: instead of multiple packets being sent via the server, the one packet created is cloned to produce the necessary number of packets required to send data to the user clients requesting the packet. Using this methodology of packet creation, multicast has the ability to transfer packets to clients using two different transfer modes: firstly point-to-multipoint, and then multipoint-to-multipoint, which simply sends packets from a collaboration of connected servers or source destinations to the necessary final destination nodes, using the same cloning technique for multipoint transmission. This transfer protocol is technically referred to as Asynchronous Transfer Mode. Generally, when referring to multicasting we refer to IP multicasting, which uses clients' IP addresses for sending data packets directly to them instead of broadcasting across the entire network. The disadvantage of this model of multicasting is the necessity to maintain a larger amount of state and to analyse the paths available within the network in order to track and maintain connections between servers and clients, compared with the current IP unicast model of best effort, which works by attempting to transport packets according to its best assumption of travel and packet sizing. Due to this issue regarding the state required of the network, multicasting is currently unable to handle the traffic and demand being placed upon today's system. Even though this is holding back a commercially public, wide launch of a multicast network for broadband users, multicast is being developed within the realms of private and virtual networks, such as intranets within academic institutions and industrial networks, which are smaller and disassociated from the internet directly. Examples of such multicast-style services are Internet Relay Chat (IRC), which is more pragmatic and scalable for a larger capacity of smaller groups of connected users - such as MSN, Yahoo Messenger and conference-calling systems, for either 'many-to-many' communications or more private 'one-to-one' communications - and the Protocol for Synchronous Conferencing (PSYC), an accommodating text-based protocol for delivery of data to a compromising user base, with the application base of chat conferencing, multicast presence, friend casting, news casting and plain instant messaging.
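For illustration, the sketch below shows how a Linux client could join an IP multicast group and receive a UDP stream using standard POSIX sockets; it is not taken from the prototype, and the group address and port are arbitrary example values.

// Minimal sketch of joining an IP multicast group and receiving UDP datagrams
// on Linux; not part of the prototype, and the address/port are examples only.
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>
#include <cstring>
#include <cstdio>

int main()
{
    int sock = socket(AF_INET, SOCK_DGRAM, 0);

    sockaddr_in addr;
    std::memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);
    addr.sin_port = htons(1234);                       // example stream port
    bind(sock, (sockaddr*)&addr, sizeof(addr));

    ip_mreq mreq;                                      // join the multicast group
    mreq.imr_multiaddr.s_addr = inet_addr("239.255.0.1");
    mreq.imr_interface.s_addr = htonl(INADDR_ANY);
    setsockopt(sock, IPPROTO_IP, IP_ADD_MEMBERSHIP, &mreq, sizeof(mreq));

    char packet[1500];                                 // one Ethernet-sized datagram
    for (;;) {
        ssize_t n = recv(sock, packet, sizeof(packet), 0);
        if (n <= 0) break;
        std::printf("received %zd bytes of stream data\n", n);
        // ... hand the bytes to the demuxer/buffer of the player ...
    }
    close(sock);
    return 0;
}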

The situation of buffering concentrates on the clients machine, and incorporates the event of storing a downloaded portion of the clip being streamed until a set percentage of the clip has been downloaded or the buffer is full before playback begins and the buffer begins to release the store data to the screen whilst continuing to capture the latest streams arriving before they are played to the clients screen. If working correctly, the user shouldn’t experience any delays in viewing and should have a smooth viewing. Buffering also helps to overcome problems if streams are delayed, but instead arrive before required due to the buffers storage of clip portions allowing for playback until the buffer is empty.

server the one packet created is cloned to produce the necessary number of packets required to send data to the users clients requesting the packet. Using this methodology of packet creating, Multicast has the ability of transfer the packets to clients using two different transfer protocols, firstly as depicted, point-tomultipoint and then multipoint-to-multipoint which simply sends packets from a collaboration of connected servers or source destination to the necessary final destination nodes, using the same cloning technique for multipoint transmission. This transfer protocol is technically referred to as Asynchronous Transfer Mode. Generally when referring to Multicasting, we refer to IP Multicasting, which uses clients IP’s as their address for sending data packets directly to them instead of broadcasting across the entire network. The disadvantages to this model of multicasting is associated to the necessity to have a larger state diagram and analysing of the paths available within the network in order to track and maintain a connection between servers and clients than the current IP Unicast model of best effort, which works on attempting to transport packets according to it best assumption of travel and packet sizing. Due to this issue regarding the state of network required, currently multicasting is unable to handle the traffic and demand as currently being placed upon today’s system. Even though this is holding back the development currently of a commercially public wide launch of a Multicast network for Broadband users, the development of Multicast within the realms of private & virtual networks such as intranets within academic institutions and industrial networks are smaller and disassociated from the internet directly. Such developments of this are Multicast services, are referred to as Internet Relay Chat, (IRC) which is more pragmatic and scalable for a larger capacity of smaller groups of connected users. Such as MSN, Yahoo Messenger & Conference calling systems, for either ‘many-tomany’ communications or more private ‘one-to-one’ communications. Finally Protocol for Synchronous Conferencing, (PSYC) is an accommodating text based protocol for delivery of data to a compromising user base, with the application base of chat conferencing, multicasting presence, friend casting, news casting and plain instant messaging.

This cloning approach reduces bandwidth consumption, improving the bandwidth available across the network as a whole, and makes it practical to stream larger files between source and viewer at a quality matching current TV standards rather than pixelated images. Because the cloned packet lets the same data reach multiple users without casting multiple packets, it avoids the congestion that the file sizes required for smooth, high-quality streaming of live TV would otherwise cause.

Buffering takes place on the client's machine. A downloaded portion of the clip being streamed is stored until a set percentage of the clip has been received, or the buffer is full, before playback begins; the buffer then releases the stored data to the screen while continuing to capture the latest arriving packets before they are due to be played. If this works correctly, the user should not experience any delays and should have smooth viewing. Buffering also helps to overcome packets that arrive late, or earlier than required, because the stored portions of the clip allow playback to continue until the buffer is empty.

Problems only arise if the buffer cannot cope with the file sizes of the stream before it is full, or cannot hold enough data to ensure smooth playback. This is the issue with the current project: the PS2 VLC player's buffers are not large enough to store sufficient clip data before playback can begin, because the size of the transmitted streams pushes the 8 MB limit of the buffer currently incorporated in the player. The problem lies in the buffering capacity allocated between ffmpeg, the codec library controlling playback of MPEGs and streaming media, and the PS2 Linux environment. Recent tests to establish the player's working threshold, i.e. the maximum file size the buffers can handle, show that system calls to obtain further memory are either not issued or not honoured: no attempt by the PS2 Linux OS to resolve the problem is logged, only the ffmpeg codec's attempts to reduce the stream's size and resource demands so that playback can resume. Unfortunately these attempts are futile, and no further playback resumes on any of the three occasions on which the codec resizes the stream before the stream is closed. I therefore believe the only available solution is to edit the codec's buffering code, which directly controls the size of the buffer created, so that the player can support larger streams, with buffer sizes set to enable playback of streams at bit rates from at least 150 Kbps up to around 400 Kbps for video streaming.
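
Before editing the codec's buffering code, it may be worth checking how far the player's own caching options can be pushed; the following is only an illustrative sketch and assumes a VLC 0.8.x-era build in which the caching options are expressed in milliseconds (option names vary between releases).

# Ask VLC to buffer roughly three seconds of a UDP/multicast stream before playback
vlc --udp-caching=3000 udp://@239.255.12.42:1234

# Equivalent option when playing a locally stored file
vlc --file-caching=3000 video.mpg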

Media streaming follows the same principles whether video or audio is streamed over the Internet to a PC or any other device, or delivered as on-demand TV streaming. The only differences in the system arise from where the streamed data is held and how it is supplied. With on-demand services, the streaming data is stored at cold sites, already available and formatted for playback and transmission, with no need for the digitising or encoding that the transmission of live broadcasts or VoIP would require.

2.5 The Possible Solutions To Enable The Prototype To Work On A Standalone PS2
Research into the available techniques for exporting work from PS2 Linux to a standard PS2 all points to the fact that it is impossible to take a development off the Linux platform and produce a working version in a normal PS2 environment without using software or tools that are illegal or breach the copyright terms covering the PS2, such as homebrew software loaders. These require a patch that removes the PS2's security protocols in order to allow downloads to the memory card, application execution from the memory card, and plug-and-play support for external storage via USB devices.

During the course of the prototype's development, many different approaches to exporting the prototype from PS2 Linux to the PS2 have been considered, in order to establish whether they are feasible and could actually be used to create a standalone PS2 version. A selection of these ideas is detailed below; the aim is to export the prototype on a medium capable of being loaded on any PS2, with or without Linux and the hardware that accompanies the Linux Kit.

The first idea under consideration is to produce a virtual kernel stored on a memory card, which could be accessed and loaded when the PS2 boots, or on request when browsing the memory card. Together with external storage, either a CD or a USB device, the memory card program would load a virtual boot loader similar to the Linux boot sequence, except that it would bring the PS2 up to a state capable of loading a GUI. This would require loading a simple OS with no support for any application other than the media player, which would either be launched on request from an interface menu offering Internet access or the player, or started by a startup script once the boot sequence had completed. The player itself would need to be stored on an external medium, as the memory card could not hold files of that size: either a CD to store the player's executables and libraries or, a more viable option, an external USB device, which would also allow the device to be used as a medium for buffering and caching the streams played within the player, since the PS2's internal memory could not be accessed via these methods.

Another option for exporting the prototype would be to set up a PS2 + PC link. This would require the PS2 to be booted into a state in which it can receive data over a crossover network connection between the PS2 and the PC, allowing the PS2 to be used purely as the display for the media being played while the PC handles everything concerning connection, streaming, buffering and playback support. A further, possibly more viable, option would be to use the PS2's existing interface for accessing the Internet, normally used to reach its community page, and replace the destination IP with a link to the HTTP interface of a VLC player. This would give full playback support and the ability to connect to any available stream and view the relevant images and video within the HTTP interface window. However, this option would require the source CD used to load the network connection on a PS2 to include the compiled player, so that the VLC HTTP interface can be loaded.
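
To illustrate the last option, the sketch below shows how the player's built-in HTTP interface could be started on the machine serving the stream, assuming a VLC 0.8.x-era build; by default this interface listens on port 8080, and the PS2's browser would then be pointed at that host and port (the address shown is an example only).

# Start VLC with its HTTP control interface instead of a desktop GUI
vlc -I http video.mpg

# The PS2's browser would then be pointed at, for example:
#   http://192.168.0.10:8080/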

2.6 The Future
The next stage of development for the prototype lies in producing a standalone PS2 version capable of being executed and played on any PS2 in the country, allowing users to watch streaming videos as long as they have a network adapter connected to their PS2 to download the streams from any source. Following on from this, there is always the possibility of adapting the prototype to work on other consoles on the market, such as the PS3 using its internal Linux OS, the Xbox 360 using its XNA development software, and the Nintendo Wii using its development kits, rumoured to be offered for sale twelve months after the launch of the Wii console this year.

3. ACKNOWLEDGMENTS
I would like to thank and acknowledge Dr Abdennour El-Rhalibi and Dr Mark Price for their help and guidance throughout this project's current and future development.

4. REFERENCES
[1] Hartman, A. Producing Interactive TV.
[2] Follansbee, J. Get Streaming.
[3] Kunkel, T. Streaming Media.
[4] Rayburn, D. and Hoch, M. The Business of Streaming & Digital Media.
[5] http://en.wikipedia.org/wiki/Streaming_media
[6] http://www.ps2.geek.nz/site.dyn/Specs/
[7] http://www.datel.co.uk/products.asp
[8] http://ffmpeg.mplayerhq.hu/index.html
[9] http://research.scea.com/research/research.html
[10] http://www.videolan.org/doc/vlc-user-guide/en/ch06.html
[11] http://forum.videolan.org
[12] http://wiki.videolan.org/index.php/Additional_Interfaces
[13] http://wiki.videolan.org/index.php/Multicast
[14] http://www.cisco.com/univercd/cc/td/doc/cisintwk/ito_doc/ipmulti.htm
[15] http://www.videolan.org