IEEE International Conference on Data Mining and Advanced Computing (SAPIENCE 2016) 12-12-2016
DOI: 10.1109/SAPIENCE.2016.7684145
Game Balancing with Ecosystem Mechanism

Wen Xia
School of Computing, National University of Singapore, Singapore
[email protected]

Bhojan Anand
School of Computing, National University of Singapore, Singapore
[email protected]
Abstract—To adapt game difficulty to the strength of game characters, Dynamic Difficulty Adjustment (DDA) and other learning strategies have been applied in commercial game design. However, most existing approaches cannot ensure diversity in their results and rarely attempt to coordinate content generation and behaviour control together. This paper proposes a solution based on a multi-level swarm model and an ecosystem mechanism, in order to provide a more flexible way of controlling game balance.
Keywords—game balance control; particle swarm optimization; procedural content generation
I. INTRODUCTION
Behaviour control, Procedural Content Generation (PCG) and game balancing play important roles in modern game AI system design. Behaviour control enables computer-controlled elements to appear to make smart decisions when the game offers multiple choices for a given situation [1]. PCG refers to the creation of content automatically through algorithmic means [2]. Game balancing is the fine-tuning phase in which a functioning game is adjusted to be deep, fair, and interesting [3]. As quality requirements in the game industry rise rapidly, notable techniques have been applied successively, such as the Radiant AI used in The Elder Scrolls V: Skyrim (Bethesda Softworks, 2011) [4] and the neuro-evolutionary training in Supreme Commander 2 (Square Enix, 2010) [5].
In order to enhance player experience by adjusting game difficulty levels [6], several studies have been conducted in the last decade [7]-[11]. One of the most effective strategies is dynamic difficulty adjustment (DDA), which numerically [7] or structurally [8] adjusts game objects based on an understanding of the players. However, when the game world is complex, it is not easy to map player preferences and pleasures to content, gameplay, or genres from a player-log [16]. Besides, when a game needs to evolve both behaviours and contents, handling potential inconsistencies among the different learning processes places an extra burden on developers.
Ecosystem research is an integrated study of producers, consumers, and decomposers in relation to climate, topography, and soils in space and time [14]. One of the key rules of an ecosystem is "survival of the fittest".
The idea of this paper is inspired by the following two similarities between game systems and ecosystems:
• Diversity of agents.
• Most of the agents only need simple behaviours and short-term goals.
We propose an online learning strategy based on Particle Swarm Optimization (PSO) in a multi-level structure, which contains the functionalities of both a behaviour controller and a content generator. It can be easily designed and managed by game developers, and it generates diversified results. The remaining sections of this paper are arranged as follows: Section II discusses related work on DDA techniques, PSO algorithms and PSO applications in games. Section III explains the methodology of our design. Section IV describes the implementation experiments. Conclusions are presented in Section V.
II. RELATED WORKS

A. DDA Techniques
DDAs pursue either of two goals:
• Adjust behaviour rules for game objects.
• Generate adaptive game content.
We discuss them separately.
1) Behaviour Control
Dynamic scripting controls the behaviours of an agent by adjusting the weight values of the probabilities of choosing a rule set from rule-bases [6, 17, 18]. It is effective and robust, but lacks diversity because the result tends to be static and convergent [19]. In addition, dynamic scripting fails at some points because the scripts are built from a rule database hard-coded by the developers, which limits how far the AI system can progress [36]. Artificial Neural Networks (ANNs) are inspired by the mechanism of brains. They are often very effective, but can be computationally expensive, especially for more complex networks [20]. Most neural algorithms cannot easily be adapted incrementally and generally require complete retraining online [37]. In multi-objective evolution, ANNs depend heavily on reliable fitness information, which is a problem when evaluations are noisy [21]. The Genetic Algorithm (GA) is inspired by genetic mutation in biology. It must restart evolution from the very beginning if new traits
are added into consideration [22]. Although GA and PSO on average yield similar effectiveness (solution quality), GA is less computationally efficient [42].
2) PCG
Experience-driven PCG aims at greater variability, reliability and quality, as well as making the generation process more controllable [2]. The results show that this approach can generate fairly accurate content for the tested players [2, 23]. However, there are still several open problems, such as interruption of the gameplay experience during data collection and optimization of the algorithms for personalized content generation [27]. Search-based PCG takes another approach by defining an evaluation function for candidate content and gradually choosing proper content instead of simply accepting or rejecting it [24]. Although substantial progress has been made [25, 26], this technique still faces a number of challenging problems such as content representation, content space management, content quality evaluation, content generation efficiency and so on [24, 27]. Building on the results of experience-driven and search-based PCG, learning-based PCG suggests a three-stage framework which includes a development stage, a public test stage and an adaptive stage [27]. However, developers need to annotate games in data-driven methods. This inevitably involves developers' effort, and it is non-trivial for developers to choose the correct content features that can elicit different affective/cognitive experiences for players of different types [27].
B. PSO and Its Applications in Games
Particle swarm optimization (PSO) is a group-based stochastic optimization algorithm originally introduced by Kennedy and Eberhart [12, 13]. It focuses on finding an optimal solution from an individual's own experience as well as the knowledge of what the neighbours have obtained [15]. A candidate solution in PSO is regarded as a particle, which has its own position (state) and velocity (displacement), as well as the ability to gain information from neighbours (local or global, decided by the topology used). Because of its efficiency in solving some complex optimization problems, PSO is considered a robust and popular technique. However, PSO has a highly centralized pattern in which swarm diversity decreases very quickly [28]. This often results in premature convergence if the global best particle corresponds to a local optimum. Besides, PSO is easily trapped in evolutionary stagnation if the population size is too small, or converges inefficiently if the population size is too large [29]. To solve these problems, many advanced topologies have been researched, such as Multi-Swarm Optimization (MSO) and its variants [30, 32, 33], Waves of Swarm Particles (WoSP) [31], and the hierarchical swarm optimization model and its variants [45, 46]. PSOs are widely used in games for estimating a player's position [34], path planning [35, 38], board game AI evolution [39, 40], Iterated Prisoner's Dilemma (IPD) game strategy improvement [41], etc. In some cases, PSOs are used as the
training method for ANN models. For example, in [43] a hybrid PSO-ANN is used to place tower-defence cannons on the map, and in [44] PSO is used to train ANN controllers so that the character actions of two teams perform in a balanced way. In summary, we conclude that so far there exist very few game AI system designs that consider both behaviour-wise and content-wise learning functionalities together. Most studies apply learning to play-logs, such as average health [6], death rate [9], shooting accuracy and so on. Deciding how detailed a play-log should be is not easy, while categorizing these data and responding to them correctly in the game may cause even more workload. There are few effective methods that show satisfactory diversity in results. We therefore aim to build a system that evolves behaviour and content together, using an approach that is easier to design than play-log based methods while showing diversity in its results.
III. SYSTEM DESIGN
We assume that the game objects have simple actions and are large in number. For games where the game objects have very complex behaviour logic or are few in number, our approach may not be applicable. There are three primary goals for our design:
• Contain behaviour control and PCG within one architecture.
• Be goal-oriented, which means there is no need to map the detailed relations between actions and contents.
• For the same conditions, maintain diversity by showing different results in both behaviours and contents.
The system is search-based. It is inspired by the natural hierarchical system mentioned in [46], but instead of building levels upon actual agents, we build them upon functionalities. The core structure of our system is illustrated in Fig. 1. In one Multi-Level Swarm Model (MLSM), there are two levels. Level 1 aims at behaviour control. It contains all Ecosystem Implementation Models (EIMs). One EIM is considered as one learning swarm which contains one type of game object, and one game object acts as one particle of the corresponding swarm. An EIM has its own performance measure rules and goals, and it evolves its behaviours from all available actions in the action base that are assigned to the corresponding game object. Level 2 is the EIM manager, which aims at content generation. The whole EIM manager is considered as one swarm, where different particles correspond to different combinations of game object numbers and types. The two levels learn alternately through a scheduled flow process, illustrated in Fig. 2. In Fig. 2, one learning cycle of level 1 is one iteration, and one learning cycle of level 2 is one generation. Iterations are sub-sections of generations. n refers to the number of iterations that one generation contains, and k refers to the
threshold that triggers the first learning cycle of the EIM manager. Both n and k are set manually, and their values are influenced
by how stochastic and complex the game environment is.
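To make the alternating schedule concrete, the sketch below shows one way the two-level loop could be organized in code. It is our own illustration: the class and function names (EIM, EIMManager, run_test_case) are assumptions rather than details taken from the paper, and the learning step inside each component is the standard PSO update given later in this section.

```python
# Hypothetical sketch of the two-level MLSM schedule; names are ours.
from typing import List

class EIM:
    """Level 1: one swarm per game-object type (particles = behaviour weights)."""
    def run_iteration(self) -> None:
        # Evaluate every game object against this EIM's own performance
        # measure and goal, then apply one PSO step to its behaviour weights.
        ...

class EIMManager:
    """Level 2: one swarm whose particles are combinations of object numbers/types."""
    def evolve(self) -> None:
        # Apply one PSO step to the content configuration (numbers and types
        # of game objects), using the balance error observed this generation.
        ...

def run_test_case(eims: List[EIM], manager: EIMManager,
                  generations: int, n: int, k: int) -> None:
    """Alternate level-1 iterations and level-2 generations.

    n: iterations contained in one generation.
    k: generation threshold at which the EIM manager's first learning cycle triggers.
    """
    for generation in range(1, generations + 1):
        for _ in range(n):            # level 1: behaviour control
            for eim in eims:
                eim.run_iteration()
        if generation >= k:           # level 2: content generation
            manager.evolve()
```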
Fig. 1. Multi-level swarm model (MLSM)
Fig. 2. The learning flow of MLSM
The learning algorithm that we use to train each individual component is the standard PSO update [15]:

v_i(t+1) = ω·v_i(t) + c1·r1·(p_i(t) − x_i(t)) + c2·r2·(p_g(t) − x_i(t))
x_i(t+1) = x_i(t) + v_i(t+1)

where t is the current time step, i denotes the individual, g denotes the best neighbour, v is the velocity applied in the next time step, p is the best state found so far, x is the current state, ω is a constant inertia weight equal to 0.729 [46], c1 and c2 are the cognitive and social weights, both equal to 1.496 [46], and r1 and r2 are positive random numbers drawn from 0 to 1. The goal of each learning component must be kept within the corresponding available boundaries. This restriction suits the practical needs of game development, where developers want to retain control over how strong a game object should be.
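As an illustration only, a minimal NumPy sketch of this update follows; the array shapes, the function name pso_step and the clamping of states to the allowed boundaries are our own assumptions rather than details given in the paper.

```python
import numpy as np

OMEGA, C1, C2 = 0.729, 1.496, 1.496   # inertia and acceleration constants from [46]

def pso_step(x, v, p_best, g_best, lower, upper, rng=None):
    """One standard PSO update for a swarm of particles.

    x, v         : (num_particles, dims) current states and velocities
    p_best       : (num_particles, dims) best state found by each particle so far
    g_best       : (dims,) best state found by the neighbourhood/swarm
    lower, upper : per-dimension boundaries the evolved goals must stay within
    """
    rng = rng or np.random.default_rng()
    r1 = rng.random(x.shape)              # positive random numbers in [0, 1)
    r2 = rng.random(x.shape)
    v = OMEGA * v + C1 * r1 * (p_best - x) + C2 * r2 * (g_best - x)
    x = np.clip(x + v, lower, upper)      # keep states inside the allowed boundaries
    return x, v
```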
IV. IMPLEMENTATION EXPERIMENTS

A. Game Scenario
The tests are conducted in a customized 2D shooting game scenario developed with the Unity 3D engine. In the scenario, the player is attacked by different kinds of enemies. The battleground size is fixed. The re-spawning locations of all enemies are randomized within the battlefield at the start of each game round. One game round is considered as one learning generation. The balancing goal of the enemies is to attack the player such that, when a game round ends, the player's HP becomes exactly 0. The positive or negative remaining HP is calculated as the error, expressed as a percentage of the player's maximum HP. A sample screen capture is shown in Fig. 3. In the Appendix, Table I lists the settings of the game environment, Table II the action settings of the game objects, and Table III the test case settings.
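As a small illustration of the balance target, the following sketch computes such an error; the function name and the treatment of overkill as a negative remainder are our own assumptions, since the paper only states that the signed remaining HP is expressed as a percentage of the maximum HP.

```python
def balance_error(remaining_hp: float, max_hp: float) -> float:
    """Signed balance error at the end of one game round (one generation).

    0 means the enemies drained the player's HP exactly as the round ended
    (the balance goal); positive values mean the player survived with HP to
    spare; negative values would represent overkill beyond 0 HP.
    """
    return 100.0 * remaining_hp / max_hp

# Example: a player with 400000 maximum HP ending a round at 12000 HP
# gives an error of +3%, inside the -5% to 5% band reported in Section IV.C.
print(balance_error(12_000, 400_000))   # 3.0
```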
Fig. 3. Screen capture of testing game scenario
B. Testing Goals
When designing the test cases, we considered the following two objectives:
1) Adaptation: We set different maximum HP and armor values for the player, in order to show that our approach is able to adapt when the properties of the player change.
2) Diversification: We need to show that the resulting behaviours and content types will not always be similar if we repeat the same test cases.
C. Adaptation Results
We repeated every test case three times, as shown in Fig. 4 (labeled A, B and C), where each case has 11 learning generations. From test case 1 to test case 3, the armor of the player increases. From test case 4 to test case 6, the maximum HP of the player increases. The results show that the starting errors of all tests are stochastic, but the errors drop significantly during the first few generations. Except for tests 1B, 3B and 4C, all others become steady within the range of -5% to 5% after the 8th learning generation.
Fig. 4. Adaptation results of the game balance goal
For the less satisfactory tests 3B and 4C, although the fluctuation appears unstable, the overall trend of the errors is still decreasing, and their results at generation 11 are acceptable. For test 1C, however, early convergence can be observed, and we consider this test a failure. Both the unstable fluctuation and the early convergence are caused by the standard PSO learning algorithm that we use. Strategies to improve this learning algorithm are outside the scope of this paper; they have been researched by many existing works mentioned in Section II. Besides, the results also show that when the armor and maximum HP of the player character increase, the error fluctuation becomes more stable but the learning speed becomes slower. This happens because, when the strength of the player increases, each action of the game objects has less influence on the total outcome. As a result, with the same
amount of weight adjustment, the learner brings smaller changes to the errors.
D. Diversification Results
The diversification results are presented in two parts:
• Game object type diversity (content generation).
• Game object behaviour diversity (behaviour control).
The results show good diversity on both sides. For example, in the content generation results of test cases 4, 5 and 6, when the same test cases were repeated, the numbers of each type of game object were different, as shown in Fig. 5. The behaviour control values of test case 1A are depicted in Fig. 6 (x, y and z correspond to the weights that control game object actions) as an example of behaviour diversity.
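The paper does not spell out how the weights x, y and z map onto concrete actions; purely as a hypothetical illustration, one plausible reading is that each game object treats its weights as relative preferences over its three actions (Table II) when deciding what to do next.

```python
import random

def choose_action(weights, actions=("action_1", "action_2", "action_3"),
                  rng=random.Random()):
    """Pick one of the object's three actions with probability proportional
    to its (non-negative) behaviour weight. Purely illustrative; the paper
    does not specify this mechanism."""
    return rng.choices(actions, weights=weights, k=1)[0]

# Example: an object whose evolved weights favour its strongest attack.
print(choose_action([0.1, 0.3, 0.6]))
```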
Fig. 5. Diversification results in content generation
Fig. 6. Diversity result in behaviour control
V. CONCLUSION AND FUTURE WORKS
In this paper, based on the standard PSO algorithm and an ecosystem mechanism, we proposed a multi-level swarm structure that can achieve game balance adaptation in both content generation and behaviour control, while presenting a certain degree of diversity in the results. This approach is easier to organize and modify than most existing play-log based works, because it does not require developers to figure out feature mapping rules. It only requires developers to ensure that the goals lie within the evolution boundaries. The experiments show satisfactory results in most of the adaptation tests and good results in all diversification tests. The system is able to adapt towards the balance goal when the strength of the player's character changes. Based on our observations, the system can be improved further by solving the following problems, which we defer to future work:
• Our approach cannot handle situations where aggregation is required; we cannot aggregate different goals for one object.
• Similar to many existing GA and PSO approaches, the system has an extensibility problem and requires retraining when new rules are added. The training time and error could be reduced by using more advanced training algorithms than standard PSO, or by applying different inertia weight strategies rather than a constant value.
• If a game object only interacts with the player's character for a short time, it might obtain imprecise feedback. A possible solution is to disable its learning mechanism while the object is isolated.
• The current experiments only adapt to property changes of the player's character. We plan to consider other parameters, such as changes in the player's skill proficiency, in future experiments.
APPENDIX

TABLE I. THE COMMON SETTINGS OF ALL TEST CASES
Generation Learning Start Threshold (k): 3
Number of Iterations per Generation (n): 10
Time Period per Iteration: 10 s
Number of Generations per Test Case: 11
Number of Enemy Types: 3
Number of Action Types per Enemy: 3
Range of Number of Particles per Individual EIM: 4 to 12
Number of Particles in EIM Manager: 3

TABLE II. THE ACTION SETTINGS OF ALL GAME OBJECTS
(columns: Action 1 Damage / Action 2 Damage / Action 3 Damage / Interval Period Range per Hit)
Bot: 2 / 7 / 15 / 0.2 to 3 s
Ghost: 3 / 11 / 31 / 0.2 to 3 s
Succubus: 5 / 15 / 53 / 0.2 to 3 s

TABLE III. THE SETTINGS OF TEST CASES
(columns: Player Maximum HP / Player Armor)
Test Case 1A to 1C: 400000 / 0
Test Case 2A to 2C: 400000 / 1
Test Case 3A to 3C: 400000 / 2
Test Case 4A to 4C: 500000 / 1
Test Case 5A to 5C: 600000 / 1
Test Case 6A to 6C: 700000 / 1
REFERENCES
[1] Schwab, B. (2009). AI game engine programming. Cengage Learning, pp. 2-31, 261, 303-315, 335-336.
[2] Yannakakis, G. N., & Togelius, J. (2011). Experience-driven procedural content generation. Affective Computing, IEEE Transactions on, 2(3), 147-161.
[3] Jaffe, A. B. (2013). Understanding Game Balance with Quantitative Methods (Doctoral dissertation).
[4] Matt Bertz. The Technology Behind The Elder Scrolls V: Skyrim. January 17, 2011. Retrieved from: http://www.gameinformer.com/games/the_elder_scrolls_v_skyrim/b/xbox360/archive/2011/01/17/the-technology-behind-elder-scrolls-v-skyrim.aspx
[5] Yannakakis, G. N. (2012, May). Game AI revisited. In Proceedings of the 9th conference on Computing Frontiers (pp. 285-292). ACM.
[6] Arulraj, J. J. P. (2010, September). Adaptive agent generation using machine learning for dynamic difficulty adjustment. In Computer and Communication Technology (ICCCT), 2010 International Conference on (pp. 746-751). IEEE.
[7] Yannakakis, G. N., & Maragoudakis, M. (2005). Player modeling impact on player's entertainment in computer games. In User Modeling 2005 (pp. 74-78). Springer Berlin Heidelberg.
[8] Booth, M. (2009). The AI Systems of Left 4 Dead. Keynote, Fifth Artificial Intelligence and Interactive Digital Entertainment Conference (AIIDE '09). Stanford, CA. October 14-16, 2009.
[9] Conroy, D., Wyeth, P., & Johnson, D. (2011, November). Modeling player-like behaviour for game AI design. In Proceedings of the 8th International Conference on Advances in Computer Entertainment Technology (p. 9). ACM.
[10] Jennings-Teats, M., Smith, G., & Wardrip-Fruin, N. (2010, June). Polymorph: dynamic difficulty adjustment through level generation. In Proceedings of the 2010 Workshop on Procedural Content Generation in Games (p. 11). ACM.
[11] Jaffe, A. B. (2013). Understanding Game Balance with Quantitative Methods (Doctoral dissertation).
[12] Eberhart, R. C., & Kennedy, J. (1995, October). A new optimizer using particle swarm theory. In Proceedings of the sixth international symposium on micro machine and human science (Vol. 1, pp. 39-43).
[13] Kennedy, J., & Eberhart, R. (1995, November). Particle swarm optimization. In Neural Networks, 1995. Proceedings., IEEE International Conference on (Vol. 4, pp. 1942-1948). IEEE.
[14] Specht, R. L. (2011). Development of ecosystem research. ISRN Ecology, 2011.
[15] Kennedy, J., Kennedy, J. F., Eberhart, R. C., & Shi, Y. (2001). Swarm intelligence. Morgan Kaufmann, pp. 290-296.
[16] Charles, D., Kerr, A., McNeill, M., McAlister, M., Black, M., Kcklich, J., ... & Stringer, K. (2005, June). Player-centred game design: Player modelling and adaptive digital games. In Proceedings of the Digital Games Research Conference (Vol. 285, p. 00100).
[17] Spronck, P. H. M. (2005). Adaptive game AI. UPM, Universitaire Pers Maastricht.
[18] Spronck, P., Ponsen, M., Sprinkhuizen-Kuyper, I., & Postma, E. (2006). Adaptive game AI with dynamic scripting. Machine Learning, 63(3), 217-248.
[19] Szita, I., Ponsen, M., & Spronck, P. (2009). Effective and diverse adaptive game AI. Computational Intelligence and AI in Games, IEEE Transactions on, 1(1), 16-27.
[20] Simpson, R. (2012). Evolutionary Artificial Intelligence in Video Games.
[21] Schrum, J., & Miikkulainen, R. (2008). Constructing Complex NPC Behaviour via Multi-Objective Neuroevolution. AIIDE, 8, 108-113.
[22] Michael Martin. Using a Genetic Algorithm to Create Adaptive Enemy AI. August 30, 2011. Retrieved from: http://www.gamasutra.com/blogs/MichaelMartin/20110830/90109/Using_a_Genetic_Algorithm_to_Create_Adaptive_Enemy_AI.php
[23] Shaker, N., Yannakakis, G. N., & Togelius, J. (2010, October). Towards Automatic Personalized Content Generation for Platform Games. In AIIDE.
[24] Togelius, J., Yannakakis, G. N., Stanley, K. O., & Browne, C. (2011). Search-based procedural content generation: A taxonomy and survey. Computational Intelligence and AI in Games, IEEE Transactions on, 3(3), 172-186.
[25] Liapis, A., Yannakakis, G. N., & Togelius, J. (2011, August). Neuroevolutionary constrained optimization for content creation. In Computational Intelligence and Games (CIG), 2011 IEEE Conference on (pp. 71-78). IEEE.
[26] Liapis, A., Yannakakis, G. N., & Togelius, J. (2012, July). Limitations of choice-based interactive evolution for game level design. In Proceedings of AIIDE Workshop on Human Computation in Digital Entertainment.
[27] Roberts, J., & Chen, K. (2015). Learning-Based Procedural Content Generation. Computational Intelligence and AI in Games, IEEE Transactions on, 7(1), 88-101.
[28] Li, F., & Guo, J. (2014). Topology Optimization of Particle Swarm Optimization. In Advances in Swarm Intelligence (pp. 142-149). Springer International Publishing.
[29] Zhao, X., Liu, Z., & Yang, X. (2014). A multi-swarm cooperative multistage perturbation guiding particle swarm optimizer. Applied Soft Computing, 22, 77-93.
[30] Röhler, A. B., & Chen, S. (2012, June). Multi-swarm hybrid for multi-modal optimization. In Evolutionary Computation (CEC), 2012 IEEE Congress on (pp. 1-8). IEEE.
[31] Hendtlass, T. (2005, September). WoSP: a multi-optima particle swarm algorithm. In Evolutionary Computation, 2005. The 2005 IEEE Congress on (Vol. 1, pp. 727-734). IEEE.
[32] Xu, X., Tang, Y., Li, J., Hua, C., & Guan, X. (2015). Dynamic multi-swarm particle swarm optimizer with cooperative learning strategy. Applied Soft Computing, 29, 169-183.
[33] Niu, B., Zhu, Y., He, X., & Wu, H. (2007). MCPSO: A multi-swarm cooperative particle swarm optimizer. Applied Mathematics and Computation, 185(2), 1050-1062.
[34] Bererton, C. (2004, July). State estimation for game AI using particle filters. In AAAI workshop on challenges in game AI.
[35] Ma, Q., & Lei, X. (2009). Application of improved particle swarm optimization algorithm in UCAV path planning. In Artificial Intelligence and Computational Intelligence (pp. 206-214). Springer Berlin Heidelberg.
[36] Hoekstra, C. Adaptive Artificially Intelligent Agents in Video Games: A Survey. UNIAI-06.
[37] Charles, D., & McGlinchey, S. (2004). The past, present and future of artificial neural networks in digital games. In Proceedings of the 5th international conference on computer games: artificial intelligence, design and education. The University of Wolverhampton (pp. 163-169).
[38] Yan, X., Wu, Q., Hu, C., Yao, H., Fan, Y., Liang, Q., & Liu, C. (2014). Robot Path Planning based on Swarm Intelligence. International Journal of Control and Automation, 7(7), 15-32.
[39] Abdelbar, A. M., Ragab, S., & Mitri, S. (2004, July). Coevolutionary particle swarm optimization applied to the 7×7 Seega game. In Neural Networks, 2004. Proceedings. 2004 IEEE International Joint Conference on (Vol. 1). IEEE.
[40] Conradie, J., & Engelbrecht, A. P. (2006, May). Training bao game-playing agents using coevolutionary particle swarm optimization. In Computational Intelligence and Games, 2006 IEEE Symposium on (pp. 67-74). IEEE.
[41] Wang, X., & Lin, Y. (2012). PSO Algorithm for IPD Game. International Journal of Computer Science & Information Technology, 4(4), 23.
[42] Hassan, R., Cohanim, B., De Weck, O., & Venter, G. (2005, April). A comparison of particle swarm optimization and the genetic algorithm. In Proceedings of the 1st AIAA multidisciplinary design optimization specialist conference (pp. 1-13).
[43] Huo, P., Shiu, S. C. K., Wang, H., & Niu, B. (2009). Case indexing using PSO and ANN in real time strategy games. In Pattern Recognition and Machine Intelligence (pp. 106-115). Springer Berlin Heidelberg.
[44] Fang, S. W., & Wong, S. K. (2012). Game team balancing by using particle swarm optimization. Knowledge-Based Systems, 34, 91-96.
[45] Janson, S., & Middendorf, M. (2005). A hierarchical particle swarm optimizer and its adaptive variant. Systems, Man, and Cybernetics, Part B: Cybernetics, IEEE Transactions on, 35(6), 1272-1282.
[46] Chen, H., Zhu, Y., Hu, K., & He, X. (2010). Hierarchical swarm model: a new approach to optimization. Discrete Dynamics in Nature and Society, 2010.
[46] Kentzoglanakis, K., & Poole, M. (2009). Particle swarm optimization with an oscillating inertia weight. ACM.