In-Game Action List Segmentation and Labeling in Real-Time Strategy Games
Wei Gong, Ee-Peng Lim, Palakorn Achananuparp, Feida Zhu, David Lo and Freddy Chong Tat Chua
Abstract—In-game actions of real-time strategy (RTS) games are extremely useful for determining players’ strategies, analyzing their behaviors, and recommending ways to improve their playing skills. Unfortunately, unstructured sequences of in-game actions are hardly informative enough for these analyses. The inconsistency we observed in human annotation of in-game data makes the analytical task even more challenging. In this paper, we propose an integrated system for in-game action segmentation and semantic label assignment based on a Conditional Random Fields (CRFs) model with essential features extracted from the in-game actions. Our experiments demonstrate that the accuracy of our solution can be as high as 98.9%.
I. INTRODUCTION
Real-time strategy (RTS) games are a popular online game genre in which players gather resources to train army units and build bases to fight against opponents. To maximize the chance of winning an RTS game, one needs to learn the right strategies. Hence, it is no surprise that many players turn to game commentaries that explain the dos and don’ts of game playing, using previously recorded games for illustration. For example, Sean ‘Day9’ Plott, a famous commentator of the highly popular RTS game StarCraft II¹, has attracted over 45 million views on his YouTube channel². From the RTS game developer’s perspective, to attract and retain players, it is equally important to help players understand the games and learn the right strategies to play well. Today’s RTS games, however, have only limited capacity to offer such assistance beyond a good help menu and user guide. The best way for players to learn the games is still trial and error, a potentially tedious and time-consuming process.
Fortunately, recent advances in database systems and collaborative websites have made it possible to record massive amounts of data generated by real RTS games (often known as game replays) and to share them on game data repository websites. One can then apply data analytics to discover game strategies and to evaluate their effectiveness in improving gaming performance and experience. The knowledge gained can further be used to guide individual players to play better and to help game developers design better games.
Wei Gong, Ee-Peng Lim, Palakorn Achananuparp, Feida Zhu, David Lo and Freddy Chong Tat Chua are with the School of Information Systems at Singapore Management University, 80 Stamford Road, Singapore.
¹ http://sea.blizzard.com/games/sc2/
² http://www.youtube.com/user/day9tv/
In this paper, we analyze the in-game actions of a popular RTS game, StarCraft II: Wings of Liberty, developed and released by Blizzard Entertainment. StarCraft II (or SC2) is selected because (1) the popularity of SC2 makes it easy to find highly active players for the purpose of human annotation; (2) SC2 replay files are publicly available for download from several websites; (3) the highly complicated gameplay in SC2 has given rise to a wealth of sophisticated strategies from its players; and (4) compared with the original StarCraft³, SC2 is a newer game with a greater variety of gameplay mechanics. Hence, the game is not yet fully understood by the players and the gameplay strategies are still evolving, which makes SC2 more interesting and challenging to analyze. Although this paper focuses on SC2 only, the same problem and the proposed techniques are applicable to other popular RTS games such as League of Legends⁴ and DOTA2⁵.
A replay file in SC2 records, in temporal order, all actions performed by the players, including mouse movements and keyboard inputs. These mouse and keyboard actions are recorded as atomic actions, such as training a unit, selecting a building, or attacking some facility. The sequence of timestamped actions performed by one player in a game is called an action list. While unstructured action lists are hardly adequate to reveal the player’s intention and strategy, the task becomes much easier when consecutive atomic actions are grouped into logical segments with meaningful semantic labels. Our objective is thus to partition each SC2 action list into segments and to give each segment a label that represents the collective meaning of its actions. Knowing these action segments and their semantic labels gives us an idea of how the player executes her strategy in the game.
Formally, given an action list $L = (a_1, a_2, \ldots, a_m)$ with $m$ actions performed by one player, a segmentation $S$ of $L$ is defined as $k$ non-overlapping segments, denoted as $S = \{s_1, s_2, \ldots, s_k\}$, where for $1 \le i \le k$, $s_i = (a_{b_i}, a_{b_i+1}, \ldots, a_{b_i+l_i})$, $1 \le b_i < b_{i+1} \le m$, and $\sum_{i=1}^{k} l_i = m - k$. Segment $s_i$ therefore contains $l_i + 1$ actions, and the last condition states that the $k$ segments together cover all $m$ actions. Let $A = \{\alpha_1, \alpha_2, \ldots, \alpha_h\}$ be a set of $h$ unique labels. Our problem is to find all the segments $s_1, s_2, \ldots, s_k$ and to assign a label $\alpha_{s_i} \in A$ to each segment $s_i$, $1 \le i \le k$. In this problem formulation, we assume that the label set $A$ is given. To obtain this label set, we need some expert knowledge about the SC2 game, which is covered in Section III-B.
³ http://sea.blizzard.com/games/sc/
⁴ http://lol.garena.com/playnow.php/
⁵ http://www.dota2.com/
The task of segmenting and labeling sequential data has been studied in many fields, including bioinformatics [6], natural language processing [2] and speech recognition [14]. The segmentation techniques include Hidden Markov Models (HMMs) [15], Maximum Entropy Markov Models (MEMMs) [11], and Conditional Random Fields (CRFs) [9]. Since CRFs are known to outperform both HMMs and MEMMs on a number of real-world sequence segmentation and labeling tasks [9], [13], we use CRFs to automatically segment and label game action lists in our framework.
Several challenges are unique to this in-game action list segmentation and labeling task. Firstly, to the best of our knowledge, there are no publicly available labeled action lists that can be used for training. The manual labeling process takes much time and effort, and the labeling agreement between annotators in our experiment turned out to be far below what we consider usable. Secondly, the noise level in the raw in-game action lists is very high, which prevents accurate segmentation and labeling. Our experiments show that 80% of the actions in the action lists can be considered spam. Spam actions are generated in various ways, such as accidental keypresses and the repeated use of trivial game commands to inflate personal statistics, e.g., the number of actions performed per minute. Finally, we need to identify the features to be used in segmentation and labeling. None of these challenges has been studied for the segmentation task in the past.
In our literature survey, we found several prior works on RTS games that focus on discovering build order based strategies [20], [21] and incorporating them into artificial intelligence (AI) game agents [12], [22]. However, these works make assumptions about the way the games will be played and focus only on a small subset of in-game actions (the build order).
There is relatively little work on mining game strategies from the entire set of in-game actions and using the mined strategies to explain game matches. In [5], action list segmentation and labeling was also studied, but the work focused on a fixed-length segmentation approach over the in-game actions related to build order. That work assumes equal-length segments of 30 seconds each and applies HMMs to learn the segment labels. Unlike our work, this segmentation method is not data-driven and does not consider human-annotated ground truth in training and evaluation.
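The formal definition of a segmentation given earlier in this section can be checked mechanically. The following is a minimal Python sketch, not part of the proposed system: the names `Segment` and `is_valid_segmentation` are hypothetical, and the code assumes 1-based start indices $b_i$, with segment $s_i$ holding $l_i + 1$ actions.

```python
from typing import List, Tuple

# Hypothetical representation of a segment s_i as the pair (b_i, l_i):
# b_i is the 1-based index of its first action, and it holds l_i + 1 actions.
Segment = Tuple[int, int]


def is_valid_segmentation(m: int, segments: List[Segment]) -> bool:
    """Return True if the segments partition actions 1..m without gaps or overlaps."""
    if not segments:
        return m == 0
    expected_start = 1
    for b, l in segments:
        # Each segment must begin right after the previous one ends.
        if b != expected_start or l < 0:
            return False
        expected_start = b + l + 1
    covers_all = expected_start == m + 1
    # Implied by the checks above, but kept to mirror the condition sum(l_i) = m - k.
    length_sum_ok = sum(l for _, l in segments) == m - len(segments)
    return covers_all and length_sum_ok


# Eight actions split into segments covering actions 1-4, 5-6 and 7-8.
print(is_valid_segmentation(8, [(1, 3), (5, 1), (7, 1)]))
# A gap between action 4 and action 6 makes the segmentation invalid.
print(is_valid_segmentation(8, [(1, 3), (6, 2)]))
```

Storing each segment as a start index plus an extra length keeps the sketch close to the notation of the definition; an equivalent design would store explicit start and end indices.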
A. Contributions
The in-game action segmentation and labeling task is novel. By identifying and addressing the associated challenges, we contribute to game analytics research as follows:
1) We propose a framework (shown in Figure 1) to solve the action list segmentation and labeling problem. This framework has two phases, namely model training and model application. In model training, we collect raw action lists, recruit annotators to segment and label the action lists so as to obtain the training data, extract representative features from the training data, and train a CRF model. In model application, features are extracted from a new action list, which is then segmented and labeled by the trained CRF model.
2) We have developed a prototype annotation tool known as Labelr⁶ to collect segmentation and labeling data from SC2 players so as to derive the ground truth data.
3) We have devised a simple heuristic to filter spurious actions from the action lists. We show that our trained CRF model achieves high accuracy in the segmentation and labeling task.
B. Outline of Paper
The outline of this paper is as follows. Section II describes the in-game data of SC2. Section III introduces the dataset we collected. Section IV describes our proposed CRF-based segmentation and labeling method. Section V presents our experimental results. Section VI concludes by summarizing our work and presenting future work.
II. OVERVIEW OF STARCRAFT II
A. StarCraft II Mechanics
SC2 is a military science fiction RTS game developed by Blizzard Entertainment. In SC2, players observe the game from a top-down perspective. They are required to collect resources, develop technologies, build up their army, and use it to destroy the opposing player’s forces. When a game starts, players choose to play as one of three unique races: Terran, Protoss, or Zerg. Each race is designed to be played differently, with its own sets of units and buildings.
Once the game starts, players issue commands to the units and buildings under their control in real time through mouse clicks and keyboard shortcuts. This is different from turn-based games, such as chess, where players take turns to move their pieces.
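Fig. 1. Framework for our approach. Model training: action lists -> human annotation -> training data -> feature extraction -> training -> CRF model. Model application: new action list -> feature extraction -> CRF model -> segmented and labeled action list.
Fig. 2. Objects in SC2 (node labels: Resource with Mineral and Gas; Producible; Unit with Worker, Combat, and Production; Building).
⁶ http://202.161.45.174/sc2annotation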
Figure 2 shows the types of objects in SC2. There are two kinds of resources: minerals and gas. Minerals are the basic form of currency required for every training, building, researching, and upgrading action. Gas is the rarer form of currency, used to train and construct more advanced units and buildings and to upgrade technology. There are three types of units: worker, combat, and production units. Workers are responsible for collecting resources and constructing buildings. Although they can also attack other units, they are much weaker than regular combat units. Combat units make up the players’ main army. Each combat unit has its own strengths and weaknesses against different types of units. Production units, such as the Zerg larva, are a special type of unit: they are static and are used only to produce other units. Buildings are mainly used as production facilities to train different units. Some buildings can also be used to upgrade certain technologies, improving the efficiency of the economy and the effectiveness of combat units.
The mechanics of SC2 require players to perform both micro-management (micro) and macro-management (macro) actions. Micro actions involve the tactical movement, positioning, and maneuvering of individual units or groups of units to maximize the damage dealt to opponents and minimize the damage received. Macro actions, on the other hand, involve improving and maintaining the player’s overall economic and military profile. These include the constant production of worker and combat units, upgrading technology levels, and so on. Highly skilled players are typically those who multitask effectively between micro-management and macro-management.
The competitive online multiplayer mode is what makes SC2 a highly popular game.
In this mode, players can find other human players near their skill level to compete against using Battle.net, the official online multiplayer platform developed by Blizzard. At the end of each multiplayer game, the players’ performance is evaluated automatically by Battle.net and skill ratings are awarded to the players. As in chess, players’ skills are categorized into different leagues. The leagues, ranked from the lowest to the highest, are Bronze, Silver, Gold, Platinum, Diamond, Master, and Grandmaster. Battle.net’s matchmaking system tries to pair players of comparable skill levels to play against each other.
In this paper, we define a game match to be an instance of an SC2 game played by at least two human players, such that the match ends with a win or loss outcome for each player. As shown in Figure 3, when a game is over, all game match data can be saved as a replay. The replay file records everything that the SC2 in-game renderer needs to replay the corresponding game. Therefore, the replay file logs rich information about the game, such as chat, map name, players’ names, and all actions performed by each player from the time the game starts until it ends. After collecting the replay files, we can use replay parsers such as phpsc2replay [19] and sc2gears [1] to extract all the action data in them.
Fig. 3. Collecting replays and action lists (players -> game -> replay -> replay parsers -> in-game actions, players' names, ...).
TABLE I
A PARTIAL ACTION LIST PERFORMED BY ONE PLAYER FROM AN SC2 REPLAY

Time (mins) | Player Action
0:00 | Select Command Center (10288)
0:00 | Train SCV
0:00 | Train SCV
0:00 | Train SCV
0:01 | Select SCV x6 (1028c, 10290, 10294, 10298, 1029c, 102a0), Deselect all
0:02 | Right click; target: Mineral Field (100b4)
... | ...
0:07 | Select Command Center (10288), Deselect all
0:08 | Right Click; target: Mineral Field (10070)
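Each row of Table I pairs a timestamp with an action string made up of an action type, a target, and one or more object IDs in parentheses. As a minimal sketch, the following Python code splits such strings into fields. The `Action` class and `parse_action` function are hypothetical names, and the string formats handled are only the ones printed in Table I; this is not the output format or API of any particular replay parser.

```python
import re
from dataclasses import dataclass, field
from typing import List


@dataclass
class Action:
    time: str          # timestamp as printed in Table I, e.g. "0:01"
    action_type: str   # e.g. "Select", "Train", "Right click"
    target: str        # e.g. "Command Center", "SCV", "Mineral Field"
    object_ids: List[str] = field(default_factory=list)


# Assumption: object IDs are lowercase hexadecimal strings inside parentheses.
_IDS = re.compile(r"\(([0-9a-f, ]+)\)")


def parse_action(time: str, text: str) -> Action:
    """Parse one Table I row into its action type, target and object IDs."""
    # Drop the client-generated "Deselect all" suffix, if present.
    text = re.sub(r",\s*Deselect all$", "", text.strip())
    match = _IDS.search(text)
    ids = [s.strip() for s in match.group(1).split(",")] if match else []
    body = _IDS.sub("", text).strip()
    if ";" in body:
        # Form "Right click; target: Mineral Field"
        action_type, rest = body.split(";", 1)
        target = rest.partition(":")[2]
    else:
        # Form "Select Command Center" or "Train SCV"
        action_type, _, target = body.partition(" ")
    # Remove a trailing unit count such as "x6".
    target = re.sub(r"\s+x\d+$", "", target.strip())
    return Action(time, action_type.strip(), target, ids)


rows = [
    ("0:00", "Select Command Center (10288)"),
    ("0:01", "Select SCV x6 (1028c, 10290, 10294, 10298, 1029c, 102a0), Deselect all"),
    ("0:02", "Right click; target: Mineral Field (100b4)"),
]
for time, text in rows:
    print(parse_action(time, text))
```

Keeping the object IDs as a list lets a single `Select` action carry all six SCV IDs from the 0:01 row, while actions without an ID, such as `Train SCV`, get an empty list.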
B. Action List Representation
Given a replay $R$, an action list $L = (a_1, a_2, \ldots, a_m)$ is the sequence of actions performed by a player $p$ in $R$ in temporal order, where $m$ is the number of actions in $L$. Each action $a_i$ in $L$ has a corresponding timestamp $t_i$, where $t_1 \le t_2 \le \ldots \le t_m$. A partial example action list is shown in Table I. Each line in this table represents an action, given by its timestamp and its description. Each action consists of an action type and a target. For example, from the first action in Table I, ‘Select Command Center (10288)’, we can identify the action type as ‘Select’ and the target as ‘Command Center’ (a type of building in the Terran race). In this action, the alphanumeric string (10288) is the ID of the preceding object (the Command Center). In SC2, every object has a unique ID. In addition, certain actions are automatically generated by the game client. For example, ‘Deselect all’ appears at the end of the select actions at times 0:01 and 0:07: whenever a player selects something, the game client automatically deselects all the objects that the player had selected before. After selecting the command center, the player trains three SCVs (the workers of the Terran race) at time 0:00, selects six SCVs at time 0:01, and right clicks a mineral field at time 0:02. After some other actions, the player then selects the command center at time 0:07 and right clicks a mineral field at time 0:08.
C. Action List Segments
Consider the action list in Table I. Although we understand the meaning of each single action, we still do not know what the player really wants to do. For example, what is the difference between the actions at time 0:02 and time 0:08 (both right click a mineral field)? Why does the player control SCVs? What is the player’s purpose in selecting a command center? If we group actions at time 0:01 and 0:02,
and group actions at time 0:07 and 0:08, we can see that, at this point in the game, the player is trying to develop her economy by training more workers and ordering them to harvest minerals. As new workers are produced from the command center, they automatically go to a specific mineral field set by the player’s rally point. Thus, to make sense of an individual action, we must also consider the context in which the action takes place. This is similar to natural language understanding, where the meaning of a word can be inferred from its neighboring words. In the example, we may group the actions at time 0:01 and 0:02 into a segment and assign it the label ‘mine’. For the segment containing the actions at time 0:07 and 0:08, the appropriate label is ‘set rally point’.
III. DATASET AND SEGMENT LABELING
A. Data Collection
Several websites are dedicated to hosting SC2 replays. These sites are used by SC2 players to share selected gameplays with the public. We developed a web crawler to collect replays from GameReplays.org⁷, an SC2 replay repository site. A total of 866 replays of SC2 games played by one player against another player (1 vs. 1) were downloaded, from which we obtained 1732 action lists, one per player per replay.
B. Manual Segmentation and Labeling
1) Human Annotation Tool: A 20-minute SC2 game normally contains thousands of in-game actions. This makes labeling the action lists quite cumbersome for human annotators. To facilitate the process, we have developed a web-based human annotation tool called Labelr. As shown in Figure 4, Labelr provides the web user interface functions an annotator needs to organize, segment, and label action lists with minimum effort. For example, the annotator can simply assign start and end points to split the action sequence. Then, she can select a label from a default list (training, building, mining, etc.) or create an appropriate label for the corresponding segment.
To help the annotator make sense of complex in-game sequences, Labelr also provides a link to download the actual binary replay file, which can be played back in the game client. All annotation data from each annotator are stored in a relational database.
2) General Players Annotation: Two business-major undergraduate students, who have extensive knowledge of SC2’s gameplay mechanics and have been competing in high-level online leagues (Diamond and Platinum) for over a year, were hired to label the data using Labelr. Each of them labeled the same five action lists. From their feedback, we learned that segmenting and labeling an action list is time consuming. On average, it took an annotator three hours to label a 20-minute game. This is because an SC2 action list is very long: the five labeled action lists have an average game length of 23.4 minutes and contain an average of 4628 actions. To segment and label one action list, annotators have to go through all
⁷ www.gamereplays.org/starcraft2/replays.php?game=33
Fig. 4. Part of a screenshot of Labelr.
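Table II reports annotator agreement using Fleiss' kappa. For reference, the statistic can be computed as in the following minimal Python sketch of the standard formula. How the annotations were mapped to items and categories is not specified in this section, so the input layout here is an assumption: each row is one item (for example, one action), each column is one label, and each cell counts the annotators who chose that label. The function name `fleiss_kappa` is hypothetical.

```python
from typing import List


def fleiss_kappa(counts: List[List[int]]) -> float:
    """Fleiss' kappa, where counts[i][j] is the number of raters
    who assigned item i to label j (same number of raters per item)."""
    n_items = len(counts)
    n_raters = sum(counts[0])
    n_labels = len(counts[0])
    # Share of all assignments that went to each label.
    p = [
        sum(row[j] for row in counts) / (n_items * n_raters)
        for j in range(n_labels)
    ]
    # Observed agreement for each item, then averaged over items.
    per_item = [
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ]
    p_bar = sum(per_item) / n_items
    # Agreement expected by chance.
    p_e = sum(x * x for x in p)
    # Note: undefined (division by zero) if every rater always picks the same label.
    return (p_bar - p_e) / (1 - p_e)


# Toy example: two raters, four items, two labels; they disagree on the last item.
print(round(fleiss_kappa([[2, 0], [2, 0], [0, 2], [1, 1]]), 3))
```

A value of 1 means perfect agreement and a value near 0 means agreement no better than chance, which is the scale against which the values in Table II are read.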
TABLE II
RATERS' AGREEMENT, PRESENTED BY FLEISS' KAPPA

(a) Kappa of replay annotation
Action list id | Kappa value
1 | 0.43
2 | 0.21
3 | 0.42
4 | 0.32
5 | 0.41
avg | 0.36

(b) Kappa interpretation
Kappa value | Interpretation