Logic Metaprogramming in Games - LAMSADE

72 downloads 0 Views 57KB Size Report
[Pell 94] Pell B.: A Strategic Metagame Player for General Chess-Like Games. ... J.: Realization of a Program Learning to Find Combinations at Chess. Com-.
Logic Metaprogramming in Games Tristan Cazenave Laboratoire d'Intelligence Artificielle, Département Informatique, Université Paris 8, 2 rue de la Liberté, 93526 Saint Denis, France. [email protected]

Abstract. Introspect is a general logic metaprogramming system that has been applied to multiple games and planning domains. We present the main tools used to generate program. We also describe some successful applications. In these applications, Introspect has been used to write programs that write selective and efficient search programs.

1 Introduction Introspect is a general logic metaprogramming system that has been applied to multiple games and planning domains. It generates programs using first order predicate logic and logic metaprogramming. The generated programs are transformed by Introspect in C++ and are integrated in the search engine of games and planning C++ programs to speed up the search. To generate programs in a given domain, it needs a theory of the domain in a first order predicate logic language. The theories have to be declarative, and the generated programs are also declarative. Some ordering occurs in the last steps of the process, just before the compilation in C++ in order to have fast generated programs. Knowledge about the moves to try enables to select a small number of moves from a large set of possible moves. It is very important in complex games and planning domains where search trees have a large branching factor. Knowing the moves to try drastically cuts the search trees. Logic metaprogramming has been used in Introspect to automatically create the knowledge about interesting and forced moves, only given the rules about the direct effects of the moves [Cazenave 98a,b]. From a more general point of view, metaknowledge itself can be very useful for a wide range of applications [Pitrat 90]. The second section describes logic metaprogramming in games and planning domains. The third section uncovers applications of Introspect.

2 Logic metaprogramming in games and planning domains Our metaprograms write programs that enable to safely cut search trees, therefore enabling large speedups of search programs. In our applications to games, metarules are used to create theorems that tell the interesting moves to try to achieve a tactical goal (at OR nodes). They are also used to create rules that find the complete set of forced moves that prevent the opponent to achieve a tactical goal (at AND nodes). Metaprogramming in logic has already attracted some interest [Hill & Lloyd 94, Barklund 94, Pettorossi & Proietti 96]. More specifically, specialization of logic program by fold/unfold transformations can be traced back to [Tamaki & Sato 84], it has been well defined and related to Partial Evaluation in [Lloyd & Shepherdson 91], and successfully applied to different domains [Gallagher 93]. The parallel between Partial Evaluation and Explanation-Based Learning [Pitrat 76, Mitchell & al. 86, Dejong & Mooney 86, Pell 94, Ram & Leake 95] is now well-known [Van Harmelen & Bundy 88, Etzioni 93]. As Pitrat [Pitrat 98] points it, the ability for programs to reason on the rules of a game so as to extract useful knowledge to play is a problem essential for the future of Artificial Intelligence. It is a step toward the realization of efficient general problem solvers. In Introspect, two kinds of metarules are particularly important : impossibility metarules and monovaluation metarules. Other metarules such as metarules removing useless conditions or ordering metarules are used to speed-up the generated programs. Impossibility metarules find which rules can never be applied because of some properties of the game, or because of more general impossibilities. An example of a metarule about a general impossibility is the following one : impossible(ListAtoms):member(N=\=N1,ListAtoms),var(N),var(N1),N==N1. This metarule tells that if a rule created by the system contains the condition 'N=\=N1' and the metavariables N and N1 contain the same variable, then the condition can never be fulfilled. So the rule can never apply because it contains a statement impossible to verify. This metarule is particularly simple, but this is the kind of general knowledge a metasystem must have to be able to reason about rules and to create rules given the definition of a game. Some of the impossibility metarules are more domain specific. For example the following rule is specific to the game of Go : impossible(ListAtoms):member(color_string(B,C),ListAtoms),C==empty. It tells that the color of a string can never be the color 'empty'. The other important metarules in our metaprogramming system are the monovaluation metarules. They apply when two variables in the same rules always share the same value. Monovaluation metarules unify such variables. They enable to simplify the rules and to detect more impossible rules. An example of a monovaluation metarule is :

monovaluation(ListAtoms):member(color_string(B,C),ListAtoms), member(color_string(B1,C1),ListAtoms), B==B1,C\==C1,C=C1. It tells that a string can only have one color. So if the system finds two conditions 'color_string' in a rule such that the variables contained in the metavariables 'B' and 'B1' are equals, and if the corresponding color variables contained in the metavariables 'C' and 'C1' are not the same, it unifies C and C1 because they always contain the same value. Impossible and monovaluation metarules are vital rules of our metaprogramming system. They enable to reduce significantly the number of rules created by the system, eliminating many useless rules. For example, we tested the unfolding of six rules with and without these metarules. Without the metarules, the system created 166 391 568 rules by regressing the 6 rules on only one move. Using basic monovaluation and impossible metarules shows that only 106 rules where valid and different from each other. Other metarules of importance are reordering metarules that enable to find a good static ordering for the conditions of the generated declarative rules, and simplification metarules that enable to remove some useless conditions of the generated rules.

3 Applications This section describes the implemented application domains of Introspect. 3.1 The game of Go The target concepts are the tactical subgoals of the game of Go : Remove/save a string, Connect/Disconnect two strings. Each of these target concepts is defined using rules in predicate logic. For example the target concept for the tactical goal removestring is defined using this rule: removestring (S, I, friend) :- color (S, enemy), move (I, enemy), numberoflibertiesbeforemove (S, 1), liberty (I, S), legalmove (I, enemy). Thousands of rules are created by using the rules of the game to specialize the tactical goals. Example of a simple created rule used to find connections between strings of stones :

connect (S1, S2, I, C) :- color (C), color (S1, C), color (S2, C), oppositecolors (C, C1), liberty (I, S1), liberty (I, S2). This rule tells that if an intersection I is a liberty of strings S1 and S2 that have the same color C, playing a stone of color C at I enables to achieve the goal Connect between the two strings.

∆ Figure 1

The created rule previously given as example applies to the move marked ∆ that connects the two black strings in the Figure 1. The initial target concept defining the connect goal is: connectedaftermove (S1, S2) :- color (C), color (S1, C), color (S2, C), oppositecolors (C, C1), elementofaftermove (I, S1), elementofaftermove (I, S2). The rules of the game used to specialize the target concept in this example are: elementofaftermove (I, S) :- liberty (I, S), color (S, C), move (I, C). connect (S1, S2, I, C) :- move (I, C), connectedaftermove (S1, S2). There are different predicates to describe the board after the move and the board before the move. This is to prevent side effects to happen, and to provide a way to stop unfolding easily. Therefore, generated rules are theorems of the game of Go: when they conclude that a move achieves a goal, it is always true. The Go program that uses the rules resulting of the metaprogramming has good results in international competitions (6 out of 40 in 1997 FOST cup [Fotland & Yoshikawa 97], 6 out of 17 in 1998 world computer Go championship). The game of Go is known to be the most difficult game to program [Van den Herik & al. 91]. Experiments in solving problems for the hardest game benefit to other games, and to other planning domains, especially if these experiments use general and widely applicable methods as it is the case for our metaprogramming methods.

3.2 Abalone and Go-Moku Introspect has generated knowledge for subgoals of other games. It has been applied to the generation of rules for ejecting balls at the game of Abalone [Cazenave 96a]. It has also been applied to the generation of rules for the game of Go-Moku, the rules generated for this application enable to reduce drastically the number of moves to try in many positions, and therefore enable to solve the game faster than previous implementations. 3.3 Economic Simulation games Introspect has generated agents for economic simulation games, in single agent problems [Cazenave 96a,b] and in competitive environment [Cazenave 98c]. To generate programs for these games, Introspect is given the goals to achieve and an economic theory expressed in first order predicate logic. The novelty of the generation of program for competitive environments is that it can forecast moves of the opponent using rules about forced moves to prevent a goal to be achieved, and choose its moves according to this knowledge. 3.4 Agent planning in architecture simulation Introspect has been applied to an architecture simulation tool [Cazenave 97]. The simulation tool that has been optimized by Introspect has been used to choose the configuration of the “Stade de France” that was built for the World Soccer Cup in France. It has also been used to test various urban configurations such as Rail Stations or Roads Configurations in french cities. A problem encountered in these simulation is that simulating realistic agents behaviors is time consuming, especially in simulations containing thousands of agents. Another problem is that making agents more complicated and more realistic makes the maintaining of the program harder, and also make changes in the program difficult to handle. The solution we found to overcome these two problems is to automatically create efficient and realistic agents from a declarative description of their behaviors. The goal of the method that Introspect has optimized is to find the move of each pedestrian in the simulation. It is called very often and it is a time consuming method. Introspect is given four types of rules: rules describing the simulation, rules about the goals to achieve, metarules about the monovaluation of some predicates, metarules about the impossibility of some other rules. All the rules are expressed in predicate logic. It specializes the rules about the goals and transforms the resulting logic program in a C++ program that can be linked to the simulation main module. The C++ program written by Introspect is ten times faster than the original handcoded C++ program written by professional programmers. The main reason for the success of this approach is that hand-coded programs have to be maintainable and simple so that the programmer can understand them, whereas our system does not

have this limitation. The clarity of a hand-made program is sometime at the price of its efficiency. Our system writes long and unclear programs, but they are faster than hand-coded programs because all the calculation that can be made at compilation time have been made. Meanwhile, the agents can be modified more easily than in an hand-coded C++ program, because they are represented declaratively in a logic program. Thus, our approach enables to write faster agents simulations, and also enables to modify agents behaviors in an easier way than by directly modifying the C++ code of the agent. 3.5 Agent planning in simulated robotic soccer Introspect has been applied to the generation of programs that play simulated robotic soccer [Bourdon 98]. Three types of agents are defined : defenders, attackers and goal keeper. Each agent is defined by a set of goals it tries to achieve. These goals are specialized using Introspect. The generated programs enable to know some moves in advances the situations following some actions, and to decide which action to take based on these previsions. 3.6 Introspect Given a particular domain and little knowledge about this domain, Introspect can be used to generate itself most of the metaprograms it needs to generate programs in this domain [Cazenave 99]. It creates for itself impossibility, monovaluation, reordering and simplification metarules.

4 Conclusion Logic metaprogramming complex games and planning domains is considered as an interesting challenge for Artificial Intelligence [Pell 94, Selman & al. 96, Pitrat 98]. Moreover it has advantages over traditional approaches to such problems: metaprograms automatically create the rules that otherwise take a lot of time to create, and the results of the search trees developed using the generated programs are more reliable than the results of the search trees developed using traditional heuristic and hand-coded rules. The metaprogramming methods presented here have been applied successfully in many games and planning problems. The use of programs generated by Introspect to solve other similar problems are currently under investigation. Using logic metaprogramming as described in this paper is particularly suited to automatically create complex, efficient and reliable programs in domains that are complex enough to require a lot of knowledge to cut search trees.

5 References [Barklund 94] Barklund J. : Metaprogramming in Logic. UPMAIL Technical Report N° 80, Uppsala, Sweden, 1994. [Bourdon 98] Bourdon A. : Generation de programmes pour la synthese d’une equipe de robots footballeurs. DEA d’Informatique de Dauphine, 1998. [Cazenave 96a] Cazenave, T.: Système d’Apprentissage par Auto-Observation. Application au Jeu de Go. Ph.D. diss., Université Paris 6, 1996. [Cazenave 96b] Cazenave, T.: Learning to Manage a Firm. First International Conference on Industrial Engineering Applications and Practice, San Diego, 1996. [Cazenave 97] Cazenave T. 1997. Automatically Improving Agents Behaviors in an Urban Simulation. Second International Conference on Industrial Engineering Application and Practice, San Diego, 1997. [Cazenave 98a] Cazenave T.: Metaprogramming Forced Moves. Proceedings ECAI98, Brigthon, 1998. [Cazenave 98b] Cazenave T.: Controlled Partial Deduction of Declarative Logic Programs. ACM Computing Surveys, Special issue on Partial Evaluation, volume 30, n°3es, September 1998. [Cazenave 98c] Cazenave T.: Program Generation for Firms Simulation in Competitive Environment. Strategic Valuation of Firms'98 Conference, Paris 1998 [Cazenave 99] Cazenave T.: Metaprogramming Domain Specific Metaprograms. Reflection99, July 1999. [Dejong & Mooney 86] Dejong, G. and Mooney, R.: Explanation Based Learning : an alternative view. Machine Learning 1 (2), 1986. [Etzioni 93] Etzioni, O.: A structural theory of explanation-based learning. Artificial Intelligence 60 (1), pp. 93-139, 1993. [Fotland & Yoshikawa 97] Fotland D. and Yoshikawa A.: The 3rd fost-cup world-open computer-go championship. ICCA Journal 20 (4):276-278, 1997. [Gallagher 93] Gallagher J.: Specialization of Logic Programs. Proceedings of the ACM SIGPLAN Symposium on PEPM’93, Ed. David Schmidt, ACM Press, Copenhagen, Danemark, 1993. [Hill & Lloyd 94] Hill P. M. and Lloyd J. W.: The Gödel Programming Language. MIT Press, Cambridge, Mass., 1994. [Lloyd & Shepherdson 91] Lloyd J. W. and Shepherdson J. C.: Partial Evaluation in Logic Programming. J. Logic Programming, 11 :217-242., 1991. [Mitchell & al. 86] Mitchell, T. M.; Keller, R. M. and Kedar-Kabelli S. T.: Explanationbased Generalization : A unifying view. Machine Learning 1 (1), 1986. [Pell 94] Pell B.: A Strategic Metagame Player for General Chess-Like Games. Proceedings of AAAI'94, pp. 1378-1385, 1994. ISBN 0-262-61102-3. [Pettorossi & Proietti 96] Pettorossi, A. and Proietti, M.: A Comparative Revisitation of Some Program Transformation Techniques. Partial Evaluation, International Seminar, Dagstuhl Castle, Germany LNCS 1110, pp. 355-385, Springer 1996. [Pitrat 76] Pitrat J.: Realization of a Program Learning to Find Combinations at Chess. Computer Oriented Learning Processes, J. C. Simon editor. NATO Advanced Study Institutes Series. Series E: Applied Science - N° 14. Noordhoff, Leyden, 1976. [Pitrat 90] Pitrat, J.: Métaconnaissance - Futur de l’Intelligence Artificielle. Hermès, Paris, 1990. [Pitrat 98] Pitrat, J.: Games: The Next Challenge. ICCA journal, vol. 21, No. 3, September 1998, pp.147-156, 1998.

[Ram & Leake 95] Ram, A. and Leake, D.: Goal-Driven Learning. Cambridge, MA, MIT Press/Bradford Books, 1995. [Selman & al. 96] Selman, B.; Brooks, R. A.; Dean, T.; Horvitz, E.; Mitchell, T. M.; Nilsson, N. J.: Challenge Problems for Artificial Intelligence. In Proceedings AAAI-96, 13401345, 1996. [Tamaki & Sato 84] Tamaki H. and Sato T.: Unfold/Fold Transformations of Logic Programs. Proc. 2nd Intl. Logic Programming Conf., Uppsala Univ., 1984. [Van den Herik & al. 91] Van den Herik, H. J.; Allis, L. V.; Herschberg, I. S.: Which Games Will Survive ? Heuristic Programming in Artificial Intelligence 2, the Second Computer Olympiad (eds. D. N. L. Levy and D. F. Beal), pp. 232-243. Ellis Horwood. ISBN 0-13382615-5. 1991. [Van Harmelen & Bundy 88] Van Harmelen F. and Bundy A.: Explanation based generalisation = partial evaluation. Artificial Intelligence 36:401-412, 1988.