
In M. Veloso & A. Aamodt (Eds.), Case-Based Reasoning Research and Development, Proceedings of the First International Conference on Case-Based Reasoning, ICCBR-95 (Lecture Notes in Artificial Intelligence 1010), pp. 491-500. Berlin: Springer-Verlag, 1995.

Learning Strategies for Explanation Patterns: Basic Game Patterns with Application to Chess

Yaakov Kerner
Department of Mathematics and Computer Science
Bar-Ilan University
52900 Ramat-Gan, Israel
[email protected]

Abstract. In this paper we describe game-independent strategies capable of learning explanation patterns (XPs) for evaluation of any basic game pattern. A basic game pattern is defined as a minimal configuration of a small number of pieces and squares which describes only one salient game feature. Each basic pattern can be evaluated by a suitable XP. We have developed five game-independent strategies (replacement, specialization, generalization, deletion, and insertion) capable of learning XPs or parts of them. Learned XPs can direct players' attention to important analysis that might otherwise have been overlooked. These XPs can improve their understanding, evaluating and planning abilities. At present, the application is only in the domain of chess. The proposed strategies have been further developed into 21 specific chess strategies, which are incorporated in an intelligent educational chess system that is under development.

1 Introduction

Case-based reasoning (CBR) means adapting solutions of known cases to solve unknown cases. A potential domain is two-player board game playing (in short, game playing), since human players use extensive knowledge in their playing. However, little research has been done on CBR in game playing in general and in evaluation of game positions in particular. Case-based playing programs have been constructed for the eight-puzzle by Bradtke & Lehnert (1988) and for Othello by Callan et al. (1991). A case-based program for evaluation of chess positions is described in Kerner (1994).

Most game-playing programs do not make the evaluation process explicit. That is, there is a deficiency of evaluative comments concerning given positions. Current systems do not supply any detailed evaluative comments about the internal content of the given positions. An exception is the case-based system described in Kerner (1994), which evaluates chess positions.

In this paper we propose game-independent strategies capable of learning new explanation patterns (XPs) for evaluation of any basic game pattern. A basic game pattern is defined as a certain minimal configuration of a small number of pieces and squares which describes only one salient game feature. At present, the application is only in the domain of chess, which is known as an excellent test-bed for development of ML algorithms. Lessons learned in this research can be useful in other domains. The learning strategies have been further developed into 21 specific chess strategies, which are part of an intelligent educational chess system that is under development.

At present, our model does not use search trees. Therefore, it is not useful to strong players. However, the proposed XPs should be helpful to weak and intermediate players wishing to improve their play in general and their understanding, evaluating and planning abilities in particular.

This paper is organized as follows: Section 2 addresses basic game patterns. Section 3 deals with evaluation of game positions. Section 4 presents XPs as suitable data structures for explanation and evaluation of basic game patterns. Section 5 describes learning strategies of XPs in previous case-based systems. Section 6 shows the strategies for the learning of XPs. Section 7 details an illustrative example. Section 8 summarizes the research and proposes future directions. In the Appendix we give a glossary of the main chess concepts mentioned in the paper.

2 Basic Game Patterns

It is known that human players use extensive knowledge in playing board games. A psychological study by de Groot (1965) shows that chess players base their evaluations of chess positions on patterns (typical positions) gained through experience. The player's patterns guide him in deciding which move to play or which strategy to choose in a given position. Simon (1974) estimates that a master has a repertoire of between about 25,000 and 100,000 patterns.

In order to evaluate any game position, we have designed a game-independent evaluation tree that includes common positional features concerning evaluation of game positions. A tree containing a few hundred features concerning evaluation of chess patterns has been constructed with the help of strong chess masters (see Kerner, 1994). Figure 1 presents the three highest levels of this tree, which fit all board games.

[Figure 1 (tree diagram). Node labels: static evaluation; special patterns, unspecial patterns; drawing patterns, winning patterns, threats, piece mobility, area, material.]

Fig. 1. The evaluation tree

Each feature at each level (except for the last level) is further divided into subfeatures. Each leaf (a node at the last level) in this tree represents a unique basic pattern (examples in Kerner, 1994). Each basic pattern has two suitable explanation patterns (one for White and one for Black) that contain several important comments concerning the discussed pattern. This tree is used primarily for two main tasks: (1) traversing the tree to find all basic patterns included in the position and (2) determining pattern similarity in the adaptation process proposed in Kerner (1994). The structure of the tree is analogous to the E-MOP, the memory structure introduced by Kolodner (1981). Our concepts resemble Kolodner's generalized information items and our basic patterns can be viewed as her "events."

3 Evaluation of Game Positions

Chess experts have established a set of evaluative terms for evaluation of chess positions. These general measures can be used for all two-person board positions. Table 1 presents a few qualitative measures and their meanings.

Table 1. A few evaluative terms and their meanings. [The symbols in the "Evaluative terms" column are illegible; the "Meaning" column reads as follows.]

White is winning
White is better
White is slightly better
equal
approximately equal
Black is slightly better
Black is better
Black is winning

Little research has been done concerning the task of giving more detailed evaluative comments for game positions. Most game-playing programs do not make the evaluation process explicit. They usually give only one evaluative term as an evaluation. More detailed explanations are given by the systems developed by Michie (1981), Berliner & Ackley (1982), Epstein (1989) and Pell (1994). However, they do not supply any detailed evaluative comments about the internal content of the given positions. An exception is the case-based system described in Kerner (1994), which gives detailed explanations concerning chess positions. In this paper we propose an extension that deals with evaluation of any two-player game position.

4 XPs as Explanation Structures for Two-Player Game Positions

In order to explain and evaluate basic game positions we have used XPs. Kass (1990, p. 9) regards XPs as "variablized explanations that are stored in memory, and can be instantiated to explain new cases". The XP-structure we use is similar to a certain type of intent explanation proposed by Schank (1986). It contains the following slots: (1) a fact to be explained, (2) a belief about the fact, (3) a purpose implied by the fact, (4) a plan to achieve the purpose, and (5) an action to take. This XP structure has been applied to criminal sentencing (Schild & Kerner, 1993) and chess (Kerner, 1994). More general types of XPs have been applied in various story-understanding systems (e.g., Schank et al., 1994; Cox & Ram, 1994).

In game playing, the XP explains and evaluates a certain basic pattern of a game position from either White's or Black's viewpoint. Since we are concerned with the evaluation of a given position, we use an evaluation slot instead of an action slot. This slot contains one evaluative term (see Table 1). Each basic pattern in the evaluation tree has two general XPs (one for White and one for Black). Five examples of chess XPs are presented in Fig. 6. For convenience we use some abbreviations: W for White, B for Black, and "e" for the endgame stage. Without loss of generality, our examples are evaluated from White's viewpoint, assuming that it is White's turn to move.

The XP-structure seems to be a convenient data structure for describing and explaining a specific basic pattern of a given position. However, each game position usually includes more than one basic pattern. Therefore, we have constructed a more complex structure called Multiple eXplanation Pattern (MXP) to explain an entire game position (Kerner, 1994). The learning of MXPs is described in general in Kerner (1994). In this paper we detail the learning of XPs.
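As a minimal sketch (not the representation used in the system described here), the five-slot XP-structure with an evaluation slot in place of the action slot could be held in a small immutable record. The class name `XP` is an assumption made for illustration; the slot fillers are those of XP-1 in Fig. 6, except for the evaluation value, which is a labeled placeholder.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class XP:
    """One explanation pattern; slot names follow the text of this section."""
    fact: str        # the fact to be explained (a basic pattern)
    belief: str      # a belief about the fact
    purpose: str     # a purpose implied by the fact
    plan: str        # a plan to achieve the purpose
    evaluation: str  # one evaluative term (Table 1)


xp1 = XP(
    fact='a backward pawn for B on a closed file in the "e" stage',
    belief="weakness in pawn structure for B",
    purpose="winning B's weak pawn",
    plan="attacking it with a major piece through its rank",
    evaluation="unspecified",  # placeholder, not the value given in the paper
)

# Learning strategies change one slot filler at a time; with a frozen
# record this yields a new XP and leaves the retrieved one untouched.
xp1_variant = replace(xp1, plan="attacking it with a rook through its rank")
```

Keeping the record immutable is a design choice of this sketch: every strategy then produces a new candidate XP that can be validated before anything is stored.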

5 Learning of XPs in Previous Case-Based Systems

The learning of XPs enables improving the correctness, quality and efficiency of an explanation system. Such a system becomes more adequate by deriving better and more creative explanations than a system supplied with fewer explanations. In the domain of case-based learning of XPs we find two prominent systems: SWALE (Kass et al., 1986; Kass, 1990) and AQUA (Ram, 1993). These systems are based on Schank's theory of XPs (Schank, 1986). SWALE and AQUA construct new creative explanation patterns by adapting existing explanation patterns. However, these systems differ in their basic assumptions and learning methods.

On the one hand, Kass's theory assumes that we work in a well-known domain and that past cases are well understood and can be used for future CBR. SWALE constructs new XPs by slightly altering existing XPs. It uses three kinds of strategies: generalize, replace and specify. On the other hand, Ram's theory assumes that we work in a novel domain. Past cases are unavailable, incorrect, incompletely understood, or wrongly indexed in the case library. Ram proposes an incremental learning theory based on the revision of previously existing case knowledge. AQUA constructs new XPs by answering questions that underlie the creation, verification and learning of explanations.

The evaluation of game positions in general, and of chess positions in particular, has been a well-investigated domain for a long time. Game-playing masters and books can serve as useful case libraries. Therefore, we adopt Kass's theory as a basis for our learning theory.

6 Learning Strategies in our Model

We have developed an extended theory of learning strategies for evaluation of game positions. It contains five kinds of game-independent strategies capable of learning XPs or parts of them: replacement, specialization, generalization, deletion, and insertion. The first three strategies resemble Kass and Ram's strategies. The last two strategies are new.

6.1 Learning Strategies for Replacing Slot Fillers of XPs

These strategies are responsible for substitution of slot fillers. Each strategy substitutes only one slot filler in a given XP. Such a substitution causes a small change to an XP. The motivation is to fix the retrieved XP in order to better explain the given pattern. In principle, these strategies can operate on every slot filler included in the XP-structure. In our chess application, the belief and evaluation slot fillers cannot be changed since they are dependent on the fact slot filler.

6.1.1 Replacing a Slot Filler using the Appropriate Slot Hierarchy: These strategies replace a certain slot filler using the specific slot's hierarchy. In chess we have only three specific strategies. Strategies 1-3 replace the fact, purpose, and plan slot fillers using the fact hierarchy (the evaluation tree), the purpose hierarchy (upper part of Fig. 2), and the plan hierarchy (lower part of Fig. 2), respectively. The belief and evaluation slot fillers cannot be replaced independently for the reasons mentioned previously.

[Figure 2 (two tree diagrams). Purpose hierarchy labels: purpose; material, positional, threat; strong side, weak side. Plan hierarchy labels: take, promote, advance, discard, exchange, sacrifice, occupy, retreat, defend, control, bring a piece to protect, attack, breakthrough, block.]

Fig. 2. A partial description of the purpose and plan hierarchies
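Replacement through a slot hierarchy can be sketched as follows. This is an illustration under stated assumptions, not the system's algorithm: the hierarchy fragment in `PARENT` is hypothetical, "near neighbor" is taken to mean "sibling under the same parent", and the names `siblings` and `replace_filler` are invented for the example. The restriction to the fact, purpose, and plan slots follows the text.

```python
# Hypothetical fragment of a slot hierarchy, stored as child -> parent.
PARENT = {
    "backward pawn on a closed file": "backward pawn",
    "backward pawn on a semi-open file": "backward pawn",
    "backward pawn": "pawn weakness",
    "isolated pawn": "pawn weakness",
}

# In the chess application only these slots may be replaced independently.
REPLACEABLE_SLOTS = {"fact", "purpose", "plan"}


def siblings(node, parent=PARENT):
    """Return the other children of node's parent (assumed near neighbors)."""
    p = parent.get(node)
    if p is None:
        return []
    return sorted(c for c, q in parent.items() if q == p and c != node)


def replace_filler(xp, slot, new_filler, parent=PARENT):
    """Return a copy of xp (a dict of slots) with one slot filler replaced."""
    if slot not in REPLACEABLE_SLOTS:
        raise ValueError(f"slot {slot!r} cannot be replaced independently")
    if new_filler not in siblings(xp[slot], parent):
        raise ValueError("new filler is not a near neighbor in the hierarchy")
    return {**xp, slot: new_filler}
```

Only one slot changes per call, which matches the requirement that each strategy causes a small change to the XP.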

6.1.2 Replacing a Slot Filler using the Stereotypical-Agent Links: These strategies replace a certain slot filler using the game-independent agent links, which replace either one type of game-piece by another type of game-piece or a White piece by the same Black piece. In our chess application we have two strategies, strategies 4 and 5, that replace the fact and plan slot fillers, respectively, using the chess stereotypical-agent links described in Fig. 3. In addition to the belief and evaluation slot fillers, we also cannot change the purpose slot filler, since it does not contain any piece but rather describes a general purpose.

[Figure 3 (diagram). Piece labels: bishop, queen, rook, knight, pawn. Link labels: diagonal, file/rank, strong square, center, general (a certain chess piece, another chess piece), and white color, black color.]

Fig. 3. The stereotypical-agent links
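A stereotypical-agent replacement can be sketched as a table lookup followed by a word substitution in the filler. The two link families used below (bishop and queen sharing the diagonal, rook and queen sharing the file/rank) are an assumed reading of the labels of Fig. 3, and the names `AGENT_LINKS` and `apply_agent_link` are hypothetical.

```python
import re

# Assumed reading of Fig. 3: pieces that can play the same role.
# Key: (piece, role); value: a piece that can take over that role.
AGENT_LINKS = {
    ("bishop", "diagonal"): "queen",
    ("queen", "diagonal"): "bishop",
    ("rook", "file/rank"): "queen",
    ("queen", "file/rank"): "rook",
}


def apply_agent_link(filler, piece, role):
    """Replace `piece` in a slot filler by its linked piece, if a link exists.

    Returns the new filler, or None when no link applies or the piece
    does not occur in the filler.
    """
    new_piece = AGENT_LINKS.get((piece, role))
    pattern = rf"\b{re.escape(piece)}\b"
    if new_piece is None or not re.search(pattern, filler):
        return None
    return re.sub(pattern, new_piece, filler)
```

For instance, applying the file/rank link to the plan "attacking it with a rook through the a-file" yields the same plan carried out by a queen; whether the result is valid for a given position still has to be checked.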

6.1.3 Replacing a Slot Filler using the Stereotypical-Area Links: These strategies replace a certain slot filler using the game-independent area links, which replace one type of game-area by another type of game-area that can play the same function. In our chess application we have only two strategies, for the same reasons mentioned above. Strategies 6 and 7 replace the fact and plan slot fillers, respectively, using the stereotypical-area links described in Fig. 4.

[Figure 4 (diagram). Links: file i and file i+j; rank i and rank i+j; king-side and queen-side; file and rank; a piece on (X1,Y1) threatening (X3,Y3) and a piece on (X2,Y2) threatening (X3,Y3).]

Fig. 4. The stereotypical-area links
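Two of the area links of Fig. 4 lend themselves to a short sketch: shifting a file by j, and exchanging king-side and queen-side. Treating the wing exchange as a left-right mirror of the board is an assumption of this example, as are the function names.

```python
FILES = "abcdefgh"


def shift_file(file, j):
    """Link 'file i -> file i+j'; returns None if the result is off the board."""
    i = FILES.index(file) + j
    return FILES[i] if 0 <= i < len(FILES) else None


def mirror_wing(file):
    """Link 'king-side <-> queen-side', assumed to mirror a<->h, b<->g, ..."""
    return FILES[len(FILES) - 1 - FILES.index(file)]


def replace_file(filler, old_file, new_file):
    """Rewrite a slot filler that names a file, e.g. 'the a-file'."""
    return filler.replace(f"{old_file}-file", f"{new_file}-file")


plan = "attacking it with a rook through the a-file"
shifted = replace_file(plan, "a", shift_file("a", 1))
```

Here `shifted` names the b-file instead of the a-file; as with the other replacement strategies, the adapted filler is only a candidate until it is found valid for the position.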

6.2 Learning Strategies for Specializing Slot Fillers of XPs

These strategies specialize a certain slot filler of an XP. In principle, these strategies can operate on every slot filler included in the XP-structure. There are two kinds of specialization strategies: (1) one that specializes a slot filler from a general filler to a more specific filler and (2) one that joins two slot fillers into one extended slot filler. In our chess application, both specialization strategies work only on the fact, purpose, and plan slot fillers. The belief and evaluation slot fillers are not appropriate for these strategies since they are dependent entirely on the fact slot filler. Strategies 8-10, derived from the first specialization strategy, specialize the fact, purpose and plan slot fillers, respectively, using the stereotypical-specification links described in Fig. 5. Strategies 11-13, derived from the second specialization strategy, specialize the fact, purpose and plan slot fillers, respectively, by joining two slot fillers into one extended slot filler.

major piece -> rook, queen
minor piece -> bishop, knight
piece -> a specific type of piece
a specific type of piece -> specific piece on a specific square

Fig. 5. The stereotypical-specification links
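The two kinds of specialization can be sketched over plain-text slot fillers. Only the first two links of Fig. 5 are encoded; the function names and the use of " & " as the joining symbol (as it appears in the plan of XP-5 in Fig. 6) are choices made for this illustration. The reverse lookup `generalize` is included because the same links, read in the opposite direction, serve the generalization strategies.

```python
import re

# The first two stereotypical-specification links of Fig. 5.
SPECIFICATION_LINKS = {
    "major piece": ["rook", "queen"],
    "minor piece": ["bishop", "knight"],
}


def specialize(filler):
    """First kind: return every filler obtained by making one term specific."""
    results = []
    for general, specifics in SPECIFICATION_LINKS.items():
        if general in filler:
            results.extend(filler.replace(general, s) for s in specifics)
    return results


def join(first, second):
    """Second kind: join two slot fillers into one extended slot filler."""
    return f"{first} & {second}"


def generalize(filler):
    """Reverse direction: replace the first specific piece found by its class."""
    for general, specifics in SPECIFICATION_LINKS.items():
        for s in specifics:
            pattern = rf"\b{re.escape(s)}\b"
            if re.search(pattern, filler):
                return re.sub(pattern, general, filler, count=1)
    return None
```

Each candidate returned by `specialize` would still have to be checked against the position; the sketch only enumerates the fillers that the links allow.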

6.3 Learning Strategies for Generalizing XPs

These strategies generalize a certain slot filler of an XP. In principle, they can operate on every slot filler included in the XP-structure. These strategies are the opposite of the specialization strategies mentioned above. That is, there are two kinds of generalization strategies: (1) one that generalizes a slot filler from a specific filler to a more general filler and (2) one that deletes one problematic slot filler from two slot fillers. By a problematic filler we mean a filler that is not valid for the discussed position. In our application, both strategies work on the fact, purpose, and plan slot fillers. The belief and evaluation slot fillers are not appropriate for these strategies since they are dependent entirely on the fact slot filler. Strategies 14-16, derived from the first generalization strategy, generalize the fact, purpose and plan slot fillers, respectively, using the stereotypical-specification links described in Fig. 5. Strategies 17-19, derived from the second generalization strategy, generalize the fact, purpose, and plan slot fillers, respectively, by deleting one of two or more slot fillers.

6.4 Deletion and Insertion Strategies of XPs

The deletion strategy deletes a problematic XP and its related data from its data-base. The activation of this strategy depends on the expert. In addition, the system can ask the expert to approve the deletion of XPs that are seldom used. In contrast, the insertion strategy enables the insertion of a new XP and its related data into either DB1 or DB2 (the two data-bases described in Section 7). For instance, it can be performed when the system has not satisfied the expert. In our application, these are strategies 20 and 21.
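The bookkeeping behind the deletion and insertion strategies can be sketched with a small store that counts retrievals. Everything here is hypothetical: the class `XPStore`, the usage-count heuristic for "seldom used", and the `expert_approves` callback that stands in for the human expert.

```python
class XPStore:
    """A toy data-base of XPs keyed by pattern name, with usage counts."""

    def __init__(self):
        self._xps = {}
        self._uses = {}

    def insert(self, key, xp):
        """Insertion strategy: add a new XP and its related data."""
        self._xps[key] = xp
        self._uses.setdefault(key, 0)

    def retrieve(self, key):
        """Return an XP and record that it was used."""
        self._uses[key] += 1
        return self._xps[key]

    def seldom_used(self, max_uses=0):
        """Keys the system could propose to the expert for deletion."""
        return sorted(k for k, n in self._uses.items() if n <= max_uses)

    def delete(self, key, expert_approves):
        """Deletion strategy: delete only if the expert approves."""
        if not expert_approves(key):
            return False
        del self._xps[key]
        del self._uses[key]
        return True

    def __contains__(self, key):
        return key in self._xps


db1 = XPStore()
db1.insert("bp", {"fact": "a backward pawn for B on a closed file"})
db1.insert("rarely needed", {"fact": "(some other basic pattern)"})
db1.retrieve("bp")
candidates = db1.seldom_used()  # ["rarely needed"]
```

Routing every deletion through a callback keeps the decision with the expert, as the text requires, while the system only nominates candidates.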

7 Example

In this example, we illustrate an evaluation of a certain basic chess pattern, "a backward pawn (a7) for Black on the closed a-file in the endgame stage" (call it bp), that is included in Position 1. Our task is to learn XPs that explain bp from White's viewpoint. DB1 is a data-base of general XPs (XPs for the basic patterns) and DB2 is a data-base of positions that include representative basic patterns and their XPs. All XPs involved in the evaluation of bp are presented in Fig. 6. A few important concepts mentioned in these XPs are defined in a chess glossary in the Appendix.

Here, the learning is as follows: XP-1 is retrieved from DB1 since its fact slot filler is the general form of bp. Applying a specialization strategy to it, we get XP-2, which is found to be valid for Position 1. Thus, XP-2 is proposed as an explanation for bp. Position 2 and its XP (XP-3) are retrieved from DB2 since the fact slot filler of XP-3 is considered a near neighbor of bp in the evaluation tree. XP-4 is created from XP-3 by replacing its fact slot filler with bp. XP-4 is found to be valid for Position 1 and is also proposed as an explanation for bp. XP-5's slot fillers are the same as those of XP-2 and XP-4 except for the plan slot filler. By applying the second kind of specialization strategy (joining two slot fillers) to the plan slot fillers of XP-2 and XP-4, we get a new, elaborated plan slot filler.

Using a simple, limited search, we verify that the new plan is feasible in Position 1: the White c5-rook can be switched to the a-file (a5) and the White c2-rook to the 7th rank (c7). Since our chess expert approved XP-5, Position 1, its bp and its XP (XP-5) are stored in DB2.

Position 1 (White to move). White: Kg1, Rc2, Rc5, pawns a2, b3, d6, e5, f4, g2, h2. Black: Kg8, Ra8, Rd8, pawns a7, b4, d7, e6, f5, g7, h7.

Position 2 (White to move). White: Kg1, Re2, pawns b3, c4, d5, e4, f3, g2, h2. Black: Kg8, Rf8, pawns a7, b4, d6, e5, f6, g7, h7 (the content of e8 is illegible).

XP-1
fact: a backward pawn for B on a closed file in the "e" stage
belief: weakness in pawn structure for B
purpose: winning B's weak pawn
plan: attacking it with a major piece through its rank

XP-2
fact: a backward pawn (a7) for B on the closed a-file in the "e" stage
belief: weakness in pawn structure for B
purpose: winning B's weak pawn (a7)
plan: attacking it with a rook through the 7th rank

XP-3
fact: a backward pawn (a7) for B on the semi-open a-file in the "e" stage
belief: weakness in pawn structure for B
purpose: winning B's weak pawn (a7)
plan: attacking it with a rook through the a-file

XP-4
fact: a backward pawn (a7) for B on the closed a-file in the "e" stage
belief: weakness in pawn structure for B
purpose: winning B's weak pawn (a7)
plan: attacking it with a rook through the a-file

XP-5
fact: a backward pawn (a7) for B on the closed a-file in the "e" stage
belief: weakness in pawn structure for B
purpose: winning B's weak pawn (a7)
plan: attacking it with one rook through the a-file & with another rook through the 7th rank

[The evaluation slot of each XP holds one evaluative symbol; the symbols are illegible.]

Fig. 6. XPs involved in the evaluation process of bp (the discussed basic pattern)
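The last two steps of the example (XP-3 to XP-4 by replacing the fact filler, then XP-2 and XP-4 to XP-5 by joining the plan fillers) can be traced with the slot fillers of Fig. 6. This is a sketch only: XPs are plain dictionaries, the evaluation slot is omitted, and `join_rook_plans` is a hypothetical helper written to reproduce the wording of XP-5's plan, not a general joining procedure.

```python
BP = 'a backward pawn (a7) for B on the closed a-file in the "e" stage'

xp2 = {
    "fact": BP,
    "belief": "weakness in pawn structure for B",
    "purpose": "winning B's weak pawn (a7)",
    "plan": "attacking it with a rook through the 7th rank",
}
xp3 = {
    "fact": 'a backward pawn (a7) for B on the semi-open a-file in the "e" stage',
    "belief": "weakness in pawn structure for B",
    "purpose": "winning B's weak pawn (a7)",
    "plan": "attacking it with a rook through the a-file",
}

# Replacement: XP-4 is XP-3 with its fact slot filler replaced by bp.
xp4 = {**xp3, "fact": BP}

PREFIX = "attacking it with a rook through "


def join_rook_plans(first, second):
    """Join two single-rook plans into one two-rook plan (hypothetical helper)."""
    if not (first.startswith(PREFIX) and second.startswith(PREFIX)):
        raise ValueError("both plans must be single-rook attack plans")
    a, b = first[len(PREFIX):], second[len(PREFIX):]
    return f"attacking it with one rook through {a} & with another rook through {b}"


# Specialization (second kind): XP-5 joins the plans of XP-4 and XP-2.
xp5 = {**xp4, "plan": join_rook_plans(xp4["plan"], xp2["plan"])}
```

In the sketch, XP-2 and XP-4 differ only in the plan slot, which is what makes the join applicable; the feasibility check by limited search and the expert's approval are outside its scope.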

8 Summary and Future Work

We have developed an initial framework of learning strategies of explanation patterns for evaluation of basic game patterns. Our application is, to the best of our knowledge, the first application of case-based learning of XPs for game positions. These learning strategies seem to enable our system to supply better and more creative explanations for game positions than other existing computer game-playing systems. Experiments are planned in the near future to evaluate and improve the performance of our explanation system as it learns new XPs.

In chess, profound understanding has been shown to be inefficient without playing and deep searching. Therefore, to strengthen our model there is a need to implement playing modules with searching capability. In addition, there is a need to implement specific learning strategies for other games. Case-based planning is another important issue that we have to deal with more deeply using game-tree search. The learning of multiple explanation patterns (MXPs) for evaluation of any game position remains for future work.

Acknowledgments

Thanks to Sean Engelson, Avraham Norin, Shlomit Zeiger, and anonymous referees for valuable comments and suggestions.

References

Berliner, H. J. & Ackley, D. H. (1982). The QBKG System: Generating Explanations from a Non-Discrete Knowledge Representation. In Proceedings of the National Conference on Artificial Intelligence (pp. 213-216). AAAI Press.

Bradtke, S. & Lehnert, W. G. (1988). Some Experiments With Case-Based Search. In Proceedings of the Seventh National Conference on Artificial Intelligence (pp. 133-138). San Mateo: Morgan Kaufmann.

Callan, J. P., Fawcett, T. E. & Rissland, E. L. (1991). Adaptive Case-Based Reasoning. In Proceedings of a Workshop on CBR (pp. 179-190). San Mateo: Morgan Kaufmann.

Cox, M. T. & Ram, A. (1994). Interacting Learning-Goals: Treating Learning as a Planning Task. To appear in Keane, M., Haton, J. P. & Manago, M. (Eds.), Topics in Case-Based Reasoning - EWCBR'94, Lecture Notes in Artificial Intelligence. Berlin: Springer-Verlag.

de Groot, A. D. (1965). Thought and Choice in Chess. Mouton, The Hague.

Epstein, S. (1989). The Intelligent Novice - Learning to Play Better. In D. N. L. Levy & D. F. Beal (Eds.), Heuristic Programming in Artificial Intelligence - The First Computer Olympiad. Ellis Horwood.

Kass, A. M., Leake, D. & Owens, C. (1986). SWALE: A Program that Explains. In Schank, R. C. (Ed.), Explanation Patterns: Understanding Mechanically and Creatively (pp. 232-254). Hillsdale, NJ: Lawrence Erlbaum Associates.

Kass, A. M. (1990). Developing Creative Hypotheses by Adapting Explanations. Technical Report #6, p. 9. Institute for the Learning Sciences, Northwestern University, U.S.A.

Kerner, Y. (1994). Case-Based Evaluation in Computer Chess. In Proceedings of the Second European Workshop on Case-Based Reasoning (pp. 95-103). Paris: AcknoSoft Press. Extended paper to appear in Keane, M., Haton, J. P. & Manago, M. (Eds.), Topics in Case-Based Reasoning - EWCBR'94, Lecture Notes in Artificial Intelligence. Berlin: Springer-Verlag, 1995.

Kolodner, J. L. (1981). Organization and Retrieval in a Conceptual Memory for Events or Con54, Where are You? In Proceedings of the Seventh International Joint Conference on Artificial Intelligence (pp. 227-233). Los Altos: William Kaufmann.

Michie, D. (1981). A Theory of Evaluative Comments in Chess with a Note on Minimaxing. The Computer Journal, Vol. 24, No. 3, 278-286.

Pell, B. (1994). A Strategic Metagame Player for General Chess-Like Games. In Proceedings of the Twelfth National Conference on Artificial Intelligence (pp. 1378-1385). Seattle: AAAI Press/The MIT Press.

Ram, A. (1993). Indexing, Elaboration and Refinement: Incremental Learning of Explanatory Cases. Machine Learning 10 (3), 201-248.

Schank, R. C. (Ed.) (1986). Explanation Patterns: Understanding Mechanically and Creatively. Hillsdale, NJ: Lawrence Erlbaum Associates.

Schank, R. C., Kass, A. & Riesbeck, C. K. (Eds.) (1994). Inside Case-Based Explanation. Hillsdale, NJ: Lawrence Erlbaum Associates.

Schild, U. J. & Kerner, Y. (1993). Multiple Explanation Patterns. In Proceedings of the First European Workshop on Case-Based Reasoning, Vol. II (pp. 379-384). Kaiserslautern: Germany. Extended paper in Wess, S., Althoff, K-D. & Richter, M. M. (Eds.), Topics in Case-Based Reasoning - EWCBR'93, Lecture Notes in Artificial Intelligence 837 (pp. 353-364). Berlin: Springer-Verlag, 1994.

Shannon, C. E. (1950). Programming a Computer for Playing Chess. Philosophical Magazine, Vol. 41(7), 256-277.

Simon, H. A. (1974). How Big is a Chunk? Science No. 183, 482-488.

Appendix

Chess Glossary

Backward pawn: A pawn that has been left behind by neighboring pawns of its own color and can no longer be supported by them.
Closed file: A file with pawns of both colors.
Endgame: The last and deciding stage of the chess game. In this stage the position becomes simplified and usually contains a relatively small number of pieces.
Major piece: A rook or a queen.
Minor piece: A bishop or a knight.
Open file: A file without pawns.
Semi-closed file: A file with pawn/s of only one's own color.
Semi-open file: A file with pawn/s of only the opponent's color.
