
Machine Learning Techniques for Adaptive Logic-Based Multi-Agent Systems

Eduardo Alonso, Daniel Kudenko

Artificial Intelligence Group, Department of Computer Science, University of York, York YO10 5DD. Phone: 1904 432707, Fax: 1904 432767. Email: fea,[email protected]

Abstract: It is widely recognised in the agent community that one of the more important features of high-level agents is their capability to adapt and learn in dynamic, uncertain domains [11, 28]. A lot of work has recently been produced on this topic, particularly in the field of learning in multi-agent systems [1, 30, 31, 32, 40, 41]. It is, however, worth noticing that whereas some kind of logic is used to specify the (multi-)agents' architecture, mainly non-relational learning techniques such as reinforcement learning are applied. We think that these approaches are not well suited to deal with the large amount of knowledge that agents need in multi-agent systems: failing to take advantage of prior knowledge, reinforcement learning performs poorly in such systems. We propose to use logic-based learning techniques such as explanation-based learning [21] and inductive logic programming [22] instead. We have chosen conflict simulations as a hierarchically structured domain to illustrate and test the potential of this new approach.

1 Introduction

In recent years, multi-agent systems have received increasing attention in the artificial intelligence community. Research in multi-agent systems involves the investigation of autonomous, rational and flexible behaviour of entities such as software programs or robots, and their interaction and coordination in cooperative or competitive settings. There is a large number of applications of multi-agent systems in such diverse areas as robotics, information retrieval and management, and simulation. Apart from the fact that many applications are inherently distributed, multi-agent systems have significant advantages over single, monolithic, centralized problem solving: faster problem solving, decreased communication, more flexibility, and increased reliability.

It is very difficult and sometimes even impossible for a developer to foresee all potential situations an agent could encounter and to specify the agent behaviour in advance. This is especially true for multi-agent environments. Therefore it is widely recognised in the agent community that one of the more important features of high-level agents is their capability to adapt and learn [11, 28]. A lot of work has recently been produced on this topic, particularly in the field of learning in Multi-Agent Systems (MAS) [1, 30, 31, 32, 40, 41]. It is, however, worth noticing that whereas in some cases logic is used to specify the (multi-)agent architecture in order to incorporate domain knowledge, most learning algorithms that have been applied to agents are not logic-based; instead, other techniques such as reinforcement learning are used. Even though these techniques are successful in restricted domains, they strip agents of the ability to adapt in a domain-dependent fashion, based on background knowledge of the respective situation. This ability is crucial in complex domains where background knowledge has a large impact on the quality of the agents' decision making. Independently, logic-based techniques such as explanation-based learning (EBL) [21] and inductive logic programming (ILP) [22] have been studied extensively in the area of Machine Learning (ML) and have been successfully applied to a wide range of domains. We propose to apply logic-based learning techniques to endow deliberative agents in multi-agent domains with learning capabilities. The resulting framework uses logic to represent domain knowledge and individual and collective behaviour, and employs EBL and ILP methods for the learning module.

We have chosen conflict simulations as an organizationally structured domain to illustrate the potential of this new approach. Conflict simulations are models of military confrontations and are an ideal testbed for adaptive logic-based multi-agent systems because of (1) the availability of large amounts of crucial background knowledge, (2) the diversity of underlying models, which pose a challenge to the generality and adaptivity of the multi-agent system, (3) variations in complexity, which allow us to test the scalability of the system, and (4) the practical usefulness of intelligent computer opponents for military training and strategic decision making. The proposed research project will advance the state of the art of multi-agent systems research by developing and evaluating solutions for:

1. Learning in multi-agent systems: Most approaches to learning in multi-agent systems have used purely reactive architectures where agents base their actions just on the current situation and the previous history, and there is no notion of deliberate planning towards an explicit goal [14]. We believe that such approaches will be less than optimal in complex domains where knowledge-based planning and complex coordination are necessary. Logic-based learning will enable the agents to use important domain knowledge explicitly in their reasoning and adaptation processes in order to increase effectiveness.

2. Hierarchical logic-based multi-agent systems: We plan to solve coordination and communication problems in a logic-based architecture with a hierarchical agent command structure, analogous to military command hierarchies [29]. In particular, the problem of communication amongst different levels of the hierarchy will be tackled (taking into account the different world representations at the different levels).

3. Organization theory and organizational learning: Even though collective behaviour has been the object of important research in multi-agent systems (e.g., the RoboCup initiative [15]), the scale-up behaviour of the teams under consideration is quite limited. By introducing flexible organizations we will advance the study of social action in complex domains. Moreover, we think that the introduction of logic-based learning techniques in organizations will throw new light on the recently developed area of Organizational Learning [38].

4. Partially defined agent behaviour: Even though a complete definition of agent behaviour is often infeasible and too restrictive, partial restrictions on the way agents act (e.g., military doctrines) can often be useful. We plan to design a system that permits the user to implement such restrictions efficiently.

5. Explanation of agent behaviour: While in purely reactive architectures an explanation of an agent's long-term behaviour is often difficult to come up with and even harder to prove, logic-based systems do not suffer from this drawback. Since the actions are based on deliberate planning, explanations of behaviour can be extracted more easily. This is of particular interest in domains where the "why" of a solution is at least as important as the solution itself.

6. Other potential beneficiaries include military training and strategy development, and artificial intelligence for games.

2 Multi-Agent Systems

Research in Multi-Agent Systems (MAS) is concerned with coordinating behaviour among a collection of autonomous and possibly heterogeneous agents [24, 42]. We are interested in long-term MAS in which rational agents reason about how to achieve a common goal in dynamic environments. In such cases, coordination is usually achieved through general-purpose organizational structuring. Organizations can be described at a high level and at a low level.

At the high level, an organization provides a framework for agent interactions through the definition of roles, behaviour expectations, and authority relations (the chain of command). Organizations are conceptualized in terms of their structure, that is, the pattern of information and control relations that exist among agents and the distribution of problem-solving capabilities among them. These control relationships are responsible for designating the relative authority of the agents and for shaping the types of social interaction that can occur. The relationships specified by the organizational structures give general, long-term information about the agents and the community as a whole. At this level a plan is viewed as an abstract specification, a social plan structure to achieve the common goal. A suitable organizational structure for application domains such as conflict simulations is a command hierarchy, which we intend to implement in our system. The authority for decision making and control is concentrated in a single problem solver (or specialized group) at each level of the hierarchy. Interaction is through vertical communication from superior to subordinate agent, and vice versa.

At the low level, an organization is generally defined as a set of agents that are jointly committed to achieving a common goal. At this level, a plan is a complex mental attitude, a (joint) intention [26]. Certainly, teamwork is more than a simple union of simultaneous coordinated activity. While collective work does involve coordination, it additionally involves at least a common team goal and cooperation among benevolent team members [2, 3, 4, 45]. It is nonetheless worth noticing that as the main communicative and decision-making responsibilities are delegated, conditions for social action (such as mutual beliefs and joint commitments) should be relaxed.

Organizations can ensure that the agents meet conditions that are essential to successful problem solving, including [5]: coverage: any necessary portion of the overall problem must be within the problem-solving capabilities of at least one agent; connectivity: agents must interact in a manner that permits the covered activities to be developed and integrated into an overall solution; and capability: coverage and connectivity must be achievable within the communication and computation resource limitations, as well as the reliability specifications of the group. Moreover, hierarchical structures favour the use of hierarchical planners: different levels in the hierarchy can be viewed as different abstractions of a problem space. Hierarchies of abstraction are known to reduce the size of the search space and, consequently, the complexity of problem solving [16]. Coordination and communication costs are also reduced, as horizontal communication (communication among agents at the same level of the hierarchy) is avoided and vertical communication is restricted to comply with the principles of relevance, timeliness, and completeness [7].

Even though there is still controversy regarding the role that formal methods should play in Distributed AI, "logic" has been used extensively to specify cooperative problem solving (e.g., [12, 26, 45]): mathematical logic provides the structured representations needed to specify complex domains; moreover, logic is the preferred representation for knowledge and reasoning (given its naturalness, expressiveness and well-understood properties), and agents need large amounts of knowledge (domain knowledge, common-sense knowledge, social knowledge) to coordinate in MAS. It has been pointed out that MAS is one of the main research areas in AI. For example, Yoav Shoham has identified four pioneers of "Rational Programming" [34], the new paradigm of AI: Agent-Oriented Programming [33], Elephant 2000 [19], Market-Oriented Programming [43], and GALA [18]. Not surprisingly, all four are multi-agent systems. On the other hand, well-known AI researchers such as John McCarthy [20] and Nils Nilsson [23] have claimed that the challenge for the new millennium is to go back to "good old-fashioned AI" (GOFAI) and build general (unembedded) intelligent systems. These systems are nothing but agents, and the knowledge and reasoning capabilities that make them intelligent should be encoded in logic languages.
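To make the command-hierarchy idea concrete, the following is a minimal sketch of a hierarchy with purely vertical communication. All class and method names are our own illustrative choices, not part of the proposed system:

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class Agent:
    """One problem solver at some level of the command hierarchy."""
    name: str
    superior: Optional["Agent"] = None
    subordinates: List["Agent"] = field(default_factory=list)

    def add_subordinate(self, agent: "Agent") -> None:
        agent.superior = self
        self.subordinates.append(agent)

    def delegate(self, subtask: str) -> None:
        # Vertical communication only: orders flow from superior to subordinates.
        for sub in self.subordinates:
            print(f"{self.name} -> {sub.name}: {subtask}")

    def report(self, status: str) -> None:
        # ...and reports flow back up; peers never communicate directly.
        if self.superior is not None:
            print(f"{self.name} -> {self.superior.name}: {status}")

# Example: a three-level hierarchy (army HQ -> division -> unit).
army = Agent("army_hq")
division = Agent("division_1")
unit = Agent("unit_1a")
army.add_subordinate(division)
division.add_subordinate(unit)
army.delegate("secure the bridge")
unit.report("objective reached")
```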

3 Learning in Multi-Agent Systems

Research in Multi-Agent Learning (MAL) is concerned with the way agents improve their individual and collective behaviour in MAS scenarios. From a global perspective, the study of MAL benefits both the multi-agent systems community and the machine learning community:

1. There is a strong need for learning techniques and methods in the area of multi-agent systems. These systems show several characteristics that make it particularly difficult to specify them correctly and completely: for instance, there is no global system control, data is decentralized, computation is asynchronous, etc. Because of these characteristics, it is obviously desirable that the agents themselves are capable of improving their own behaviour, in addition to the overall system's behaviour. Several dimensions of multi-agent interaction can be subject to learning. These include: when to interact, with whom to interact, how to interact, and what exactly the content of the interaction should be.

2. The machine learning area can profit from an extended view capturing both single-agent and multi-agent learning. Interacting agents, as they can exchange information or modify the shared environment in which they are embedded, can significantly influence each other in their individual learning. Possible forms of influence are, for instance, initiation, acceleration, redirection, and prevention of another agent's learning process. Interaction makes it possible for learning by one agent to considerably change the conditions for learning with which other agents have to cope. In particular, interaction is the key to various forms of collective learning in which several agents try to achieve as a group what the individuals cannot, by sharing the load of learning and by pursuing a common learning goal on the basis of their diverse knowledge, capabilities, experience, preferences, and so forth.

So far, mainly reinforcement learning (RL) techniques have been used throughout the MAL literature (see [1, 30, 31, 36, 40, 41]). In reinforcement learning, reactive agents are given a description of the current state and have to choose the next action so as to maximize a scalar reinforcement received after each action. The task of the agent is to learn, from indirect, delayed reward, to choose sequences of actions that produce the greatest cumulative reward. RL is mostly used when the agent receives no examples (is not supervised) and starts with no model of the environment. We do not think, however, that RL is well suited to the MAS we are interested in. RL is limited to non-relational descriptions of objects and thus has two major limitations: (a) background knowledge can be expressed only in rather limited form (therefore, a lot of training data and experience is required); (b) the lack of relations makes the concept description language inappropriate for some domains, such as conflict simulations. No doubt, RL problems are of great interest, but it is not a good design choice to strip agents of knowledge and reasoning capabilities when they have to make social decisions in complex domains. We think that formal languages and methods should be employed instead. In particular, we plan to test the following hypothesis: logic-based learning techniques improve agents' performance in terms of efficiency and adaptability in complex, dynamic MAS. Our line of argumentation is as follows:

- The use of background knowledge improves planning and learning in multi-agent systems;
- Logic is the preferred representation for knowledge and reasoning; therefore,
- Logic-based learning techniques should be incorporated into the agents' architecture.

Roughly stated, learning in our MAS can be viewed as a two-level process. On the one hand, agents learn which control rules make them act optimally in the organization. On the other hand, agents learn from experience (and with the help of prior knowledge) which theories are more likely to predict how the world evolves, in order to build and update their (sub)plans. The first learning task puts the accent on improving the system's problem-solving capabilities, and the second on the acquisition of new knowledge for planning. Explanation-Based Learning (EBL) [21] and Inductive Logic Programming (ILP) [22] will respectively be used to cope with these learning tasks.
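For contrast with the logic-based techniques just motivated, the sketch below shows the kind of tabular, non-relational reinforcement learning that dominates the MAL literature: states and actions are opaque identifiers, so none of the background knowledge discussed above can enter the update. The toy code and hyperparameter values are ours:

```python
import random
from collections import defaultdict

alpha, gamma, epsilon = 0.1, 0.9, 0.2
Q = defaultdict(float)  # Q[(state, action)] -> estimated cumulative reward

def choose_action(state, actions):
    # Epsilon-greedy: states are opaque labels; no domain knowledge is used.
    if random.random() < epsilon:
        return random.choice(actions)
    return max(actions, key=lambda a: Q[(state, a)])

def update(state, action, reward, next_state, actions):
    # Standard one-step Q-learning backup from a scalar reward signal.
    best_next = max(Q[(next_state, a)] for a in actions)
    Q[(state, action)] += alpha * (reward + gamma * best_next - Q[(state, action)])

# Toy usage: one experience tuple from an opaque grid world.
actions = ["north", "south", "east", "west"]
a = choose_action("s0", actions)
update("s0", a, reward=-1.0, next_state="s1", actions=actions)
```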

EBL in MAL: In the first case, the agents are concerned with improving the efficiency of the problem solver (speed-up learning) rather than with acquiring more knowledge. Obviously, Distributed AI problem solvers, when presented with the same problem repeatedly, should not solve it the same way and in the same amount of time. On the contrary, it seems sensible to use general knowledge to analyze, or explain, each training example in order to optimize future performance. This learning is not merely a way of making a program run faster, but a way of producing qualitative changes in the performance of a problem-solving system. EBL is a suitable strategy to implement this kind of learning. In short, EBL extracts general rules from single examples by generating an explanation of the example and generalizing it. This provides a deductive (rather than statistical) method to turn first-principles knowledge into useful, efficient, special-purpose expertise. This sort of analytical learning has been used successfully in refining search control knowledge: by explaining why a control choice was appropriate or inappropriate, the system can learn a general search control rule. The learned rule enables the planner to make the right choice if a similar situation arises during subsequent problem solving. We will apply EBL methods to complete and correct knowledge, the knowledge the agents have about the rules of the "game". In war games [6], the commander at each level will deduce from the rules of the game under which conditions a battle (a combat, a war) is won. For example, a player can learn that if it cuts all the retreat paths of an enemy unit so that the unit must enter one of the player's zones of control in its first movement, then the player will succeed in eliminating that unit. The explanation is that the enemy is forced to stop, because there is a movement rule saying "a unit entering an enemy zone of control must immediately end its movement phase", and there is a combat rule saying that "units cannot end their retreat in an enemy zone of control (they are eliminated if they do)": first, the enemy is paralyzed; then, it is eliminated. Much like in chess, where players learn that to capture the opponent's queen they must check its king simultaneously, provided that the checking piece (and, obviously, their own king and queen) is out of the enemy's reach.
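The zone-of-control example can be read as a tiny EBL computation: the explanation of one concrete elimination chains the movement rule into the combat rule, and keeping the chained rules at the level of their variables yields a general control rule. The following sketch (our own simplified encoding, not the planned system) illustrates this composition step:

```python
# The two game rules, encoded as (conclusion, premises) pairs over a
# unit variable ?U. This string encoding is purely illustrative.
movement_rule = ("must_stop(?U)", ["enters_enemy_zoc(?U)"])
combat_rule = ("eliminated(?U)", ["must_stop(?U)", "all_retreats_blocked(?U)"])

def compose(rule_a, rule_b):
    """Chain two rules: the premise of rule_b that rule_a proves is
    replaced by rule_a's own premises. Because everything stays at the
    level of the variable ?U, this is the generalization step of EBL:
    the explanation built for one concrete unit becomes a general rule."""
    head_a, body_a = rule_a
    head_b, body_b = rule_b
    new_body = [p for p in body_b if p != head_a] + body_a
    return (head_b, new_body)

# One training episode explains why a particular enemy unit died; composing
# the rules used in that explanation yields a reusable control rule:
macro_rule = compose(movement_rule, combat_rule)
print(macro_rule)
# ('eliminated(?U)', ['all_retreats_blocked(?U)', 'enters_enemy_zoc(?U)'])
```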

EBL will also be used to restructure the organization. Learning more efficient control rules in this respect means that the system learns from success and failure which actions and/or agents are needed and which ones are superfluous for a concrete (sub)task, which roles should be reassigned, which subtasks should be executed first, etc.

ILP in MAL: The second learning task to be accomplished in this scenario emerges when rationally bounded agents acquire new knowledge (i.e., rules) about the conditions of their (social) attitudes in collaborative and competitive settings. Because structural reorganization can be costly and time-consuming and specific problem characteristics cannot be predicted beforehand, an organizational structure will enable agents to react suitably to changing situations. At each level, agents have relative autonomy (as long as their decisions do not lead to incoherent global behaviour) to learn how to form and change their plans to achieve a subgoal according to their experience. For example, an agent can learn that its subtask is no longer attainable [12]. It is one thing to deduce from general rules the order and assignment of the subtasks that should, in theory, lead the army to victory; it is another to learn from examples (and prior knowledge) whether the conditions for such subtasks to be executed actually hold on the spot. An optimal EBL-learned rule can turn out to be impractical in real-life domains in which agents work with incomplete knowledge. In the above example, a player's unit can learn that its type, combat strength, or movement allowance is not adequate to accomplish its subtask because, to get to its attacking position, it has to cross an unexpected forest. Unlike chess, war is a dynamic "game". ILP methods compute, through inverse resolution or, alternatively, "top-down" refinement techniques, an inductive hypothesis that explains sets of observations with the help of background knowledge expressed as logic programs. So far, logic-based learning techniques have been applied mainly to concept learning. By demonstrating the potential of ILP for learning in MAS scenarios, we hope to show that a logic-based learning method can be applied to less confined learning tasks. Specifically, we plan to use ILP for the acquisition of new domain and strategic knowledge from training data. For example, a target hypothesis to be acquired can be a rule of the form "IF the agent and the world are in state S AND plan P is executed THEN the agent will succeed (i.e., a state in which the agent's subgoal holds will be achieved)". For an agent to induce such a hypothesis, it will receive positive and negative examples of the target concept (i.e., successful decisions). The initial hypothesis is based on background knowledge (and thus might be incomplete). After executing a plan, each agent will check its hypothesis according to success or failure. If the original hypothesis was correct in predicting success, no changes will be carried out. Otherwise, a new hypothesis will be computed based on the training set extended with the new example. A sketch of this hypothesis-maintenance loop is given below.

It is our understanding that the use of logic-based learning techniques in hierarchical organizations can be beneficial for both the multi-agent community and the machine learning community. In our hierarchical MAS, an individual agent acts as a "specialist" who is responsible for a specific subset of the activities that as a whole form the overall learning process. Different benefits can be expected from the integration of abstraction and learning. First, the learning module can be applied within each abstract problem space, which enables the agents to solve problems within their respective abstraction spaces. In addition, the abstract problem spaces define more general and thus simpler problems and theories, enabling agents to learn more efficient and appropriate control rules.
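The hypothesis-maintenance loop referred to above can be sketched as follows. Here `induce_hypothesis` is a crude stand-in for a genuine ILP learner based on inverse resolution or top-down refinement, and the state features are invented purely for illustration:

```python
def induce_hypothesis(positives, negatives):
    # Most-specific conjunction of features covering all positives.
    # A real ILP system would instead search a lattice of clauses,
    # guided by background knowledge expressed as logic programs.
    hypothesis = set.intersection(*positives) if positives else set()
    assert all(not hypothesis <= neg for neg in negatives), \
        "no conjunctive hypothesis separates the examples"
    return hypothesis

def predicts_success(hypothesis, state):
    return hypothesis <= state

positives, negatives = [], []
hypothesis = set()  # stands in for an initial background-knowledge hypothesis

def observe(state, succeeded):
    """Called after each executed plan; relearn only on a wrong prediction."""
    global hypothesis
    predicted = predicts_success(hypothesis, state)
    (positives if succeeded else negatives).append(state)
    if predicted != succeeded:
        hypothesis = induce_hypothesis(positives, negatives)

# Invented training episodes for a single subtask:
observe({"road_clear", "supplied"}, True)
observe({"road_clear"}, False)
observe({"road_clear", "supplied", "air_cover"}, True)
print(hypothesis)  # {'road_clear', 'supplied'}
```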
The major advantage of using logic-based learning techniques is the use of background knowledge. Specifically, the advantages are: (a) an expressive knowledge representation formalism, which enables the user to describe complex domains (such as conflict simulations); (b) the incorporation of domain knowledge in the learning and reasoning processes, resulting in increased effectiveness; (c) a reduction of the hypothesis space: it can only include those theories that are consistent with prior knowledge; and (d) the results of the learning and reasoning processes are easier to read and understand, and therefore it becomes easier to check and explain the behaviour of the MAS (e.g., compared to reinforcement learning architectures).

4 Application Domain: Conflict Simulations

We decided to use conflict simulations as an appropriate application domain for the proposed research. Before we describe the reasons for this choice, we first present a definition. Conflict simulations (or war games) [6, 25, 35] can generally be defined as models of military confrontations. Usually these models involve a map of the battlefield, military units, and a set of simulation rules that define the abilities of the units, combat resolution, terrain effects, etc. The scale of the models can range from tactical skirmishes to grand strategic campaigns.
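As an illustration only (the encoding below is ours, not a committed design), such a model might be represented with a few simple structures over which the simulation rules are then defined:

```python
from dataclasses import dataclass
from typing import List, Tuple

@dataclass
class Unit:
    side: str
    kind: str              # e.g. "infantry", "armour"
    combat_strength: int
    movement_allowance: int
    position: Tuple[int, int]  # hex coordinates on the map

@dataclass
class Terrain:
    name: str
    defense_bonus: int     # terrain effect on combat resolution
    movement_cost: int

def odds_ratio(attackers: List[Unit], defender: Unit, terrain: Terrain) -> float:
    # The rules of the game are functions over these structures, e.g. a
    # combat-resolution table indexed by the attacker/defender odds ratio.
    attack = sum(u.combat_strength for u in attackers)
    return attack / (defender.combat_strength + terrain.defense_bonus)
```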

Conflict simulations are widely used by the military to train personnel and to develop and test battle strategies. There is a wide range of commercial simulations available (both "on paper" and as computer software) that are popularly used for educational and entertainment purposes. Even though the commercial simulations vary greatly in realism, there are examples of products that have been picked up by the military. For example, a commercial wargame (Gulf Strike [10]) was the basis for some of NATO's strategic decisions after the invasion of Kuwait by Iraq [6]. Conflict simulations are a fitting and challenging application domain to test and evaluate logic-based multi-agent systems because of the following properties:

- Large amount of useful background knowledge: One of the major advantages of logic-based systems is the ability to incorporate large amounts of domain knowledge in the reasoning processes. For conflict simulations such domain knowledge is readily available in the form of the simulation model, for example, the capabilities of different units, the defensive properties of terrain, etc.

- Different degrees of complexity: The number of agents and the complexity of the underlying simulation model can be easily varied (number of units, size of the map, complexity of rules, etc.), and therefore how well the multi-agent system scales up can be tested.

- Diversity of conflict models: Even though conflict simulations are a specific class of applications, there are many variations. For example, a simulation of D-Day will be considerably different from a simulation of Operation Desert Storm. Different simulation models require different strategies, and this poses a challenge to the generality and adaptivity of the multi-agent system.

In addition, adaptive logic-based multi-agent techniques are highly promising for creating an intelligent computer opponent for conflict simulations, for the following reasons:

- Domain knowledge: Conflict simulation models capture a large amount of information on the domain. This information is crucial for intelligent planning, and therefore we do not expect simple reactive multi-agent systems to perform well. Logic-based approaches are able to use the available knowledge in planning and learning and are thus ideally suited.

- Complexity: Even on a small scale, standard approaches to artificial intelligence for games, such as minimax search, are infeasible for conflict simulations. Since all units of one side move simultaneously, the breadth of the resulting game tree is orders of magnitude greater than that of chess, for example (see the back-of-the-envelope calculation after this list). This problem is further emphasized in medium to large simulations that involve hundreds of units. Since a global search for the optimal strategy is not possible, the search process has to be localized via a multi-agent approach.

- Coordination: Battles require a high degree of coordination between the units at different levels of organization. For example, units have to coordinate single attacks locally, but at the same time all attacks at different parts of the battlefield have to be coordinated globally. Multi-agent systems are ideal for the development of hierarchical coordination techniques, and this will be a major emphasis of the proposed research project.
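The back-of-the-envelope calculation referred to above, with figures chosen purely for illustration:

```python
# In chess, one piece moves per turn: the branching factor is roughly 35.
chess_branching = 35

# In a conflict simulation, all n units of a side move in the same turn;
# with m options per unit, the branching factor is m ** n. Even a small
# scenario of 20 units with 5 options each dwarfs the chess game tree:
units, moves_per_unit = 20, 5
sim_branching = moves_per_unit ** units
print(f"{sim_branching:.2e}")  # ~9.5e13 branches per turn, versus 35 for chess
```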

We plan to focus our applications initially on turn-based variants of conflict simulations. This restriction will remove problems of real-time updates and computational complexity, and makes particular sense at an operational or strategic level, where each turn represents six or more hours of battle action. By reducing the time allocated to each turn, we can move the application gradually closer to real-time simulations at later stages of development. Note that for an effective application of intelligent computer opponents in conflict simulations, the user should have the option to model military doctrines into the agent behaviour. For instance, computer opponents simulating Iraqi forces during Operation Desert Storm should behave differently from computer opponents simulating British troops, e.g., in the choice of attack formations. We plan to accommodate this need in our MAS design.
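One simple way such doctrine restrictions could be realized, sketched here with invented names and purely illustrative doctrines, is as user-supplied predicates that filter the planner's candidate plans:

```python
from typing import Callable, Dict, List

Plan = Dict[str, str]
Doctrine = Callable[[Plan], bool]

def frontal_assaults_only(plan: Plan) -> bool:
    # An invented doctrine: this force only attacks in frontal formation.
    return plan["formation"] == "frontal"

def avoid_night_operations(plan: Plan) -> bool:
    # Another invented doctrine: no operations are planned at night.
    return plan["time"] != "night"

def admissible(plans: List[Plan], doctrines: List[Doctrine]) -> List[Plan]:
    # The planner discards any candidate that violates an active doctrine.
    return [p for p in plans if all(d(p) for d in doctrines)]

candidates = [
    {"formation": "frontal", "time": "day"},
    {"formation": "flanking", "time": "day"},
    {"formation": "frontal", "time": "night"},
]
print(admissible(candidates, [frontal_assaults_only, avoid_night_operations]))
# [{'formation': 'frontal', 'time': 'day'}]
```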

5 Evaluation

We will evaluate the model according to three parameters, namely:

- Effectiveness of the multi-agent system: What is the quality of the decisions made by the MAS? This can be measured by win/loss statistics against human opponents or against baseline systems using reactive architectures, to highlight the advantages (or disadvantages) of our logic-based approach. Win/loss statistics can be made more transparent by including more detailed information such as casualty numbers, the number of turns to win, etc.

- Learning/adaptation ability: How well does the multi-agent system adapt to static opponents? Most strategies have counter-strategies. The ability of a multi-agent system to find and employ such counter-strategies when faced with an opponent that uses a certain strategy over and over will show the success of the learning strategies. Furthermore, the ability of a multi-agent system that has been trained in one conflict model to adapt to and win in a different environment (conflict simulation model) can be used as additional evidence.

- Effectiveness gain from background knowledge: How much does the behaviour of the MAS improve when additional useful background knowledge is added? In other words, how well can the MAS incorporate background knowledge in its reasoning? This can be tested by comparing the performance of different versions of the MAS with different amounts of domain knowledge.
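The third criterion suggests a straightforward experimental harness: run otherwise identical MAS variants that differ only in the amount of background knowledge, and tabulate win/loss statistics. The sketch below shows the shape of such a harness; `play_game` is a placeholder for a full simulation run, and its numbers are fabricated solely to make the code executable:

```python
import random

def play_game(knowledge_level: int) -> dict:
    # Placeholder: a real run would pit the MAS variant against a fixed
    # opponent inside the conflict simulation and record the outcome.
    win = random.random() < 0.3 + 0.1 * knowledge_level
    return {"win": win, "turns": random.randint(10, 40)}

def evaluate(knowledge_level: int, n_games: int = 100) -> dict:
    results = [play_game(knowledge_level) for _ in range(n_games)]
    wins = [r for r in results if r["win"]]
    return {
        "win_rate": len(wins) / n_games,
        "avg_turns_to_win": sum(r["turns"] for r in wins) / max(len(wins), 1),
    }

for level in (0, 1, 2, 3):  # increasing amounts of background knowledge
    print(level, evaluate(level))
```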

6 Related Work

Our work is related to other work in multi-agent systems, machine learning, and conflict simulations.

Multi-Agent Systems: As mentioned earlier, most implementations of multi-agent learning continue to use reinforcement learning and neural network techniques [1, 30, 31, 40, 41]. An exception is the research developed by Sugawara and Lesser [37]. They study learning methods to acquire coordination plans in distributed problem-solving environments. Specifically, they develop a distributed learning component based on EBL techniques and comparative analysis. However, they do not separate out coordination in teamwork from coordination in general. As a result, they fail to exploit the responsibilities and commitments of teamwork in building up coordination relationships. Our work is also related to model-based collaborative systems such as GRATE [13], COLLAGEN [27], and STEAM [39]. These systems provide agents with a model of teamwork based on joint intentions. However, the domains are very limited, both in the number of agents involved and in the tasks they have to perform. Moreover, the capabilities that the agents need for collaboration are designed from the start, so learning is not an important part of the agents' architecture.

Machine Learning: The EBL module in our model is closely related to Knoblock et al.'s research on integrating abstraction and EBL [17]. Nevertheless, their motivation is to reduce the search space in a single problem solver and, consequently, the hierarchies do not represent different levels of expertise and the search is not executed in a distributed way. On the other hand, Dzeroski et al. [8] have recently introduced relational reinforcement learning (RRL), which combines logic programming and RL techniques. RRL employs the Q-learning method, where the Q-function is learned using a relational regression tree algorithm. The purpose of that work is to achieve generalization in RL without using neural networks. However interesting this new approach may be, the exact scale-up behaviour of RRL has still to be determined.

Conflict Simulations: Conflict simulations have been used by the military since ancient times and have received increased attention in the past decades [6, 25]. There have been approaches in the past to implement computer forces for military research [29, 39] and for entertainment purposes [9, 44]. Approaches that were not based on multi-agent techniques suffered strongly from a lack of unit coordination. Recent research systems employ multi-agent approaches [39], but logic-based approaches for learning and adaptation have not yet been thoroughly studied in this domain.

7 Summary

It is difficult and often infeasible to specify Multi-Agent Systems completely in advance, because there are frequently unforeseen situations that the agents may encounter. One solution is the use of machine learning techniques to enable agents to learn from and adapt to the environment. We aim at developing algorithms for learning and coordination in multi-agent environments. Specifically, we plan to test the hypothesis that logic-based systems will perform effectively because of their ability to directly incorporate domain knowledge in the reasoning processes of the agents. We plan to focus on conflict simulations (models of military battles) as an ideal application domain. In the proposed research we will apply Inductive Logic Programming and Explanation-Based Learning methods to logic-based multi-agent systems, with the following expected outcomes: (1) increased ability of Multi-Agent Systems to adapt to new/unknown environments, (2) exploitation of background knowledge in the reasoning processes of the Multi-Agent Systems, (3) overcoming the communication bottleneck between agents in hierarchical command structures, and (4) development of a tool for military training and strategy development.

References

[1] Proceedings of the IJCAI'99 Workshop on Agents Learning About, From and With Other Agents. Stockholm, Sweden, 1999.
[2] E. Alonso. How individuals negotiate societies. In Proc. ICMAS-98, pages 18-25. IEEE Computer Society Press, 1998.
[3] E. Alonso. An individualistic approach to social action in Multi-Agent Systems. Journal of Experimental and Theoretical Artificial Intelligence, in press, 1999.
[4] P.R. Cohen and H.J. Levesque. Teamwork. Noûs, 25:487-512, 1991.
[5] D.D. Corkill and V.R. Lesser. The use of meta-level control for coordination in a distributed problem solving network. In Proc. IJCAI-83, pages 748-756, 1983.
[6] J.F. Dunnigan. The Complete Wargames Handbook. Morrow, New York, 1992.
[7] E.H. Durfee, V.R. Lesser, and D.D. Corkill. Coherent cooperation among communicating problem solvers. IEEE Transactions on Computers, 36:1275-1291, 1987.
[8] S. Dzeroski, L. De Raedt, and H. Blockeel. Relational reinforcement learning. In Eighth International Conference on Inductive Logic Programming, pages 11-22, 1998.
[9] S. Estvanik. Artificial intelligence in wargames. AI Expert, pages 23-31, May 1994.
[10] M. Hermann. Gulf Strike. Victory Games, 1988.
[11] M.N. Huhns and M.P. Singh (Eds.). Readings in Agents. Morgan Kaufmann, San Francisco, CA, 1998.
[12] N.R. Jennings. On being responsible. In Proc. MAAMAW-92, pages 93-102, 1992.
[13] N.R. Jennings. Controlling cooperative problem solving in industrial multi-agent systems using joint intentions. Artificial Intelligence, 75:195-240, 1995.
[14] L.P. Kaelbling, M.L. Littman, and A.W. Moore. Reinforcement learning: A survey. Journal of Artificial Intelligence Research, 4:237-285, 1996.
[15] H. Kitano, Y. Kuniyoshi, I. Noda, M. Asada, H. Matsubara, and E. Osawa. RoboCup: A challenge problem for AI. AI Magazine, 18:73-85, 1997.
[16] C.A. Knoblock. Automatically generating abstractions for planning. Artificial Intelligence, 68:243-302, 1994.
[17] C.A. Knoblock, S. Minton, and O. Etzioni. Integrating abstraction and explanation-based learning in PRODIGY. In Proc. AAAI-91, pages 93-102, 1991.
[18] D. Koller and A. Pfeffer. Generating and solving imperfect information games. In Proc. IJCAI-95, 1995.
[19] J. McCarthy. Elephant 2000: A programming language based on speech acts. Unpublished manuscript, 1990.
[20] J. McCarthy. Some expert systems need common sense. In Formalizing Common Sense: Papers by John McCarthy, pages 189-197, 1990.
[21] T.M. Mitchell, R. Keller, and S. Kedar-Cabelli. Explanation-based generalization: A unifying view. Machine Learning, 1:47-80, 1986.
[22] S. Muggleton and L. De Raedt. Inductive logic programming: Theory and methods. Journal of Logic Programming, 19:629-679, 1994.

[23] N.J. Nilsson. Eye on the Prize. Unpublished manuscript, 1995.
[24] G.M.P. O'Hare and N.R. Jennings. Foundations of Distributed Artificial Intelligence. John Wiley, New York, NY, 1996.
[25] P.P. Perla. The Art of Wargaming. Naval Institute Press, Annapolis, MD, 1990.
[26] A. Rao, M. Georgeff, and E. Sonenberg. Social plans: A preliminary report. In Proc. MAAMAW-92, pages 57-76, 1992.
[27] C. Rich and C. Sidner. COLLAGEN: When agents collaborate with people. In Readings in Agents, pages 117-124, 1997.
[28] S. Russell and P. Norvig. Artificial Intelligence: A Modern Approach. Prentice Hall, Upper Saddle River, NJ, 1995.
[29] M.R. Salisbury, D.W. Seidel, and L.B. Booker. A brief review of the Command Forces (CFOR) program. In Proceedings of the Winter Simulation Conference, 1995.
[30] S. Sen (Ed.). Adaptation, Coevolution and Learning in Multiagent Systems. Technical Report SS-96-01, AAAI Press, Menlo Park, CA, 1996.
[31] S. Sen (Ed.). Multiagent Learning. Technical Report WS-97-03, AAAI Press, Menlo Park, CA, 1997.
[32] S. Sen (Ed.). Special issue on evolution and learning in multiagent systems. International Journal of Human-Computer Studies, 48, 1998.
[33] Y. Shoham. Agent-oriented programming. Artificial Intelligence, 60:51-92, 1993.
[34] Y. Shoham. Rational programming. Unpublished manuscript, 1997.
[35] M. Shubik and G. Brewer. Models, Simulations, and Games: A Survey. Rand, Santa Monica, CA, 1972.
[36] P. Stone and M. Veloso. Multiagent systems: A survey from a machine learning perspective. Technical Report CMU-CS-97-193, Computer Science Department, Carnegie Mellon University, 1997.
[37] T. Sugawara and V. Lesser. Learning to improve coordinated actions in cooperative distributed problem-solving environments. Machine Learning, 33:129-153, 1998.
[38] K. Takadama, T. Terano, and K. Shimohara. Can multiagents learn in organization? Analyzing an organizational-learning oriented classifier system. In Proceedings of the IJCAI'99 Workshop on Agents Learning About, From and With Other Agents, 1999.
[39] M. Tambe. Towards flexible teamwork. Journal of Artificial Intelligence Research, 7:83-124, 1997.
[40] G. Weiss and S. Sen (Eds.). Adaption and Learning in Multi-Agent Systems. Proc. IJCAI'95 Workshop, LNAI 1042, Springer, 1996.
[41] G. Weiss (Ed.). Distributed Artificial Intelligence Meets Machine Learning. Proc. ECAI'96 Workshop LDAIS, LNAI 1221, Springer, 1997.
[42] G. Weiss (Ed.). Multiagent Systems: A Modern Approach to Distributed Artificial Intelligence. The MIT Press, 1999.
[43] M.P. Wellman. A market-oriented programming environment and its application to distributed multicommodity flow problems. Journal of AI Research, 1:1-23, 1993.
[44] S. Woodcock. Game AI: The state of the industry. Game Developer, pages 35-43, August 1999.
[45] M. Wooldridge and N.R. Jennings. Towards a theory of cooperative problem solving. In Proc. MAAMAW-94, Workshop on Distributed Software Agents and Applications, pages 40-53, 1994.
