Plan Description and Execution with Invariants
The Planning System of the RoboCup Team "Mostly Harmless"

Gordon Fraser, Franz Wotawa
Graz University of Technology
Institute for Software Technology (IST)
A-8010 Graz, Inffeldgasse 16b/2, Austria
{fraser, wotawa}@ist.tugraz.at

1 Introduction

The RoboCup mid-size league offers a very dynamic, rapidly changing environment that comes closer to the real world than any other league: two teams of four autonomous robots each compete in a soccer game played under only slightly modified FIFA rules. There are no external sensors; all data are gathered by the sensors mounted on the robots. Given this dynamic pace, it is natural to focus on building a fast, reactive system, and many teams therefore use state machines or decision trees as their decision-making systems. Due to the very nature of the mid-size league, this is a good approach for creating a competitive team: of all RoboCup leagues, the mid-size league offers the most challenges, many problems still await a good solution, and intelligent decision making is not yet a main focus of attention. To a large extent, resources and effort go into creating a solid platform, which is by itself a daunting task that never seems to be completed. However, only on a good, working platform do AI techniques give an advantage over other teams.

Reactive systems may be a good choice for making agents act quickly, but they have several obvious drawbacks: extensibility, adaptability, and foresight, for instance, suffer. Going a step further, what are the chances that such a system could behave intelligently in another domain? Opinions on this topic diverge. On one side, there are theories stating that there is no need for explicit knowledge representation and that intelligence emerges from the combination of primitive behaviours (e.g., [Bro86]). On the other side, there are supporters of logic-based systems with an explicit representation of knowledge, the "classic" AI approach [APC03, GINR99, Bee02]. Reconsidering the RoboCup mid-size scenario, it would be desirable to have agents that act in a goal-oriented and rational manner.
After all, this is exactly what human football players do: they are assigned goals specific to their role in the team and act accordingly, reactively on a low level (no football player has to think about how to handle a ball) but analytically and deliberately on a higher level. The act of planning yields whatever sequence of actions is necessary to fulfill such a goal. If the goals change due to changes in the environment (an altered strategy, commands from the coach, and so on), a player is able to adapt quickly and again figure out which actions are necessary to fulfill the new goals. This is exactly what one should expect of a robot agent as well, and it is the approach chosen for the decision-making layer of the Mostly Harmless mid-size robots [ea03]. This article describes the system and the problems that have to be solved in order to obtain a capable, planning autonomous agent.

2 Planning

A planning problem is formally defined by three inputs: a description of the world before the plan is executed, the goal, and the actions the agent is able to perform, commonly referred to as operators. This in fact specifies a class of planning problems, parametrized by the language used. Planning has been an active research topic for decades, with a major turning point in 1971 when Fikes and Nilsson presented their STRIPS planning system [RN72]. A STRIPS-like representation is to the present day still the preferred choice for planning systems, and not surprisingly the planning system described in this article follows this approach. Of course, there has been a great deal of research since 1971, and many extensions and improvements have been discussed ([Wel94], [Wel99]), many of them nowadays taken for granted in a state-of-the-art planning system.

The original STRIPS representation describes the world state as a complete set of ground literals. It defines an action as consisting of a precondition, given as a conjunction of propositions, and an effect, described by an add-list containing the literals that are added to the world state after the action is executed and a delete-list containing the literals that are removed from the world description after execution. These actions are commonly referred to as operators, and the set of all operators is called a domain theory. An operator can be executed in any state satisfying its precondition; the resulting world state is calculated by removing each literal of the delete-list from the state and adding each literal of the add-list to it. This so-called classical planning problem makes many simplifying assumptions: atomic time, deterministic effects, omniscience, the agent as sole cause of change, no exogenous events, etc.
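The STRIPS state-transition semantics just described can be sketched compactly. The following is a minimal illustration, not the paper's actual implementation: states are sets of ground literals (encoded as strings), and the class and predicate names are our own.

```python
class Operator:
    """A ground STRIPS operator: precondition, add-list, delete-list."""

    def __init__(self, name, precond, add_list, del_list):
        self.name = name
        self.precond = set(precond)    # conjunction of literals that must hold
        self.add_list = set(add_list)  # literals added after execution
        self.del_list = set(del_list)  # literals removed after execution

    def applicable(self, state):
        # The operator may fire in any state satisfying its precondition.
        return self.precond <= state

    def apply(self, state):
        # Resulting state: remove delete-list literals, then add add-list ones.
        return (state - self.del_list) | self.add_list


# Illustrative example in the spirit of the pass-ball action from Section 4,
# grounded for two players p1 and p2 (literal names are assumptions).
pass_ball = Operator(
    "pass_ball",
    precond={"HasBall(p1)", "Teammate(p2)", "Inreach(p2,p1)", "Facing(p1,p2)"},
    add_list={"HasBall(p2)"},
    del_list={"HasBall(p1)"},
)

state = {"HasBall(p1)", "Teammate(p2)", "Inreach(p2,p1)", "Facing(p1,p2)"}
if pass_ball.applicable(state):
    state = pass_ball.apply(state)
# state now contains HasBall(p2) and no longer HasBall(p1)
```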
The STRIPS representation by itself is very limited, so several extensions have to be considered. In particular, the following have been integrated for planning in the Mostly Harmless robots:

Action schemata with variables are a very obvious way to boost the representational power of operators: instead of having a discrete operator for each possible action, actions are described using parameters that define the objects they are applied to.

Disjunctive preconditions can be very handy and are easy to integrate in a planning system, although they can enlarge the search space enormously. They are not usable for effects, though, as that would make actions non-deterministic.

Negated preconditions are supported as well.

Universal quantification is very powerful indeed and very useful for describing real-world actions.

The drawback of all these extensions is that the more powerful the representation language, the more complex planning becomes, and the more complex planning is, the longer it takes. This is a major drawback when expecting an agent to act quickly. Having defined the syntax for the planning problem, the necessary steps to proceed are

• Plan creation
• Plan execution
• Plan monitoring and repair
• Knowledge engineering
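The first extension above, action schemata with variables, amounts to instantiating a parameterized template over the domain's constants. The sketch below shows one plausible grounding scheme; the schema format, the goto example, and all predicate and object names are illustrative assumptions, not the paper's actual machinery.

```python
from itertools import permutations

def ground(schema_name, params, precond, add_list, del_list, objects):
    """Instantiate an action schema for every assignment of objects to
    its parameters, yielding ground STRIPS operators as dicts.

    Literals use str.format placeholders, e.g. "At({r},{p})".  Nonsensical
    groundings (wrong object types) are harmless: their type-checking
    precondition literals simply never hold, so they are never applied.
    """
    ops = []
    for binding in permutations(objects, len(params)):
        subst = dict(zip(params, binding))
        fill = lambda lits: {l.format(**subst) for l in lits}
        ops.append({
            "name": schema_name + str(binding),
            "precond": fill(precond),
            "add": fill(add_list),
            "delete": fill(del_list),
        })
    return ops


# goto(r, p): move a robot to a symbolic position (names are assumptions).
ops = ground(
    "goto", ("r", "p"),
    precond={"Robot({r})", "Position({p})"},
    add_list={"At({r},{p})"},
    del_list=set(),
    objects=["robot1", "pass_position"],
)
# Two parameters over two objects yield two ground operators.
```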


Of these, plan creation has received the most research attention in the past. It is a daunting problem, and a plethora of more or less efficient solutions exist. Even in its most primitive form, progressive or regressive state-space search, it can be sufficient as long as the representation language is simple enough, while even more sophisticated algorithms, from early UCPOP [PW92] to Graphplan [BF95] to compilation to a SAT problem [KS92], [EMW97], can quickly become disastrously slow and inefficient as soon as the problem grows too complex. As much effort went into keeping the RoboCup problem as simple as possible, any of these algorithms suffices for our planning system up to now; this will definitely need reconsideration once the complexity of the domain description and of the actions the agents can perform reaches a certain level. Total-order regression planning is used for now, but this choice was only due to time limitations when implementing the system. For the system described in this article, the main concerns are monitoring and knowledge engineering.
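Total-order regression planning, mentioned above, can be sketched in a few lines: search backward from the goal, regressing it through operators until the remaining subgoal holds in the initial state. This is a generic textbook-style sketch under our own encoding (sets of positive ground literals, operators as dicts), not the team's implementation; all operator names are invented for the example.

```python
from collections import deque

def regress(goal, op):
    """Regress a goal through an operator; None if the operator is unusable."""
    if not (op["add"] & goal):      # must achieve at least part of the goal
        return None
    if op["delete"] & goal:         # must not destroy another part of it
        return None
    return (goal - op["add"]) | op["precond"]

def plan(initial, goal, operators):
    """Breadth-first regression search; returns a list of operator names."""
    frontier = deque([(frozenset(goal), [])])
    seen = {frozenset(goal)}
    while frontier:
        subgoal, suffix = frontier.popleft()
        if subgoal <= initial:      # subgoal already true initially: done
            return suffix
        for op in operators:
            g = regress(subgoal, op)
            if g is not None and frozenset(g) not in seen:
                seen.add(frozenset(g))
                frontier.append((frozenset(g), [op["name"]] + suffix))
    return None


# Toy two-step soccer example (all names are assumptions).
operators = [
    {"name": "goto_goal_area",
     "precond": {"At(r,midfield)"},
     "add": {"At(r,goal_area)"},
     "delete": {"At(r,midfield)"}},
    {"name": "kick",
     "precond": {"At(r,goal_area)", "HasBall(r)"},
     "add": {"Scored()"},
     "delete": {"HasBall(r)"}},
]
result = plan({"At(r,midfield)", "HasBall(r)"}, {"Scored()"}, operators)
# → ["goto_goal_area", "kick"]
```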

3 Plan Execution And Monitoring

Execution of a plan in the real world may not result in the intended outcome; hence researchers in the planning community are increasingly concerned with executing plans and with the design of systems in which planning and execution are continually interleaved and actively managed. Such systems must often be able to handle actions with duration, simultaneous execution of actions, plans with conditionals and loops, and plan failure. In the "Mostly Harmless" robots, basic behaviours like ball handling or obstacle avoidance are handled at a lower, reactive level, leaving only the relevant parts of actions to be controlled by plan execution. Currently, actions are executed sequentially, as movement actions cannot be executed in parallel; controlling the kick mechanism to kick a ball, however, can be done in parallel. In the near future, communication actions are to be integrated, and these of course have to run in parallel: it would not make sense to stop the robot for communication.

Traditionally, plan monitoring is done at two distinct levels: action monitoring and plan monitoring [RN95]. Action monitoring checks an operator's precondition before and during execution; this is sensible, as it allows for quick reaction and plan repair. Plan monitoring, on the other hand, checks the preconditions of all of the plan's actions, regarding only those terms that are not influenced as effects of other actions. This idea has been extended in the Mostly Harmless planning system by adding the concept of invariants to plans. Every plan is assigned an invariant that has to be valid along the entire execution path. Looking at this more closely, a goal description now consists of the following parts, each of which is a first-order logic sentence (with the described limitations for effect descriptions):

• Precondition
• Invariant
• Goal

The precondition describes the state the world has to be in before starting to create and execute a plan.
The goal is used by the planning algorithm to create a sequence of actions. Finally, the invariant is constantly checked during execution. It can be equal to the precondition, but this is not necessarily the case. Consider a typical football situation: a defending player near his own goal, while the opponent team possesses the ball and is approaching the goal (see Figure 1). Now a simplified player's job (goal)

Figure 1: Example soccer situation. Opponents moving from right to left.

might look like this (regardless of how realistic it might be): "If the opponent team is approaching our own goal with the ball, there are two approaching opponents, and there is a teammate who is closer to the opponent in ball possession, then go to the second opponent and block him from receiving a pass, as long as this second opponent is a threat." This sentence includes a precondition, an invariant, and a goal description, and in simplified form it could be expressed in logic as follows:

Precondition: ∃x, y (HasBall(x) ∧ Opponent(x) ∧ Opponent(y) ∧ Approaching(x) ∧ Approaching(y))

Invariant: Inreach(x, y) ∧ HasBall(x)

Goal: Blocking(x, y)

Efficiently checking a plan's validity is especially crucial when acting in a dynamic world like RoboCup, where all robots on the field are in constant movement and the validity of plans therefore changes rapidly. Being able to react quickly to a change in the environment is a key factor for being able to use planning instead of a purely reactive approach.
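The invariant-monitored execution scheme described in this section can be sketched as a simple control loop: before each action, re-sense the world and check the invariant; abandon the plan and trigger replanning the moment it fails. The function and its callbacks are our own illustrative framing, not the team's control software.

```python
def execute(plan, invariant, world, perform):
    """Run the plan's actions in order, checking the invariant each step.

    invariant -- predicate over the current world state
    world     -- callable returning the current (sensed) world state
    perform   -- callable executing one action in the real world
    Returns "done" on success, "replan" if the invariant was violated.
    """
    for action in plan:
        state = world()
        if not invariant(state):
            return "replan"          # plan is no longer valid; abandon it
        perform(action)
    return "done"


# Stub usage mirroring the blocking example above: the invariant requires
# the first opponent to still hold the ball.  After one action, sensing
# reports he lost it, so only the first action runs before replanning.
log = []
states = iter([{"HasBall(x)", "Inreach(x,y)"}, {"Inreach(x,y)"}])
result = execute(["goto_second_opponent", "block_pass"],
                 lambda s: "HasBall(x)" in s,
                 lambda: next(states),
                 log.append)
# → result == "replan", log == ["goto_second_opponent"]
```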

4 Knowledge Engineering And Representation

Efficient plan creation and monitoring algorithms are one aspect of getting a planning system with good performance, but even with the most efficient algorithms available, performance depends to a large extent on the representational power of the language being used. First, one needs a set of predicates and constants to make up the language. It is helpful to consider what actions an agent is able to perform, which in a robot's case might simply be defined by its physical abilities. For example, driving one of the four omnidirectional wheels of a Mostly Harmless robot could be an action; however, purely symbolic planning is the goal, so there is a need for actions that are parameterizable in a symbolic way only, e.g., goto, dribble, pass, etc. Having settled on the actions to use helps in finding the constants, which are all the objects these actions can act upon. These are, for one, the objects on the football field: the ball, the goals, teammates, opponent players. However, it is also necessary to represent positional information in a symbolic manner: a position for receiving a pass, a position from which to perform defensive actions, etc. Predicates, on the other hand, might need more consideration to be found. Having defined a vocabulary of symbols, the next step is to ground the symbols. This depends very much on the way a predicate's implementation can access the numeric data needed to decide about its

truth value. In the case of the Mostly Harmless robot system, there is a central world model fusing sensor data using Kalman filtering. There are various attempts at finding ways to automate these steps; however, it mostly comes back to manual work.

Given a language with a defined syntax and grounded symbols, the final step is to express what an agent's tasks are. These tasks need to be defined in such a manner that the soccer robots can choose cooperating goals that make them act like a team. In order to achieve this, we have divided strategic knowledge into the following three levels, each described using logic sentences:

Action plan: Describes a goal together with a precondition that triggers its execution, an invariant that is verified during execution, and the actions that are to be performed in order to reach the goal.

Role: Is composed of a precondition and an invariant as well as a set of action plans. Through the use of preconditions and invariants for roles, the robots can change their role dynamically during the course of play.

Strategy: Is the set of roles available for team play.

With this hierarchy, the decision-making process for selecting the best available plan in a given soccer situation works as follows:

1. A strategy is decided on. This can either be defined by the operator or chosen by the robots depending on their current environment.
2. Roles are given priorities. The role with the highest priority and a matching precondition is chosen.
3. From the given set of goals, again via priority and precondition, a suitable goal is chosen. As the goal is described as a logical formula, classic planning can be used to find the steps necessary to achieve it.

In order to follow the above basic principles, we have identified a set of predicates that allow us to describe basic actions and given situations in an abstract way. For example, a description of the pass-ball action in our language looks like the following logical sentence:

Precondition:

Facing(player, target_object) ∧ Inreach(target_object, player) ∧ ¬Blocked(target_object, player) ∧ Teammate(target_object)

Effect:

¬HasBall(player) ∧ HasBall(target_object)
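The three-level selection process above (strategy, then role by priority and precondition, then goal by priority and precondition) can be sketched as a nested priority scan. The data layout, priority values, and all predicate names below are illustrative assumptions, not the team's actual knowledge base.

```python
def select(strategy, state):
    """Return (role name, action plan name) chosen by priority and
    precondition, or None if nothing is applicable."""
    for role in sorted(strategy, key=lambda r: -r["priority"]):
        if not role["precond"](state):
            continue                     # role precondition must hold
        for ap in sorted(role["plans"], key=lambda p: -p["priority"]):
            if ap["precond"](state):     # highest-priority applicable goal
                return role["name"], ap["name"]
    return None


# A toy strategy with two roles; preconditions test symbolic facts.
strategy = [
    {"name": "defender", "priority": 2,
     "precond": lambda s: "OpponentAttacking" in s,
     "plans": [{"name": "block_pass", "priority": 1,
                "precond": lambda s: "SecondOpponent" in s}]},
    {"name": "attacker", "priority": 1,
     "precond": lambda s: True,
     "plans": [{"name": "score_goal", "priority": 1,
                "precond": lambda s: True}]},
]

choice = select(strategy, {"OpponentAttacking", "SecondOpponent"})
# → ("defender", "block_pass"); with no attack in progress the robot
#   falls back to the attacker role.
```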

5 Conclusion

The planning system was first used at RoboCup 2003 in Padova, where, despite severe limitations, it proved to be promising. As many components of the robot control software were not finished at that time, sensory data was limited, and so was the representational language. Even so, our robots acted surprisingly well. With respect to adaptability and flexibility, this approach has clearly proven superior to many commonly used concepts: a robot's behaviour on the field can be changed quickly and easily by modifying sentences that to some extent resemble natural-language statements.

At regular intervals, the rules and the environment of the mid-size league are changed to keep it as challenging as possible and to spur research. However, coping with a changing environment is merely a matter of keeping the domain language up to date. In fact, we aim to use the very same robots in other environments that are not as well-defined as RoboCup currently is, and we expect this planning approach to scale to such applications.

References

[APC03] Miguel Arroz, Vasco Pires, and Luis Custódio. Logic based distribution decision system for a multi-robot team. In Proceedings of RoboCup 2003 (to appear), Padua, Italy, 2003.

[Bee02] Michael Beetz. Plan representation for robotic agents. In Plan-Based Control of Robotic Agents (LNAI 2554). Springer, 2002.

[BF95] Avrim Blum and Merrick Furst. Fast planning through planning graph analysis. In Proceedings of the 14th International Joint Conference on Artificial Intelligence (IJCAI 95), pages 1636–1642, 1995.

[Bro86] R. A. Brooks. A robust layered control system for a mobile robot. IEEE Journal of Robotics and Automation, RA-2(1), 1986.

[ea03] Gerald Steinbauer et al. Mostly Harmless team description. In Proceedings of RoboCup 2003 (to appear), Padua, Italy, 2003.

[EMW97] Michael Ernst, Todd D. Millstein, and Daniel S. Weld. Automatic SAT-compilation of planning problems. In IJCAI, pages 1169–1177, 1997.

[GINR99] Giuseppe De Giacomo, Luca Iocchi, Daniele Nardi, and Riccardo Rosati. A theory and implementation of cognitive mobile robots. Journal of Logic and Computation, 9(5):759–785, 1999.

[KS92] Henry A. Kautz and Bart Selman. Planning as satisfiability. In Proceedings of the Tenth European Conference on Artificial Intelligence (ECAI'92), pages 359–363, 1992.

[PW92] J. Scott Penberthy and Daniel S. Weld. UCPOP: A sound, complete, partial order planner for ADL. In Bernhard Nebel, Charles Rich, and William Swartout, editors, KR'92: Principles of Knowledge Representation and Reasoning: Proceedings of the Third International Conference, pages 103–114. Morgan Kaufmann, San Mateo, California, 1992.

[RN72] R. E. Fikes and N. J. Nilsson. STRIPS: A new approach to the application of theorem proving to problem solving. Artificial Intelligence, 2(3-4):189–208, 1972.

[RN95] Stuart Russell and Peter Norvig. Artificial Intelligence: A Modern Approach. Prentice Hall, 1995.

[Wel94] Daniel S. Weld. An introduction to least commitment planning. AI Magazine, 15(4):27–61, 1994.

[Wel99] Daniel S. Weld. Recent advances in AI planning. AI Magazine, 20(2):93–123, 1999.
