Plan Recognition in Military Simulation: Incorporating Machine Learning with Intelligent Agents

Clinton Heinze, Simon Goss and Adrian Pearce

Abstract
A view of plan recognition shaped by both operational and computational requirements is presented. Operational requirements governing the level of fidelity and nature of the reasoning process combine with computational requirements including performance speed and software engineering effort to constrain the types of solutions available to the software developer. By adopting machine learning to provide spatio-temporal recognition of environmental events and relationships, an agent can be provided with a mechanism for mental state recognition qualitatively different from previous research. An architecture for integrating machine learning into a BDI agent is suggested and the results from the development of a prototype provide proof-of-concept.
1 Introduction
This paper proposes machine learning as a tool to assist in the construction of agents capable of plan recognition. The paper focuses on the beliefs-desires-intentions (BDI) class of agents. These agents have been used to provide computer generated forces in several applications [Tidhar et al., 1998; McIlroy and Heinze, 1997] and are the subject of several research and development programs. Much of the research associated with plan recognition in agent systems has been undertaken in simple domains. In Section 3 the complex domain of air-combat simulation is explained and the requirements that it places upon plan recognition are investigated. These, together with the description of mental-state recognition in Section 2, form the basis of a detailed set of requirements for sophisticated plan recognition by intelligent agents. In Section 5 the specifics of the favoured machine learning implementation are detailed, and subsequent sections describe the design of a technology demonstrator.
1.1 Related Research
Systems for implementing resource bounded reasoning that define plans as structured descriptions for goal achievement have been used to implement complex human decision-making models. These systems use prespecified plans as recipes [Rao, 1997] for the achievement of predefined ends. Previous research into plan recognition in these agent systems has required that the agent be provided with explicit representations of candidate plans [Rao, 1994; Rao and Murray, 1994]. Work by Rao [Rao, 1994] tackled the reactive recognition problem by making simplifying assumptions about the nature of the environment: (a) the agent has perfect knowledge of the plans available to other agents; (b) the complete set of plans over which recognition is attempted remains small; (c) the agent has no memory of events that occur; and (d) the world is unchanging during the period of recognition. Work by Tidhar and Busetta extended the functional capabilities of this type of plan recognition to remove the third and fourth assumptions. A demonstration of the capability of this system to recognize mental states within a military simulation was constructed. These extensions to the computational BDI model added memory and were able to incorporate some of the temporal aspects required for mental state recognition.

The first two assumptions still limit the performance of the system. In attempting to model human cognition it may be unrealistic (and perhaps impractical) to provide an agent with perfect knowledge of the plans of other agents. This is true in heterogeneous systems where agents must recognize the mental states of real humans, or of different varieties of agents that do not possess explicit representations of plans in a form understood by the agent attempting the recognition. As the complexity of the environment, the agents, and their interactions increases, the set of plans over which recognition must be attempted grows. Every agent within the system must be provided with a representation of the plans of all other agents that it is expected to recognize. Second order recognition (I recognize that she/he has recognized my plan) complicates this significantly and in complex domains will quickly become unwieldy. There are other significant implemented agent systems for modelling warfare; TAC-AIR-SOAR [Tambe et al., 1995; Laird et al., 1994] is a notable example that implements a different model of cognition. The SOAR architecture [Laird et al., 1987; Newell, 1991] supports a multi-layered view of cognition and may provide features that could assist in the development of integrated plan recognition.
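To make the limitation concrete, the following is a minimal sketch of recognition over an explicit plan library under assumptions (a) to (d): every candidate plan is a fixed action sequence known to the observer, and recognition simply filters the library against the observed prefix. The plan and action names are invented for illustration and are not taken from the cited systems.

```python
# Sketch only: explicit plan-library recognition under assumptions (a)-(d).
# Each plan is a fixed sequence of observable actions; recognition filters the
# (small, perfectly known) library against the observations seen so far.

PLAN_LIBRARY = {
    "pincer_intercept": ["split", "turn_in", "radar_lock", "fire"],
    "stern_conversion": ["turn_in", "radar_lock", "fire"],
    "drag":             ["turn_away", "accelerate"],
}

def consistent(plan_actions, observations):
    """A plan remains a candidate if the observations form a prefix of it."""
    return plan_actions[:len(observations)] == observations

def recognise(observations):
    """Names of the plans still consistent with the observed action history."""
    return [name for name, actions in PLAN_LIBRARY.items()
            if consistent(actions, observations)]

# Keeping the full observation history is exactly what assumption (c) forbids,
# which is why adding memory was a later extension.
print(recognise(["turn_in"]))                           # ['stern_conversion']
print(recognise(["turn_in", "radar_lock", "fire"]))     # ['stern_conversion']
```

The sketch also shows why the approach scales poorly: every plan of every other agent must appear in PLAN_LIBRARY, and the library grows with each agent and each level of second order recognition.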
2 Mental State Modelling
If an agent is to exhibit sensible behaviour within its environment it needs to have knowledge of what other agents are attempting to achieve. In the case of agents with which it is actively cooperating this can often be achieved by communication or through explicitly coordinated action [Tidhar et al., 1998]. If communication is unavailable or too expensive, or if the environment is a hostile one, then the agent must develop models of the mental states of the other agents based on observation of the world and inferential reasoning.

Beliefs, desires, intentions, and plans are all explicit attributes of the BDI model that have relevance to mental state recognition. An agent cannot observe these attributes directly and so must infer them from observation. Plans are structured descriptions of actions, and it is these actions that can be directly observed. Thus observation of the environment leads primarily to plan recognition and then to inference of the other mental states: intention, desire, and belief. Plan recognition can therefore be viewed as an important stage in the development of a mental model of another agent.

Recognising the currently executing plans of an agent provides the capability to reason about the future actions of that agent. It is often desirable to predict the actions of an agent well in advance of their execution. This allows an agent to take preemptive measures in support of a friend or to counter a foe with the greatest chance of success. Predicting future action requires knowledge of the plans that may be attempted, which underlines the importance of plan recognition. If an agent is to predict the actions of others, the mental model that it develops must include a representation of the plans that they may execute.

Observations are the atomic units of plan recognition and are accumulated as evidence toward the acceptance of a single plan from a hypothesized set. In general there will be multiple plans applicable for any single observed event. To remove ambiguity the agent must acquire evidence over time. This evidence will either cause plans to be rejected as impossible or allow them to remain under consideration. When all ambiguities have been removed and a single plan remains, plan recognition has been successfully completed. This process may be simple if the number of applicable plans is small and there are discrete and significant differences between those plans.

Plan recognition requires that the agent has two quite different capabilities. First, it must be able to observe the environment and to extract from it the patterns that will assist in triggering recognition. This observation process is simplified if the agent has some concept of what it should be looking for. Complexities in recognition occur when the observed actions represent the outcome of two plans that are executing simultaneously, resulting in interleaved actions. For example, an agent that observes actions A1, A2, A3, and A4 will need to differentiate between a single plan that sequences these four actions and two plans operating in parallel, one of which executes A1 and A3 and the other A2 and A4. To deal with these complexities the agent must, secondly, be able to reason inferentially about the nature of the observations that it makes. The nature of the current situation will influence the interpretation of an observation. The reasoning process sets the context for the observation process and then reasons about the data that emerges.
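The evidence-accumulation idea can be sketched as follows. The hypothesis names, plan contents, and interleaving test are illustrative assumptions rather than any particular implementation; the point is that observations eliminate hypotheses over time and that the A1-A4 interleaving ambiguity persists until further evidence arrives.

```python
from functools import lru_cache

def is_interleaving_of_prefixes(obs, plans):
    """True if obs can be produced by interleaving prefixes of the given plans."""
    obs = tuple(obs)
    plans = tuple(tuple(p) for p in plans)

    @lru_cache(maxsize=None)
    def match(i, positions):
        if i == len(obs):
            return True
        for k, pos in enumerate(positions):
            if pos < len(plans[k]) and plans[k][pos] == obs[i]:
                nxt = list(positions)
                nxt[k] += 1
                if match(i + 1, tuple(nxt)):
                    return True
        return False

    return match(0, (0,) * len(plans))

# Each hypothesis is a set of plans assumed to be executing concurrently.
HYPOTHESES = {
    "single plan A1-A4":  [["A1", "A2", "A3", "A4"]],
    "two parallel plans": [["A1", "A3"], ["A2", "A4"]],
    "single plan A1-A2":  [["A1", "A2"]],
}

def surviving(observations):
    """Hypotheses still consistent with the accumulated evidence."""
    return [h for h, plans in HYPOTHESES.items()
            if is_interleaving_of_prefixes(observations, plans)]

print(surviving(["A1", "A2"]))        # all three remain: still ambiguous
print(surviving(["A1", "A2", "A3"]))  # 'single plan A1-A2' is eliminated
```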
3 Air Combat Modeling
Air Operations Division (AOD) of the Defence Science and Technology Organisation (DSTO) conducts simulation for operations analysis in support of the Royal Australian Air Force (RAAF). Much of this operations research effort is in the development and support of simulations that incorporate models of air force personnel. This cognitive modeling challenge has been addressed by the adoption of the BDI model operationalized by the dMARS software [d'Inverno et al., 1998]. The modeling of human reasoning for air combat simulation places requirements on the nature of plan recognition. From the domain of air combat we present several examples of difficult scenarios that plan recognition might encounter.

Air combat is a domain with specific characteristics. The environment is dynamic: the physical positions of aircraft and the tactics employed by pilots change quickly and often. Pilots act in teams with defined roles and responsibilities, but even these teams are subject to dynamic rearrangement. Like most military scenarios the air combat environment is hostile. Pilots are actively attempting to deceive, avoid, evade, or destroy other aircraft but are faced with the difficulty of reasoning with incomplete or inaccurate knowledge. The sensors at the pilot's disposal, namely radar, eyesight, and electronic surveillance measures (ESM), are all subject to limitations of range and accuracy. Furthermore, enemy aircraft actively employ counter-measures to neutralise these sensors. The following figures illustrate some of the situations that might occur during combat and serve to highlight some of the plan-recognition difficulties that emerge in this domain.

Team Tactics: When operating as a pair, pilots will cooperate, one as leader and one as wingman. By coordinating radar usage and missile firing they can maximise their chance of victory. Figure 1 shows two aircraft conducting a pincer intercept against a singleton (Bandit-1). The plan recognition challenge for Bandit-1 is to decide whether the aircraft are coordinating as a pair or whether they are two single aircraft acting as individuals, perhaps oblivious to each other's existence. The pilot must recognise team plans and, within those plans, which aircraft is adopting which role.

[Figure 1: Team Tactics. Are the two interceptors acting as a team, or are they individuals?]

Execution of Multiple Plans: When a pilot finds himself in a situation that requires him to deal with simultaneous threats, there may be a plan executing for each of the aircraft. Figure 2 shows a pilot engaged with two interceptors. In this case Bandit-1 may be simultaneously executing plans to evade one aircraft and intercept the other. As long as these plans are not mutually exclusive they can coexist. The challenge for the interceptors is to recognise the existence of two simultaneously executing plans.

[Figure 2: Multiple Simultaneous Plans. Bandit-1 manoeuvres to simultaneously place himself in a position to attack Interceptor-1 and evade Interceptor-2.]

Ambiguous Action: Figure 3 shows an aircraft engaged in combat against two opponents. As Bandit-1 commences a turn to the right, Interceptor-1 and Interceptor-2 must decide whether the turn represents an attack on Interceptor-1 or a defensive manoeuvre to escape from Interceptor-2. If Interceptor-2 is unaware of the presence of Interceptor-1, the manoeuvre will appear to be defensive. If Interceptor-1 is similarly unaware, the manoeuvre will appear hostile. If they are aware of each other's presence then the ambiguity will need to be reconciled. Interestingly, Bandit-1 may actually be performing both a defensive manoeuvre against Interceptor-2 and an offensive manoeuvre against Interceptor-1 simultaneously. This is similar to the previous situation, but now Bandit-1 is executing only a single plan that could be recognised differently depending upon the perspective of the viewer.

[Figure 3: Ambiguous Action. Is Bandit-1 turning to evade Interceptor-2 or turning to attack Interceptor-1?]

Plan Switching: During an engagement a pilot will regularly evaluate his status and consider his actions. Because of the highly dynamic nature of air combat it is common for a pilot to adopt, partially execute, and then drop many plans. This rapid partial execution and discarding makes recognition of the pilot's actions difficult. Figure 4 shows a pilot engaged in a typical intercept and highlights the extent to which plan switching can occur. Bandit-1 is initially attacked by Interceptor-1 and commences a radar-defeating break-turn. Following the success of this manoeuvre Bandit-1 turns to attack Interceptor-1 and then, although the attack proves unsuccessful, follows this with an attack on Interceptor-2. In cases such as this the pilot adopts many significant changes of plan in a few minutes.

[Figure 4: Plan Switching. Bandit-1's successive plans: radar evasion manoeuvre, attack Interceptor-1, recommence attack on Interceptor-1, attack Interceptor-2.]

Deliberate Deception: In a combative environment advantage is obtained through denying the enemy knowledge of your intentions. In air combat this can be achieved through stealth or surprise, or by creating doubt by adopting deliberately deceptive or diversionary tactics. Figure 5 shows a situation in which one of the interceptors feints an evasive manoeuvre to lure Bandit-1 into a defensive position where it will be vulnerable to an attack by the other.

[Figure 5: Diversionary Tactics. Interceptor-1 feints an evasion to lure Bandit-1 into a vulnerable position for an attack by Interceptor-2.]

The scenarios above are indicative of the types of situations that plan recognition must deal with if it is to be successfully employed for air-combat simulation. Observations of other aircraft are limited primarily to the radar or, in the case of close combat, eyesight, and to information messages sent via radio or data-link. This limited supply of sensory information means that the information processed by the pilot in trying to recognise enemy tactics is primarily the trajectories of all of the aircraft about which there is data. Pilots are critically concerned with recognising and interpreting changes in the range, heading, and bearing of other aircraft relative to their own and with matching these to known tactics. Interspersed with the trajectory data there may be discrete events that give clues to the plans that the enemy pilots are executing. Missile launches may be detected, and warnings about certain radar modes may indicate that an opponent has a radar lock.
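As an illustration of the kind of relative-geometry features that could be derived from such track data, the sketch below computes range, bearing, and a rough target-aspect measure from two-dimensional track states. The field names and the flat 2-D geometry are simplifying assumptions for illustration, not the simulator's data model.

```python
import math

def relative_geometry(own, other):
    """Range, bearing, and target aspect from 2-D track states.

    own/other: dicts with x (east, m), y (north, m) and heading (degrees,
    0 = north, clockwise). These field names are assumptions for the sketch.
    """
    dx, dy = other["x"] - own["x"], other["y"] - own["y"]
    rng = math.hypot(dx, dy)
    bearing = math.degrees(math.atan2(dx, dy)) % 360     # absolute bearing to target
    rel_bearing = (bearing - own["heading"]) % 360       # bearing off own nose
    # Angle between the target's heading and the line from the target back to us;
    # near zero means the target is pointing at us (a likely intercept).
    reciprocal = (bearing + 180.0) % 360
    target_aspect = abs((other["heading"] - reciprocal + 180.0) % 360 - 180.0)
    return {"range_m": rng, "bearing_deg": bearing,
            "relative_bearing_deg": rel_bearing, "target_aspect_deg": target_aspect}

own    = {"x": 0.0,      "y": 0.0,      "heading": 0.0}
bandit = {"x": 10_000.0, "y": 10_000.0, "heading": 225.0}
print(relative_geometry(own, bandit))  # range ~14.1 km, bearing 45 deg, aspect 0 deg
```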
4 Requirements
This section examines the requirements for plan recognition that emerge from an analysis of air-combat simulation. The requirements are split into two groups: operational requirements, which are decided by the nature of the air combat domain; and computational requirements, which result from the hardware, software, and human resources available to construct the simulation software.
4.1 Operational Requirements
1. Spatio-temporal Trajectories The nature of air combat is such that the primary source of observational information feeding the plan recognition process is the trajectories of the aircraft and the relative geometries across them. Plan recognition must be capable of dealing primarily with this type of data.
2. Intermediate Recognition The plan recognition process should provide the agent with intermediate information regarding the current set of possible plans and data regarding the type and weight of evidence for and against each.

3. Levels of Abstraction Generally behaviour is composed of high-level plans with sub-plans and sub-sub-plans. Whilst executing a plan to go to the shop I might first invoke a sub-plan to find my car keys. Plan recognition should be capable of recognising plans at different levels of abstraction.

4. Structured Plan Recognition Plan recognition should be structured. If it is not possible to determine exactly what plan is being executed it may be possible to determine the class of plans from which the plan is derived. This information will be useful to the agent in planning a preliminary response or in devising ways to actively seek more information (a representation supporting this is sketched after this list).

5. Dynamic Environment The environment will change during the period in which plan recognition is attempted. The agent must be capable of reasoning about when to discard current attempts at recognition as outdated. The agent must also balance the timeliness of recognition against its accuracy.

6. Teams and Team Plans Many agents can work cooperatively in teams. This cooperation can be achieved through communication or through explicit team structures and team plans. Recognizing the coordinated behaviour of a team and the relationships between the actions of each of its members is necessary for the recognition of team plans.

7. Irreconcilable Ambiguity Cases will arise in which the plan recognition process fails to discriminate between possible alternatives. It must be possible to recognize this irreconcilable ambiguity when it occurs and deal with it appropriately.

8. Preemptive Recognition Plan recognition should differentiate possible plans at the earliest possible moment, ideally before the plan has completed. At any time the agent should be able to reason about the current state of the plan recognition process and take action to resolve ambiguities or to take other appropriate action.

9. Behavioural Prediction The recognition process should not only be capable of recognising plans before they have completed but should also be capable of providing accurate insights into the future behaviour that might be expected from an agent executing that plan.
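The sketch below shows one plan representation, with hypothetical names, that would support requirements 3 and 4: plans nest into sub-plans and carry a plan class, so recognition can fall back to reporting the class when the exact plan cannot be determined. It is an illustration of the requirements, not a description of the dMARS plan language.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class Plan:
    name: str
    plan_class: str                                     # e.g. "offensive", "defensive"
    steps: List["Plan"] = field(default_factory=list)   # ordered sub-plans

    def flatten(self) -> List[str]:
        """Leaf-level step names, for matching against observed actions."""
        return [self.name] if not self.steps else \
               [leaf for step in self.steps for leaf in step.flatten()]

def report(candidates: List[Plan]) -> str:
    """If one plan survives, name it; otherwise report the shared class (req. 4)."""
    if len(candidates) == 1:
        return candidates[0].name
    classes = {p.plan_class for p in candidates}
    return f"some {classes.pop()} plan" if len(classes) == 1 else "ambiguous"

pincer = Plan("pincer_intercept", "offensive", steps=[
    Plan("split", "offensive"),
    Plan("bracket", "offensive", steps=[Plan("turn_in", "offensive"),
                                        Plan("radar_lock", "offensive")]),
    Plan("fire", "offensive"),
])
stern = Plan("stern_conversion", "offensive")

print(pincer.flatten())           # recognition at the lowest level of abstraction
print(report([pincer]))           # 'pincer_intercept'
print(report([pincer, stern]))    # 'some offensive plan'
```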
4.2 Computational Requirements
1. Real time performance Agents that interact with humans must be capable of real time performance. Plan recognition must be undertaken faster than real time to allow the agents to respond within the time constraints of embedded real time systems.
2. Complexity Simulating systems as complex as pilots and aircraft requires sophisticated modelling techniques. Managing the complexity of the resulting software is a balancing act between required fidelity and resource limits. The engineering of large simulations is an expensive and risky business. Anything that can be done to remove unnecessary complexity from the software should be considered.

3. System Integration There has been a large investment in existing agent models of pilots. Any plan recognition software must be developed as a module capable of integration with the existing models (a possible module boundary is sketched below).
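One way to satisfy requirement 3 is a narrow recogniser boundary that the existing agent models poll each simulation cycle. The class and method names below are assumptions for illustration, not an existing dMARS or AOD interface.

```python
from abc import ABC, abstractmethod
from typing import Any, Dict, List

class PlanRecogniser(ABC):
    """Narrow boundary between the recogniser module and the existing agent models."""

    @abstractmethod
    def observe(self, track_frame: Dict[str, Any]) -> None:
        """Feed one frame of track/sensor data; called every simulation tick."""

    @abstractmethod
    def hypotheses(self) -> List[Dict[str, Any]]:
        """Current candidate plans with evidence weights (cf. requirement 2 of 4.1)."""

class NullRecogniser(PlanRecogniser):
    """Stub so the existing agent code runs unchanged while the module is developed."""
    def observe(self, track_frame: Dict[str, Any]) -> None:
        pass
    def hypotheses(self) -> List[Dict[str, Any]]:
        return []
```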
[Figure 6: Roll and pitch control states S1-S6 across the start, mid-turn, bank, and finish segments of a manoeuvre.]
5 Proposed System
The proposed system integrates a machine learning algorithm, CLARET [Pearce, 1997], with existing BDI agents. The following sections deal with descriptions of the two components and their integration.
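A rough sketch of this integration, with all names assumed, is shown below: each cycle the agent passes its sensor picture to the recogniser and asserts the returned hypotheses as beliefs that its BDI plans can react to. The stub recogniser merely stands in for the CLARET component described next.

```python
class StubRecogniser:
    """Stands in for the CLARET-based recogniser; returns a canned hypothesis."""
    def observe(self, sensor_frame):
        self.last_frame = sensor_frame
    def hypotheses(self):
        return [{"name": "pincer_intercept", "weight": 0.9}]

class BDIAgent:
    """Minimal stand-in for an existing pilot agent; only the recognition hook is shown."""
    def __init__(self, recogniser):
        self.recogniser = recogniser
        self.beliefs = {}

    def perceive(self, sensor_frame):
        # Each cycle: pass the sensor picture to the recogniser and assert the
        # resulting hypotheses as beliefs about the other aircraft's plans.
        self.recogniser.observe(sensor_frame)
        self.beliefs["enemy_plan_hypotheses"] = self.recogniser.hypotheses()

    def deliberate(self):
        # Existing BDI plans can then react to the recognised enemy plan.
        for hyp in self.beliefs.get("enemy_plan_hypotheses", []):
            if hyp["name"] == "pincer_intercept" and hyp["weight"] > 0.8:
                return "adopt_defensive_split"
        return "continue_current_plan"

agent = BDIAgent(StubRecogniser())
agent.perceive({"tracks": []})
print(agent.deliberate())    # 'adopt_defensive_split'
```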
5.1 Machine Learning
In descriptive recognition, an expert pilot explains the relationships between event types, specifying a decomposition of high-level manoeuvres ("tell me about these manoeuvres"). For example, flying circuits can be decomposed into take-off, crosswind, down-wind, base-leg and final-approach manoeuvres. Recognition is then used to bind these manoeuvres to traces of simulator activity. The FSIM simulator is enhanced to provide descriptions of the out-the-window world view in the data trace as well as dynamic knowledge of the world: the positions of objects, dynamic entities in the three-dimensional flight course relative to the pilot, and pilot motion. The system dynamically binds to different manoeuvres as they occur in the trajectories of the input time series.

The technique used in our simulator is based on statistical pattern matching and learning techniques currently used for on-line handwriting and gesture recognition. A matching system, called CLARET, has been specifically adapted for recognizing manoeuvres based on real-time trajectory information [Pearce, 1997; Pearce et al., 1998]. Recognition is applied to low-level instrumentation and aeroplane data to bind these manoeuvres as they occur in traces of pilot behaviour.

In the CLARET algorithm an unknown segmented and labeled trajectory case is presented to the system together with examples of known trajectories, using a simple polygonal approximation technique. First, relationships between trajectory segments are extracted and their values calculated. Relational rules are then generated that explicitly depict relationships between states. For example, a right-turn manoeuvre is defined by a subsequence of different roll-pitch-yaw states (see Figure 6) over time:

r1(S1, S6) relationship(S1, S6, Roll_diff, Time_diff), 15
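The pre-processing step described here can be sketched roughly as follows. The segmentation heuristic, thresholds, and field names are assumptions made for the illustration; this is not the CLARET implementation itself, but it shows the idea of approximating a trajectory by polygonal segments and then computing pairwise relational attributes (such as the roll and time differences tested by rules like the r1(S1, S6) fragment above).

```python
# Sketch only: polygonal approximation of a sampled trajectory followed by
# pairwise relational attributes between segments.

def polygonal_segments(samples, heading_tol=10.0):
    """Greedily merge consecutive samples whose heading stays within a tolerance.

    samples: list of dicts with 't' (s), 'heading' and 'roll' (degrees).
    Returns one summary dict per polygonal segment.
    """
    segments, start = [], 0
    for i in range(1, len(samples) + 1):
        if i == len(samples) or abs(samples[i]["heading"] - samples[start]["heading"]) > heading_tol:
            seg = samples[start:i]
            segments.append({
                "t_start": seg[0]["t"], "t_end": seg[-1]["t"],
                "roll": sum(s["roll"] for s in seg) / len(seg),
            })
            start = i
    return segments

def relationships(segments):
    """Pairwise relational attributes between segments, one per pair r(Si, Sj)."""
    rels = []
    for i, a in enumerate(segments):
        for b in segments[i + 1:]:
            rels.append({"roll_diff": b["roll"] - a["roll"],
                         "time_diff": b["t_start"] - a["t_end"]})
    return rels

# A toy trajectory: steadily changing heading with a banked section in the middle.
samples = [{"t": t, "heading": 5 * t, "roll": 30 if 2 <= t <= 6 else 0}
           for t in range(10)]
print(relationships(polygonal_segments(samples)))
```

Relational rules of the kind shown in the truncated fragment would then be learned as threshold tests over these roll_diff and time_diff attributes between pairs of states.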