Reactive and motivational agents: towards a collective minder
(Extended version of original ATAL96 paper)
Darryl Davis
Cognition and Affect Group, School of Computer Science
The University of Birmingham, Birmingham, B15 2TT, United Kingdom
Abstract
This paper explores the design and implementation of a societal arrangement of reactive and motivational agents that will act as the building blocks for a more abstract agent, within which the current agents act as distributed dynamic processing nodes.
1 Introduction

This paper reports on the Architectures for Intelligent Agents project, within which computational complete agent architectures are being investigated (using a two-dimensional simulated world). As we would like to address a wide range of issues (indeed, complete agents), many issues are addressed at a relatively coarse grain. Indeed, our whole approach can be viewed as Broad but Very Very Shallow (BVVS). Like Bates and colleagues [1], our high-level aims include understanding natural and artificial complete agent architectures. This broad approach necessarily requires an initially shallow approach to designing computational agencies; the exploration of deeper, more complete, implementations will follow. We see this work as analogous to that being pursued by a number of other research groups, for example the SOAR community [12], the behaviour-based subsumption work of Brooks [4, 5], and others such as Georgeff [9] and Hayes-Roth [10]. SOAR interests us because of its original driving force as a design (and implementation) arising from a theory of cognition; an immediate criticism of this architecture is its failure to consider the processing that occurs in the mid (and lower) brain (e.g. the limbic system), which we consider vital in developing a design for a complete agent. Brooks' approach interests us because it attempts to model the mechanisms of complete creatures without recourse to the very areas that Newell focuses on; we can consider these insect-like internal processes to be qualitatively similar to models of what occurs in higher-order limbic systems. It therefore seems obvious to us, interested in complete agents, to develop hybrid architectures that fuse the strengths of the behaviour-based subsumption architectures, appropriate for many lower-level processing mechanisms, with the deliberative symbolic architectures of the more classical approach to artificial agents, useful for reasoning about the world and actions upon it. Unlike Brooks, we consider the use of a structured perceptual system to be of benefit within subsumptive architectures. We hope that our investigations will shed light upon a number of interesting questions, such as how (and if) sensory data can be structured for different levels of internal processing (behaviours) in complex agents, what decision mechanisms are appropriate for determining which among (possibly conflicting) behaviours are to be preferred in certain circumstances, and whether different learning algorithms are necessary at different levels within a complex agent.
Figure 1: Towards An Agent Architecture

While we are not currently addressing many issues in agent research (e.g. language and communication between agents), we do not suggest that these are unimportant or irrelevant, simply that we are limited in what research issues we can tackle at any particular point. Some inadequacies in our work to date can be traced to these omissions. Earlier work in the Cognition and Affect group at the University of Birmingham has focused on differing agent architectures (based on different simplified models of biologically and psychologically plausible mechanisms) in a number of simulated environments. One ongoing scenario is that of an abstract minder looking after a number of babies (or minibots) in a dynamic, and possibly hazardous, environment, through the use of situated perceptual organs and effectors (miniminders). Originally proposed by Sloman [16] and subsequently developed by others in the Cognition and Affect group [3, 2], it provides an environment within which different agent architectures have been developed. Parallel to this has been the development of an information processing architecture [18, 6] that allows many different coexisting components with complex interactions (see figure 1). Some processes are automatic (pre-attentive) in the sense that all changes are triggered directly; for example, reflexes (whether learnt or innate) that bypass `normal' processing. Others, the `management processes', are `reflective' or `attentive' knowledge-based processes in which options are explicitly considered and evaluated before selection. The management processes are resource-limited and may sometimes require both attention filters to protect them from disturbance and meta-management processes to regulate and direct them, involving some sort of self-monitoring. The internal `management' (and meta-management) processes involve a certain amount of (asynchronous) parallelism. We want to show how resource limits restricting parallelism in high-level processes can lead to emotional and other characteristically human states involving partial loss of control of attention and thought processes [15, 17]. This requires an architecture combining a wide variety of types of mechanisms. There are a number of sources of motivation for this project:

1) By producing plausible computational models of simulated agents, we may further our understanding of biological, psychological and social agents.
2) By designing and implementing agent architectures based on different theories of the mind, we
may better understand the strengths and inadequacies of these theories.

3) By developing working agent architectures in a dynamic and potentially hazardous (simulated) environment, we can further our theories and models of control mechanisms for use in real environments by resource-limited agents.

In the short term, we are exploring variants of our existing architecture using an iterative design-and-implement process, and variations on the creche scenario, to feed into an emerging theory of a range of `mechanisms for mind'.
2 Overview of the creche agents environment

In earlier experiments, we have looked at ways of developing agent architectures using the BVVS approach [6, 7], within the framework of the simulated creche environment. In these experiments, the charges (or minibots) were very simple agents looked after by a more sophisticated abstract minder which controlled a number of effectors and sensors in the environment. Our earlier work has demonstrated the feasibility of the BVVS approach, at least for our creche scenario. Even with a highly impoverished deliberative minder, it was possible, using two effectors, to look after twenty-five or more babies indefinitely with no fatalities. As we expected, to achieve a successful level of competence and autonomy, agents do not necessarily need sophisticated `on-the-fly' planners; reactive planners are sufficient in these types of simulation. We also managed to demonstrate the usefulness, and fast prototyping capabilities, of the SIM_AGENT toolkit for this type of work. The earlier simulations were also enriched by allowing the charges (babies or minibots) to be modelled as very simple reactive agents (see figure 2), rather than simple programming artifacts of the minder's processing (as in Beaudoin's work). These agents carried no model of the world and had no memory of previous processing. They were capable of a number of different internal states, and the internal states influenced both the effect of perceptual processing and the type of action favoured by the agent. Any change in the current internal state is dependent upon the current internal state and the incoming (perceptual) information. We term this `autocognition' in that it combines a reactive agent architecture with some form of representation and processing at a level below that of the deliberative reasoning usually associated with the term `cognition'. There are a number of failings in the earlier work, including a lack of depth to the perceptual processing (and how it relates to action and perceptual feedback) and to how learning (of different types) may fit into the architectural models. There was also a great disparity between the processing capabilities of the minibots and the minder. While not a problem in itself, this would have caused difficulties for some of the learning experiments we were interested in; for instance, how do the minibots learn to act on behalf of the minder and care for other agents (i.e. act as miniminders)? The work described here takes some small steps towards addressing these issues. The work associated with this project is open-ended but primarily relates to the (design-based) investigation and exploration of the possibilities associated with different agent architectures, constrained by the overall goals of the Cognition and Affect group at Birmingham. Here we consider a development of the minder scenario that will provide different models of cooperative behaviour and learning using two classes of agents: an unsupervised semi-autonomous base level class of agent; and a fully autonomous motivated level of agency that subsumes the base-level class. The base level agency can be controlled by the minder to perform duties in the creche environment on behalf of the minder. They are allowed some degree of autonomy and must navigate the environment without recourse to the higher-level deliberative processes, unless called upon to perform some specific duties, which will entail the temporary use of some effectors.
Figure 2: Simple Reactive Agent Architecture Used For Early Creche Minskis

The other class of agents are completely autonomous and cannot be controlled by the minder. The minder is an abstract entity (with its own representations of the world and its tasks within that world) which relies upon perceptual information from the base-level agents to update its model of the environment. It also makes use of higher-level deliberative agents to perform reasoning tasks over its model of the world and other representations related to agent goals and possible actions. This can be seen as drawing on the ideas of Minsky (expounded in [11]) in that the minder no longer exists as a completely independent entity but is a collective of the charges, deliberative agencies and its own representations. While this may lead to the abandonment of the `one-body constraint' in some situations (i.e. the minder will not have to choose between sending an agent to location X or location Y if an agent can be sent to each place), there will still exist situations in which a specific agent is required in two places. Conflicts between the actions and goals of the collection of base-level (situated) agents will (ultimately) provide a framework for experimenting with resource bounds on the management-level processing. Here we try to deepen, and make more generic, the perceptual and auto-cognitive processes within our architectures. Initially this will be at the expense of ignoring higher-level attentive and resource management processes. Ultimately, in the full implementation of the minder scenario, some of the simple agents may learn to act as autonomous versions of what were in earlier experiments the minder's organs; but rather than receive instructions which must be carried out, they communicate with the minder (or collective rational processes) and undertake to perform certain tasks. An intermediate aim is for the collective minder to use the minski agents as effectors as required, for example, to rescue other agents trapped in ditches. The current experimental scenario makes use of six agents (at least at the design level):

energy sources: these very simple agents are the energy source for the other situated agents (i.e. the minskis and paraminskis). While their energy level remains above some threshold, they remain static and in the randomly allocated position within the environment. When this threshold is reached (i.e. sufficient agents have fed to drain the source), they become dynamic and cease to dispense energy. They then move (at a faster rate than any other agent) towards the recycle door (but constrained to one of the four directions that all agents move in). If any agent attempts to feed or stays in the moving energy source's path it is damaged: contact results in instantaneous death, while close proximity results in some loss of internal processing to the agent. Once it reaches the recycle door, the energy source is renewed and placed in a new location within the environment (a sketch of this lifecycle is given at the end of this section). No other (`physical') agent can control the processing of the energy source. Appendix B gives a formal description of the propositions and pattern matching rules associated with this agent.
Figure 3: The four levels of agency in the collective minder scenario.

minskis: simple instinctual and reactive agents with no explicit motivational states but the need to maintain energy levels and avoid colliding with agents and other objects within the environment, while maintaining an observational distance from other agents. This type of behaviour is similar to those expounded by Brooks in his description of behaviour-based agents [4, 5] and could be well modelled using a subsumption-type architecture. The minskis can be thought of as frictionless platforms moving around the environment, driven by a single directed lateral (impulse) thruster plus one centrally placed vertical thruster (for a limited amount of vertical movement).

paraminskis: minskis with extra capabilities and behaviours, including explicit goal-oriented motivational states such as hunger, pursue and flee.

djinnskis: the abstract and deliberative (management) processing capabilities of the `extended' minskis and paraminskis. This could include cognitive behaviours such as explicit planning, the consideration of multiple surfaced goals from the situated agents and the resolution of conflict between proposed actions of those different physical agents.

metadjinnskis: the monitor (and `control') of the deliberative processes. This could include what our earlier work [6] has termed inner perception and inner action. The metadjinnskis' goals arise from an interaction of the behaviour of the overall agent and its designated (or acquired) niche role. We consider this meta-management level processing of the djinnskis to be the most abstract level of agent processing, and so avoid the recursive abstraction problem of meta-metadjinnskis monitoring the metadjinnskis, and yet further levels monitoring them.

minder: the agency that is responsible for the monitoring of the environment and the initiation of actions upon it to care for the agents that it contains. While it will contain its own private
representational schemes (e.g. an explicit model of the environment), it will make use of other agents to provide perceptual information about the environment. It can also use these other agents to perform actions upon the world; for example, to collect an effector, move to a certain location and retrieve some object or agent from a possibly hazardous situation. It also subsumes the djinnski and metadjinnski agents for the performance of deliberative reasoning over its own databases, i.e. to perform actions on its representations of the world. Figure 3 provides a sketch of how these different levels of agency are related. The figure shows two minskis and three paraminskis, with the motivating goals of one minski and one paraminski currently being ignored by the djinnski-level agents. The higher levels of agency, and the exact relationship and means of communication between them and the minskis (and paraminskis), are currently an open issue. The minder will be able to take direct control of minski agents but will not be able to directly control the paraminskis. The rest of this paper concentrates on the design, specification and implementation of the two situated (minski and paraminski) agents and the energy source. Further work will detail developmental work on the djinnski, metadjinnski and minder agents.
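The energy source's lifecycle described above can be summarised as a small state machine. The sketch below is in Python purely for illustration (the implementation itself is written as Pop-11 rulesets over the SIM_AGENT toolkit; see Appendix B); the names and numeric values (FULL, THRESHOLD, SPEED) are assumptions, and the damage inflicted on agents in its path is omitted.

# Illustrative state machine for the energy source: static and dispensing while its
# store is above a threshold, then dynamic and heading for the recycle door, where it
# is renewed and re-sited. All constants are assumed values, not taken from the paper.

FULL = 100        # assumed initial store
THRESHOLD = 10    # assumed level at which the source stops dispensing
SPEED = 3         # assumed speed (faster than any other agent)

class EnergySource:
    def __init__(self, x, y):
        self.x, self.y = x, y        # randomly allocated position
        self.energy = FULL
        self.dynamic = False         # static while dispensing

    def feed(self, amount):
        """Give energy to a feeding agent; a drained source becomes dynamic."""
        if self.dynamic:
            return 0
        given = min(amount, self.energy)
        self.energy -= given
        if self.energy <= THRESHOLD:
            self.dynamic = True      # sufficient agents have fed to drain the source
        return given

    def step(self, recycle_door, new_position):
        """One scheduler cycle: a dynamic source heads for the recycle door."""
        if not self.dynamic:
            return
        dx, dy = recycle_door[0] - self.x, recycle_door[1] - self.y
        if abs(dx) <= SPEED and abs(dy) <= SPEED:
            self.energy = FULL       # renewed and placed at a new location
            self.dynamic = False
            self.x, self.y = new_position
        elif abs(dx) >= abs(dy):     # constrained to the four compass directions
            self.x += SPEED if dx > 0 else -SPEED
        else:
            self.y += SPEED if dy > 0 else -SPEED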
3 A Design For A Richer Creche Simulation

In earlier experiments we made use of reactive minibots with a number of character change states, a minder, and some directed agents (or miniminders) that the minder could control to perform tasks. No hazards (other than collisions or attacks from other agents) were present, and we did not consider replenishment goals. Here we can consider the creche as a simulated factory
floor with simple autonomous robots (the minskis and paraminskis) used to perform two types of task. One type of task requires collaborative behaviour between the paraminskis; the other requires the minder to control or coerce minskis into performing short-term tasks (e.g. moving obstacles or freeing other trapped agents using some temporarily attached effectors). An obvious analogy is that the minder acts like an octopus, but with semi-autonomous limbs (the minskis). The minder must plan the goal-oriented use of the minskis in the environment without being sure of the movements of the paraminskis.
3.1 Description of the Environment
The environment makes use of naive (non-Newtonian) physics. Friction and momentum do not exist. Changes in direction and velocity are instantaneous; for example, acceleration causes an instantaneous increase in velocity, and reversing causes an instantaneous (180 degree) change in direction. Similar effects can be attributed to other lateral movements. Gravity does exist, so any vertical movement is momentary and (without sustained energy use) results in one (toolkit scheduler) cycle of elevation (a sketch of this movement model is given at the end of this subsection). Walls are opaque, impenetrable and unscalable. Doors are transparent and (initially) open to certain agents in specific states; for instance, the recycle door is open to a dynamic energy source but not to any other functioning agent, while the exit is open only to agents of a certain age. Subsequent development may allow doors to function as agents that control transparency and accessibility. The creche consists of one large room of four walls: the north wall contains an entrance (or intake door); the west wall contains a recycle door through which `broken' or spent agents are ejected; the east wall contains an exit (or discharge door), and a hazard, in this case a ditch into which agents can fall. Ditches can be seen by all agents; however, this does not mean that an agent can avoid falling into one. Injuries and damage result in some depletion of the energy source,
but no other change to the agent. Agents in a ditch with sufficient energy can use their vertical thrusters and some lateral movement to escape. Only agents with effectors can remove helpless agents from a ditch. An energy dispenser is situated in any clear space within the environment. Occasionally its energy store runs low and it exits the environment (through the recycle door) and is replaced by a new energy source arriving through the intake door. The energy source, when moving, does not attempt to avoid other agents, and a collision results in instantaneous death for the agents. Also, when moving it acts as a memory hazard, which causes some disruption of thinking to any agent in close proximity, for example the deletion of some internal processing capability (i.e. a behaviour node). While all agents can visually sense the moving energy source, and so attempt to avoid collisions, only one subtype of the paraminskis can sense the memory danger and so give it a wide berth; this is a benefit that other paraminskis accrue through teaming up with instances of this agent.
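To make the movement model described at the start of this subsection concrete, here is a minimal sketch in Python (for illustration only; the simulation itself is implemented with the SIM_AGENT toolkit in Pop-11). The dictionary fields and the example values are assumptions, not the toolkit's actual slot names.

# Minimal sketch of the non-Newtonian movement model: no friction or momentum,
# instantaneous changes of velocity and direction, and gravity cancelling vertical
# motion after one cycle. Field names are illustrative assumptions.

DIRS = {"north": (0, 1), "south": (0, -1), "east": (1, 0), "west": (-1, 0)}

def step(agent):
    """Advance an agent by one toolkit-scheduler cycle."""
    dx, dy = DIRS[agent["direction"]]
    agent["x"] += dx * agent["velocity"]       # velocity applies immediately
    agent["y"] += dy * agent["velocity"]
    agent["z"] = 1 if agent["thrust"] else 0   # one cycle of elevation, then gravity
    agent["thrust"] = False                    # sustained lift needs renewed energy use

def reverse(agent):
    """Reversing is an instantaneous 180 degree change of direction."""
    opposite = {"north": "south", "south": "north", "east": "west", "west": "east"}
    agent["direction"] = opposite[agent["direction"]]

# Example: a minski heading east at velocity 2 that fired its vertical thruster.
minski = {"x": 0, "y": 0, "z": 0, "direction": "east", "velocity": 2, "thrust": True}
step(minski)   # x becomes 2 and z becomes 1; on the next cycle z falls back to 0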
3.2 General description of the agents
All agents enter the environment through the intake door or are already present. They leave through the discharge door or, if `terminally' damaged, through the recycle door. Two classes of agents can be present in the creche: autonomous (implicitly motivated) reactive agents (minskis); and explicitly motivated agents (paraminskis). The second class has two sub-types: characterless (or active) paraminskis, which are like minskis but with one extra layer of processing handling the explicit motivational state related to hunger goals; and paraminskis with character (namely either calm or vigorous). These latter sub-types have extra motivational states not associated with the other situated agents. There is also the potential for the second class of agents to team up and form attachments. The agents can be given up to three senses (visual, auditory and memory-danger detectors) so that they can negotiate their way around a potentially hazardous environment. They are also capable of producing sound (which reflects an internal state); the paraminskis detect character changes on the basis of these sounds. When they are deemed to have achieved some high-level goal, agents can be discharged from the environment.
3.3 Description of the base level agents: Minskis
The base agents (minskis) combine internal processing with perceptual and action processes. These agents can move in one of four directions (north, south, east and west) and are given an initial energy level, velocity and direction. Any change in velocity or direction causes an energy unit to be consumed; some actions are more expensive (for example, a reversing manoeuvre is more expensive than a turning behaviour). When the energy level is reduced below a certain level, they become `hungry' and must find the energy source. Unless controlled by an external agency (i.e. the djinnski), the desire to feed overrides all other behaviours. We can build a hierarchical set of behaviours which define how the minskis move around the environment:

Default: this behaviour is sanctioned only if no other action is initiated and relies upon no perceptual information. It requires that the agent simply continues moving in the current direction with the current velocity;

Stop: a base level behaviour that brings the minski to an immediate halt;

Start: a base level behaviour that causes the agent to move;

Turn: a base level behaviour that actuates a 90 degree change in direction, either to the left or right;
Up: a base level behaviour that activates a momentary boost of the vertical thruster; typically used to escape from ditches or other situations in which the agent is trapped;

Accelerate: a second level behaviour causing an increase in velocity;

Decelerate: a second level behaviour causing a decrease in velocity;

Reverse: a second level behaviour causing a 180 degree change in direction;

Wander: a top level behaviour causing the agent to move around the environment through some arbitrary combination of reversing, turning, acceleration, deceleration, etc.;

Feed: a top level behaviour allowing the agent to move towards the energy source and replenish energy levels;

Leave: a top level behaviour that allows the agent to move towards the exit door and leave the environment.

These behaviours will be activated through the use of perceptual information. For example, if an agent senses objects in front, behind and to the right, turning left will be the most appropriate behaviour. If the agent senses objects in front, to the left and right, but not behind, reversing will be an appropriate behaviour. High-level behaviours, such as wandering, can only be activated if no object is in close proximity, and this allows the agent to move to more `interesting' sectors of the environment. Initially these behaviours will be implemented in a very shallow manner; further deepening of the architecture, the environment and the behaviours may require a more natural modelling of these behaviour forms, with actions becoming durative rather than discrete (i.e. acceleration and deceleration can become more gradual temporal processes).
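The activation logic just described can be sketched as follows. This is a Python illustration of the selection idea only, not the rule-based Pop-11 implementation; the function name propose_behaviour and the direction labels are assumptions.

# Sketch of how perceptual information activates movement behaviours.
# `blocked` is the set of directions (relative to the agent) in which an
# object is sensed in close proximity; names are illustrative assumptions.

def propose_behaviour(blocked, hungry):
    """Return one behaviour proposal from the hierarchy described above."""
    if not blocked:
        # nothing in close proximity: high-level behaviours become available
        return "feed" if hungry else "wander"
    if {"front", "left", "right"} <= blocked and "back" not in blocked:
        return "reverse"
    if {"front", "back", "right"} <= blocked and "left" not in blocked:
        return "turn_left"
    if {"front", "back", "left"} <= blocked and "right" not in blocked:
        return "turn_right"
    if "front" in blocked:
        return "stop"
    return "default"      # continue with the current direction and velocity

print(propose_behaviour({"front", "back", "right"}, hungry=False))  # turn_left
print(propose_behaviour(set(), hungry=True))                        # feed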
3.4 Description of the second level of agency: Paraminskis
Paraminskis are minski agents that make use of an extra (perceptual, internal and action-related) processing level related to explicitly modelled motivational (or goal-oriented) states. There are two subclasses of paraminski: one that simply models the hunger drive; and a second that models character changes associated with a form of mating (or teaming) behaviour. The second class of paraminski consists of two subtypes: calm agents, which are disturbed when in a crowded area (and unattached); and vigorous agents, which become frustrated when rejected by a possible mate (e.g. a disturbed agent fleeing an overcrowded local environment) or protective (in any scenario) when attached to another agent. As with the higher level minski behaviours, the behaviours associated with these motivational states can be built up from the more primitive actions already described. The structure and activating rules for the generation of these motivational states are discussed in more detail below. An obvious difference between the minskis and paraminskis relates to how the need to renew energy levels is achieved. This is more hazardous (indeed potentially life-threatening) for the minskis for two reasons: firstly, they do not possess the perceptual mechanisms to discriminate between the change states of the energy dispenser; furthermore, they do not possess the capability to autonomously defer some action (such as approaching the energy source when it is dangerous to do so).
3.5 The information processing ontology for the minski agents
Here we present an overview of the types of information processing used in the minski agents. In particular we present some propositional statements, a definition of important agent-defining attributes, and the type of pattern matching rules used to define the behaviour of an agent. Note that in the following sections, propositional constraints with the same identifier have the same meaning when associated with different predicates. Four predicates relate information about the environment: new_sense_data is used to represent all sensory information (and is tied into the sensing methods used in our toolkit); objects that are within the perceptual field of the perceptual modalities available to an agent give rise to the following structured perceptual propositions.

P0 propositions relate not only to the perceptual modality but also give a sensed object's relative position, and take the form:

    P0(mode, object_identifier, position, range, x, y)
        where position ∈ [Front, Back, Left, Right]

P1 propositions extend P0 propositions by giving information about the object's relative motion:

    P1(mode, object_identifier, position, movement, range, x, y)
        where movement ∈ [Away, Towards] and mode ∈ [vision, hearing, magnetic]

P2 propositions supplement P1 propositions with information about the type of object sensed:

    P2(mode, object_identifier, position, movement, class, range, x, y)
        where class ∈ [Agent, Wall, Ditch, Hazard, Door, EnergySource]

These predicates are meant to relate to different levels of processing found in biological perceptual systems; the section on architectures further highlights this. There are a number of further database propositions used in the minski agents; here we shall introduce a small number that are related to actions, action potentials and the decision process (see the subsequent section):

    Act(behaviour) refers to the current behaviour type,
        where behaviour ∈ [start, stop, turn, up, accelerate, decelerate, reverse, feed, wander]

    Action(level, behaviour, token) represents a specific behaviour potential,
        where level ∈ [0, 1, 2] and token is a unique identifier for each statement

    Weight(level, behaviour, real) represents the weight associated with a particular behaviour

The status of a minski is by default `Active'; the following rules define how this can be changed:
    IF (Energy ≤ 0) THEN Status(Dead)
    IF (Trapped) THEN Status(Passive)
    IF (Energy > 0) AND NOT(Trapped) THEN Status(Active)

Further pattern matching rules are used to define the various behaviours required of the minskis. Appendix B provides a full ontology of the propositions and pattern matching rules governing the behaviour of these agents.
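One possible way of holding these database items and status rules is sketched below in Python. This is purely illustrative (the implementation encodes them as poprulebase database items and rules, as in Appendix B); the record layout and field names are assumptions.

# Sketch of the minski database items and the Active/Passive/Dead status rules.
# The dataclass fields mirror the P0-P2 predicates and the Action entries
# described above; the layout is an illustrative assumption.
from dataclasses import dataclass

@dataclass
class P2:                      # P0 and P1 carry the same fields minus the later ones
    mode: str                  # vision, hearing or magnetic
    object_id: str
    position: str              # Front, Back, Left or Right
    movement: str              # Away or Towards (added at the P1 level)
    cls: str                   # Agent, Wall, Ditch, Hazard, Door, EnergySource (P2)
    range_: float
    x: float
    y: float

@dataclass
class Action:                  # a specific behaviour potential
    level: int                 # 0, 1 or 2
    behaviour: str             # start, stop, turn, up, accelerate, ...
    token: int                 # unique identifier for each statement

def status(energy, trapped):
    """The three status rules given above, applied in order."""
    if energy <= 0:
        return "Dead"
    if trapped:
        return "Passive"
    return "Active"

print(status(energy=0, trapped=False))   # Dead
print(status(energy=5, trapped=True))    # Passive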
3.6 The information processing ontology for the paraminski agents
The paraminski agents extend the processing used in the minskis in a number of ways. The perceptual predicates are extended to include
P3 propositions, which add information about the intentional state of the perceived object (i.e. its
type of behaviour or whether it is potentially dangerous):

    P3(mode, object_identifier, position, movement, class, intention, range, x, y)
        where intention ∈ [Active, Vigorous, Calm, Frustrated, Protective, Disturbed, Hostile, Passive, Dead]

The set of propositions (and parameter values) for paraminskis is extended to allow:

    Character(sub_type) where sub_type ∈ [Active, Emotional]
    Emotion(E_state) where E_state ∈ [Disturbed, Frustrated, Calm, Protective, Vigorous]
    MState(state, object_identifier) where state ∈ [Hunger, Pursue, Flee]
    Partner(partner) where partner ∈ [F, agent_identifier]
    Perceptual_space(value) where value ∈ [Crowded, OK]

These extra propositions and attributes are related to the motivational (and goal-oriented) processing of the paraminskis. The two types of paraminski allowed (Active and Emotional) are denoted by the Character predicate. The emotional paraminskis can be in a number of emotional states (by default either Calm or Vigorous). Active paraminskis have no goal-oriented state other than hunger. Calm paraminskis become disturbed when their perceived space becomes overcrowded, or more generally when they sense danger (i.e. hazards) or when they become trapped (e.g. in a ditch). Vigorous paraminskis become frustrated when their perceived space becomes overcrowded or when their attempts to team up with calm agents are rejected. Vigorous agents (when teamed up with a calm paraminski) become protective in situations where an unattached calm paraminski would be disturbed; this is sufficient for the calm paraminski to remain calm. Some character changes are associated with motivational states, according to the following rules:

    IF (Trapped) AND Emotion(Calm) THEN Emotion(Disturbed)
    IF Perceptual_space(Crowded) AND Emotion(Calm) THEN Emotion(Disturbed)
    IF Emotion(Disturbed) AND Partner(Protective_agent) THEN Emotion(Calm)
    IF Perceptual_space(OK) AND Emotion(Disturbed) THEN Emotion(Calm)

The character transitions act as triggers to the generation of motivational states. For instance:

    IF Emotion(Disturbed) THEN Generate_goal(Flee)
    IF Emotion(Vigorous) AND Near(Agent, Disturbed) THEN Generate_goal(Pursue(Agent))

A motivational state is an internal or environmental state that an agent can achieve; for example, an agent increasing its energy level is a `desired' internal state, while an agent fleeing some object(s) is a `desired' environmental state (which also has an `internal' state transition). Goals are related to motivational states, and may be nested. The top-level goal associated with the motivational state of hunger is to feed from the energy source; this may require subgoals such as locating and moving to the energy source. Goals are represented using a structured data object. Earlier work has discussed the possible requirements for goals and how they are to be used (see [2, 7]). Among the more important attributes are:

A list of preconditions for the goal to be generated; related to this, a list of satisfied preconditions (for example, `paraminski1 has low energy level') and a motivational attitude related to the propositions (e.g. make false).

A set of fuzzy values for goal importance (e.g. high, medium, low), goal urgency (e.g. within 5 cycles) and goal intensity (e.g. high, medium, low).

Subgoal or plan factors, such as a list of sub-goals or plans, and the other agents involved.

Status information, such as the commitment status (e.g. one of [unknown, adopted, rejected, ignored]) and the dynamic state (e.g. one of [passive, postponed, active, failed, successful]).
Figure 4: Behaviour Based AutoCognition Module (with an action potential blackboard)
Initially, in the work described here, we shall (considerably) limit the goal attributes (descriptors in Beaudoin's work [2]), in line with our BVVS approach. This means that a very simple preference function, based on goal attributes, will be used to decide between opposing goals (we will not allow multiple goal selection for the paraminskis). The goal attributes used in the preference function (importance and insistence) will take symbolic token values (i.e. Low, Medium and High). A number of researchers [2, 14] have cited the need for more sophisticated (time-related) descriptors. We do not dispute the validity of their cases; we are simply adopting a minimal implementation depth. Goals can be merged if they differ only in the time of their generation. For example, two hunger goals, one generated in cycle i which has not been satisfied by cycle j, when another goal is generated, can be combined. Combining goals increases the token values in the following way:

    Low + Low -> Low
    Low + Medium -> Medium
    Low + High -> Medium
    Medium + Medium -> Medium
    Medium + High -> High
    High + High -> High

Other goals, e.g. `flee from agents X, Y, Z' and `flee from agents W, X, Z', cannot be combined. Further work will no doubt show the inadequacies of our current goal representations and make greater use of the work explored in [2]. Appendix D provides a full ontology of the propositions and pattern matching rules governing the behaviour of these agents.
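A minimal sketch of this merging scheme, assuming goals are held as simple records, is given below in Python (the implementation itself uses Pop-11 rule sets); merge_goals and the field names are assumed for illustration, and only the importance/insistence combination table above is encoded.

# Sketch of merging two goals that differ only in their generation time.
# The combination of symbolic token values follows the table given above.

ORDER = {"Low": 0, "Medium": 1, "High": 2}

def combine(a, b):
    """Combine two importance/insistence tokens according to the table."""
    if {a, b} == {"Low", "High"}:
        return "Medium"                       # Low + High -> Medium
    return a if ORDER[a] >= ORDER[b] else b   # otherwise the higher token wins

def merge_goals(g1, g2):
    """Merge two goals of the same kind generated at different cycles."""
    assert g1["type"] == g2["type"] and g1["target"] == g2["target"]
    return {
        "type": g1["type"],
        "target": g1["target"],
        "importance": combine(g1["importance"], g2["importance"]),
        "insistence": combine(g1["insistence"], g2["insistence"]),
        "generated": min(g1["generated"], g2["generated"]),   # keep the older cycle
    }

hunger_i = {"type": "hunger", "target": None, "importance": "Low",
            "insistence": "Medium", "generated": 12}
hunger_j = {"type": "hunger", "target": None, "importance": "Low",
            "insistence": "High", "generated": 40}
merged = merge_goals(hunger_i, hunger_j)
print(merged["importance"], merged["insistence"])   # Low High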
3.7 Architectural Considerations
Here we will consider how the earlier simple reactive agent architecture can be extended to allow a deepening of the perceptual, behaviour-based, and action processing modules described above. There are a number of architectures that we might consider.
Figure 5: Touring Machine Type Architecture

In investigating appropriate agent architectures for the scenarios we are interested in, we should aim to address questions such as the following. How can different kinds of learning be integrated in these architectures? Are there different types of actions, responses and situations at different levels in an information processing (agent) architecture and, if so, in which circumstances is one (or more) behaviour more appropriate? How does an agent choose between behaviours, given that some may be incompatible? What types of control systems (e.g. feedback) do we require to model the required agent behaviours? Related to the control issue are questions such as how certain behaviour sets become over-ruled or interact. For example, in response to certain perceptual stimuli, the internal processing nodes may specify that both accelerate and turn behaviours are appropriate. Do we allow the agent to choose just one of these (and if so, how?), or an interaction resulting in more complex behaviours? A further requirement is how the architecture can be changed through learning (or training) so that behaviour modules that are initially used at some high level of processing are subsumed at a lower, reactive level. This would equate to learning some set of actions that would give faster reaction times and an improved survival rate.

One possibility is an integration of the behaviour-based approach with a more classical AI blackboard approach (in a manner not dissimilar to that often proposed by Hayes-Roth [10]). Figure 4 depicts an abstract architecture that embodies these ideas. A number of (simulated) concurrent behaviours are allowed access to the sensory information, each (possibly) producing its own action potential. The behaviours can range from the very simple (e.g. continue moving in the current direction) through to the more complex, such as avoidance and exploratory behaviour. However, this differs from orthodox subsumptive architectures (where there are direct links between different behaviours) in that an explicit (symbol-based) action potential blackboard is used. The agent decides on the most appropriate (set of) posted behaviours on the basis of some decision process (for example, which behaviour subsumes the most action potentials).
Figure 6: A four layer hybrid subsumption architecture for simple agents. The perceptual and action processes do not have to be symmetrical as shown here.

An alternative approach using the same behaviour modules is shown in figure 5. Here a decision (processing) node, for example the context-activated control rules in Touring Machines [8], is used to override certain behaviours. This high level module (or agency) is responsible for switching behaviour nodes on and off. One criticism of this type of control is that it tends to preclude low-level (instinctual/reflexive) responses. A criticism of both of these classes of architecture is their use of a flat perceptual system, with no discrimination between the types of perceptual information needed by the different behaviour nodes. Even simple biological mechanisms tune the perceptual information to the type of input appropriate to, or expected by, the processors responsible for different behaviours. For example, the sensory information passed to a frog's fly-catching behaviour will be insufficient (and different) to that of use to its predator-avoidance behaviour (and vice versa). Similar analogies can be drawn with the sensory processing of higher level organisms. To simply stop and not collide with another object in the environment, it is only necessary to consider the object's relative location (and direction); to perceptually determine something more about that object (for example, its potential danger) requires further processing. A close read of Brooks [4] shows that different sensor types are used at different levels within his subsumptive architectures; he is in fact simply bypassing the structuring of perceptual information by placing this structuring at the sensor level.

The architecture that we are proposing for the two lower levels of agency (in our simulation experiments) is a combination of these ideas and the behaviour-based approach [4, 5], but addresses the above criticism of flat perceptual architectures. Figure 6 shows how a small set of different behaviour or information processing levels can be put together. This architecture makes use of a structured perceptual system that allows more detailed and more richly described perceptual events to be used by higher level processing. This works as follows (across all three perceptual modalities, with some slight exceptions). At the base level (P0), the agent senses an object immediately ahead, immediately to the left, immediately to the right and (with the exception of vision) immediately behind. At the second level (P1), the agent senses something moving left to right, right to left, approaching or moving away. At the third level (P2), the agent can detect agents and makes use of structured information about their direction of movement and current position. At the fourth level (P3), i.e. the paraminski agents, the agent can detect agents and makes use of structured information about their direction of movement and more abstract information (e.g. whether the sensed agent is a potential source of danger, or harmless). We can model the different sets of behaviours (described above) as individual (and independent) processing nodes (implemented using symbolic rulesets). Although we have placed similar levels of processing together in distinct layers, these are not connected; their commonality is the use of a specific perceptual processing output. These processing nodes can be viewed as concurrent activities (although in our toolkit we can only simulate concurrency).
3.7.1 Making decisions about actions

Our adopted approach to modelling the agents can cause problems, in that we need to provide some means of deciding between conflicting potential actions. For example, the agent cannot reverse, turn left and turn right at the same time. A number of possible solutions exist. Initially we can use a `winner-takes-all' strategy, whereby the generated potential action with the highest summed weight is the preferred action. For example, given the following database entries:
    weight(0, stop, 0.25)      default weight associated with the stop behaviour
    weight(0, turn, 0.25)      default weight associated with the turn behaviour
    weight(1, reverse, 0.50)   default weight associated with the reverse behaviour
    action(0, turn, 1)         one action potential for the turn behaviour
    action(0, turn, 2)         a second action potential for the turn behaviour
    action(0, turn, 3)         a third action potential for the turn behaviour
    action(0, stop, 4)         one action potential for the stop behaviour
    action(1, reverse, 5)      one action potential for the reverse behaviour
the summing of weights for the action potentials would result in:

    weight(0, stop, 0.25)      summed weight associated with the stop action potentials
    weight(0, turn, 0.75)      summed weight associated with the turn action potentials
    weight(1, reverse, 0.50)   summed weight associated with the reverse action potentials

resulting in the adoption of the turn behaviour.
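The same winner-takes-all computation can be sketched in Python as follows (a hedged illustration only; the implementation performs this with poprulebase rules over the agent database): each posted action potential contributes its behaviour's default weight, and the behaviour with the greatest summed weight is adopted.

# Sketch of the winner-takes-all action decision over posted action potentials.
from collections import defaultdict

# default weights, keyed by (level, behaviour), as in the database entries above
weights = {(0, "stop"): 0.25, (0, "turn"): 0.25, (1, "reverse"): 0.50}

# action potentials posted by the behaviour nodes this cycle: (level, behaviour, token)
actions = [(0, "turn", 1), (0, "turn", 2), (0, "turn", 3), (0, "stop", 4), (1, "reverse", 5)]

def decide(weights, actions):
    """Sum the weight of every posted potential and pick the heaviest behaviour."""
    summed = defaultdict(float)
    for level, behaviour, _token in actions:
        summed[(level, behaviour)] += weights[(level, behaviour)]
    return max(summed, key=summed.get), dict(summed)

winner, totals = decide(weights, actions)
print(winner)   # (0, 'turn') with summed weight 0.75, so the turn behaviour is adopted
print(totals)   # {(0, 'turn'): 0.75, (0, 'stop'): 0.25, (1, 'reverse'): 0.5}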
4 Preliminary Experiments With The Architectures

We have run a number of simple experiments that compare the different agent architectures and their processing. The scenarios used for these experiments make use of the creche environment as described above and various combinations of the energy source, minski and paraminski agents. Obvious comparison statistics include the death rate for different agents (i.e. how many toolkit scheduling cycles they survive), behaviour losses (i.e. the average rate of behaviours lost due to interference from the dynamic energy source) and feeding rates.
4.1 Hunger Experiments
In the first experiments we compare the effect of architectural and experimental parameters on the behaviour of one agent, plus the energy source, over multiple runs. The death, behaviour loss and feeding rates provide a means of comparing the different agent architectures. Table 1 gives experimental results comparing four closely related agent architectures: the minski, with identical plans for feeding as the paraminskis but implemented as a level 2 behaviour activity; the active paraminski (paraminski1), which is the minski plus the paraminski processing layer but only for hunger goals; the calm paraminski (paraminski2), which also allows the explicit goal of fleeing the dangerous energy source in addition to hunger goals; and the vigorous paraminski (paraminski3), which also allows the explicit goal of pursuing unattached paraminski agents in addition to hunger goals.
Hunger Experiment (each architecture run 1000 times)

    Attribute   Architecture    Min   Max    Mean   Variance
    Age         minski            3    28   10.14     2.27
    Age         paraminski1       5   166   23.94    17.83
    Age         paraminski2       5    31   18.28     4.86
    Age         paraminski3       3   155   23.85    14.64
    Losses      minski            0     4    0.06     0.35
    Losses      paraminski1       0    15    1.62     1.76
    Losses      paraminski2       0     4    1.00     0.57
    Losses      paraminski3       0    15    1.65     1.63
    Feeds       minski            0     2    0.05     0.22
    Feeds       paraminski1       0     6    0.19     0.73
    Feeds       paraminski2       0     0    0.00     0.00
    Feeds       paraminski3       0     5    0.14     0.53

Table 1: Experimental results for hunger experiments using the four different agents
In each case, the experiment involves siting the energy source at the centre of the room, with the single agent randomly placed in the room. For these experiments, the energy source's initial energy level is decreased and the agent's energy use increased, to increase the dynamics of the experiment. The scenario is run until the agent dies, and statistics for its age at death, the number of behaviours lost and the number of successful feedings are kept. The agent can die from three causes: its energy level reduces to zero (i.e. it is unable to feed); its set of possible behaviours is reduced to zero (i.e. it has got too close to the moving energy source too many times); or it is `run over' by the energy source. The agent might not be able to feed for at least three main reasons: it never manages to find (i.e. sense and move to) the energy source; it can sense the energy source but cannot reach it, because the energy source is moving at a greater velocity than the agent; or (in the case of the paraminski2 agent) the flee goal overrides the hunger goal sufficiently often that the agent's energy level is reduced to zero.

These results show a number of interesting things. There is very little difference in the (mean) age of paraminski agents of the type one and type three architectures, but type two paraminskis (with the flee behaviour) do not live as long; the minskis can expect to survive for an even shorter period. There is also a greater variation in the life expectancy for the type one and type three paraminskis. In the minskis, no matter how low their energy state, other behaviours (e.g. turn or reverse) can be preferred to the feed behaviour because of the action decision process. This can explain the poorer age when compared to the paraminski type one agent, which has an explicit feeding goal that overrides other behaviours. The paraminski type three architecture shows a very similar increase. The paraminski type two architecture, while having a better survival rate than the minski agent, does not perform as well as the other two paraminski agents on either survival rate or feeding occurrences (in fact it never fed in any of the 1000 runs!). This must be an effect of the flee goal overriding the hunger goal (it has a greater significance value than the hunger goal): the agent flees the dynamic energy source (reducing its chance to feed), but in doing so reduces its life expectancy.
The behaviour losses statistic supports this interpretation: the ratio of behaviour losses to age (approximately 2.9) is much lower than for the other paraminski agents (approximately 5.8). We can therefore conclude that the decision process used in the minski agents is unsuitable for all types of behaviour, and that by simply making some high level behaviours explicit (as in the paraminski type one architecture) a better survival rate follows. However, there is obviously scope for improvement in the decision process used for deciding between competing goals in the paraminski agents; otherwise the type two paraminski would perform better than the type one paraminski, since it has the extra safety (survival) goal of keeping itself away from danger. The conflict between the feed and flee goals is highlighting a deficiency in either the goal representation scheme, or its processing (or some combination of the two). Further work on deliberative goal processing at the djinnski level, and greater sophistication in goal representation, may provide better results. The lack of any significant difference in using the paraminski type three architecture (rather than the type one) is consistent with the experimental situation (pursue goals are never generated).

Multi Agent Experiments (each run 1000 times)

    Surrounding    Subject            Age                       Left
    Situation      Architecture    Min   Max    Mean    Var   (% of agents)
    minski         minski            3    42    7.36   3.64     0.0
    minski         paraminski1       5   205   55.84  29.33     3.2
    minski         paraminski2       5   122   54.51  27.20     1.4
    minski         paraminski3       5   197   55.37  27.59     3.3
    paraminski1    minski            3    23    7.41   2.91     0.0
    paraminski1    paraminski1       5   163   28.01  15.46     1.2
    paraminski1    paraminski2       5   127   17.88   8.12     0.1
    paraminski1    paraminski3       5   187   28.03  15.23     0.7
    paraminski2    minski            3    23    7.30   2.89     0.0
    paraminski2    paraminski1       5   159    9.31  15.75     1.6
    paraminski2    paraminski2       5    51   17.47   7.02     0.0
    paraminski2    paraminski3      11   107   21.05   4.20     0.0
    paraminski3    minski            3    31    9.14   3.47     0.0
    paraminski3    paraminski1       5    91   29.00   8.30     1.2
    paraminski3    paraminski2       7    76   25.68  10.94     4.8
    paraminski3    paraminski3      27   107   31.67   4.62     0.0

Table 2: Experimental results for multi agent experiments
4.2 Multi Agent Experiments
In the second set of experiments, five agents and the energy source are located in one room. The behaviour set for all agents is slightly extended here, in that they are allowed to leave the room once they have reached a specific age; in the case of the minskis this is just another second level behaviour, while for the paraminskis it is a further explicit goal. In all cases, four of the agents are of one type (i.e. minski or one of the paraminski subtypes), placed in specific locations near a fifth agent. The class of the fifth agent is varied for each of these situations. The initial arrangement of agents is such that, with specific combinations of agents, pursue and flee (other agents) goals are generated.
Figure 7: Histogram of (mean and standard error) age for agents in the four arranged scenarios: (a) non-escaping agents, (b) escaping agents.

Again, statistics are collected for each agent's age (either when it dies or when it leaves the experimental room). Each of these scenarios is run 1000 times. Table 2 displays the mean and variance of the age (at the point when the agent either left the environment or was killed) for all of these situations, plus the percentage of agents that managed to leave the room. The most obvious statistic is how the life expectancy varies across the different scenarios for the paraminskis, but not for the minskis. For example, in the situation with four surrounding minski agents, age (for the paraminskis) increases over that obtained in the single agent experiments. It is reduced for the minskis relative to the single agent experiments, presumably because the greater overall energy consumption rate increases the danger level of the environment (the energy source reaches its recycle threshold sooner and more often). While the life expectancy for the paraminskis is always greater than for the minskis, there is a dramatic decrease when more than one paraminski is used. There must be some interplay between the effects of the different goals being pursued by the agents and the overall demands placed on the environment. Figures 7(a) and 7(b) graphically show the results with the ages of the non-escaping and escaping agents separated. The first thing to notice is that no minski-type agent ever manages to leave the environment. Secondly, although the age varies very little for the basic paraminski agent (i.e. architecture 2, "A2" in the histograms) over the three situations involving different forms of paraminski-type agents (situations 2, 3 and 4), there is more variation for the multiple-goal paraminskis (i.e. architectures three and four). Furthermore, there is very little variation in the age of "A2" agents, whether they escape or not. Figures 8(a) and 8(b) graphically show the results from rerunning this multi-agent experiment, but with all agents (initially) randomly placed. The curious result of a longer life expectancy for the paraminskis when sharing the environment with the minskis is still found. While this change in initial situation shows very little change in the values obtained for the age of non-escaping agents (figures 7(a) and 8(a)) for the four agent types in any of the four scenarios, there is a difference in the values obtained for escaping agents.
Figure 8: Histogram of (mean and standard error) age for agents in the four random scenarios: (a) non-escaping agents, (b) escaping agents.

In the second scenario (shown in histogram 8(b)), there are greater numbers of escaping agents.
5 Future Work

Future developments of the work presented here can be split into two main areas: extending the complexity of the scenario by implementing the djinnski, metadjinnski and minder agents; and investigations into learning for the situated agents. The development of the djinnski agents will initially make use of some simple deliberative behaviours (e.g. goal conflict resolution) and a means of controlling the minskis to perform specific tasks. One avenue of research that is pertinent to our research group's long-term activities is to further address motivator control states. Further development of our architecture will also need to address the nature of the higher (meta-management) reflective processes. This may include features such as the evaluation of behaviour and long-term goals with regard to niche roles, what niche roles are, and how they develop and influence cognitive behaviour. We have other interesting problems to solve, such as how the differing perceptual information (from the minskis) can be combined to give a globally centred descriptive model of the environment rather than a number of agent-centred descriptions. There are a number of possible types of learning that are of interest to us, and our developing architecture and its underlying theory need to be able to account for these. For example, we can consider:

1. How are existing (instinctive or reactive) behaviours within the minskis fine-tuned? For example, the association of certain classes of perceptual space description with actions to perform in the situations so described. We will therefore have to take a closer look at existing adaptive agent architectures.
2. How are tuned behaviours (in higher level processes) subsumed into lower levels where, for example, faster reaction times are possible?

3. The development of new `low-level' behaviours, perhaps in a way analogous to `chunking' in the SOAR architecture [12]; for example, the combination of certain existing capabilities in a specified form to solve specific (internal and external) environmental problems, e.g. looking for the energy source and moving towards it once discovered.

4. The abstraction over the rational space of the deliberative mind to find better ways of managing complex and (possibly) conflicting tasks in a resource-limited architecture.
6 Acknowledgements

This research is funded by a University of Birmingham internal grant. Thanks to members of the Cognition and Affect group for discussions related to this work and earlier versions of this paper.
Figure 9: Hierarchy of classes for objects in the new robotic creche (classes shown: sim_object, sim_env, sim_M0, sim_M1, sim_wall, sim_room, sim_ditch, sim_door, sim_exit, sim_entry, sim_hazard, sim_esource, sim_recycle)
A Guide to the implementation

Here I show a step-by-step process for moving towards an implementation of the scenario discussed above, which is described in further detail in the following appendices.
A.1 Define the class ontology
When designing an agent scenario, it pays to be careful in designing an appropriate class hierarchy, to avoid duplication of slots and objects much bigger than necessary. A more important eect of good hierarchy design is that method inheritance will work eciently (i.e. the call next method function on pop11 object class). So once, an approximate idea of what is required is in place, we can design classes of objects and agents required, making all of them subclasses of the classes provided in the SIM AGENT toolkit. For example, see 9 for a diagrammatic sketch of the object hierarchy for all objects in the environment. Figure 10 shows the equivalent class hierarchy for agents in this partial implementation of the collective minder experiment. Note that some classes (sim M0 and sim M1) are common to both set of objects, and that all are de ned on the base SIM AGENT agent (sim object). The main dierence between sim M0 and sim M1 objects are that the former have slots related to physical extent (i.e. length, width, minimum and maximum x and y co-ordinates), while the latter have slots more appropriate to dynamic moving objects (i.e. velocity, direction etc.). The following gives actual code examples showing how parts of this class hierarchy are actually implemented using the SIM AGENT toolkit. The base object is sim M0 which is de ned as a subclass of sim object. ;;; This is subclass of sim_object with extra slots ;;; All environmental objects in this domain use this or ;;; a further subclass of it. define :class sim_M0; is sim_object; slot len = 0; ;;; This is vertical on the screen slot width = 0; ;;; This is horizontal on the screen slot height = 0; ;;; Coming out of the screen! slot minx = 0; ;;; Minimum horizontal position within environment
Figure 10: Hierarchy of classes for agents in the new creche (classes shown: SIM_OBJECT, SIM_M0, SIM_M1, SIM_M2, SIM_MINSKI, SIM_PARAMINSKI1, SIM_PARAMINSKI2)
The other base object in this work (sim_M1) is also defined upon the toolkit's sim_object.
;;; This is a subclass of sim_object with extra slots.
;;; All objects in this domain use this or a further subclass of it.
define :class sim_M1; is sim_object;
    slot sim_status = 'Passive';    ;;; or 'Active'
    slot energy = 0;
    slot posx = 0;
    slot posy = 0;
    slot vel = 0;
    slot mass = 0;
    slot orientation;               ;;; North, South, East, West
    slot sim_sensors = [{sim_sense_agent 1000}];
    slot senses = [];               ;;; Senses associated with object
    slot sim_cycle = 0;             ;;; Current age (in toolkit scheduler cycles)
    slot graphicid = 0;             ;;; Display id in graphics world
    slot debuglevel = 0;            ;;; Used to limit extent of text output and debugging information:
                                    ;;; 0 - none
                                    ;;; 1 - minimal data at start and end of sim_run_agent
                                    ;;; 2 - 1 plus data at start of each rulebase
                                    ;;; 3 - as above but shows rules being used (without pattern matching)
                                    ;;; 4 - as above but includes prb_show_conditions
                                    ;;; 5 - everything including prb_walk
enddefine;
;;; Default room type: an open space unless one of its subclasses
define :class sim_room; is sim_M1; is sim_M0;
    slot walls = [];      ;;; Walls that define the room
    slot rooms = [];      ;;; Adjacent rooms
    slot doors = [];      ;;; Doors associated with the room
    slot hazards = [];    ;;; Hazards existing in the room
enddefine;
;;; Mini-bots etc. are subclasses of this
define :class sim_M2; is sim_M1;
    slot act = [];                  ;;; Current (unchanged) action
    slot act_history = [];          ;;; History of actions
    slot lost_acts = [];            ;;; Behaviours lost to agent
    slot dyna_mesh = [];            ;;; Weights associated with behaviour nodes
    slot effectors = [];            ;;; List of effectors
    slot hungry = false;            ;;; If true needs to replenish energy source soon
    slot senses = [[vision 150.0]];
enddefine;

;;; Function definition encapsulating all the minski processing node rulesets
define sim_active_rules();
    [
        ;;; Perceptual rulesets
        sim_perceptrules sim_p0rules sim_p1rules sim_p2rules
        ;;; Level 0 behaviour nodes
        sim_l0_start sim_l0_stop sim_l0_turn sim_l0_up sim_l0_clear
        ;;; Level 1 behaviour nodes
        sim_l1_accelerate sim_l1_deccelerate sim_l1_reverse sim_l1_clear
        ;;; Level 2 behaviour nodes
        sim_hunger sim_l2_wander sim_l2_clear
        ;;; Decision behaviour nodes
        sim_mE_act_decide sim_mE_mesh sim_mE_act0 sim_mE_act1 sim_mE_act2
    ]
enddefine;

define :class sim_minski; is sim_M2;
    slot character = 'Active';
    slot held = false;
    slot sim_rulesets = sim_active_rules;
    slot sim_status = 'Alive';
    slot voice = 'Silent';
enddefine;
A.2 Define the sensor methods and do action methods for the classes
This includes defining internal formats for sensory information and action specifications. The toolkit provides a default sensing method for its agents and objects. We make use of this, with slight variations to some of the sensing methods, to give rise to predicates such as:
[new sense data ?OBJECT? ?DISTANCE?]
where ?OBJECT? is a reference to some sensed object, and ?DISTANCE? is the distance away. In the minski and paraminski agents, these are used as the basis for the perceptual system, implemented as a number of hierarchical rule sets. It is the output from these perceptual systems that is used to govern and affect cognitive and behavioural processing. On the other hand, the `internal processing' of the energy source makes direct use of the predicates placed in the agent database by the toolkit sensing methods. One of the easiest ways of mapping required actions onto actual actions is through the use of the do statement embodied in the syntax of Poprulebase (used by the SIM_AGENT toolkit). For example, in the minskis' decision rulebase (sim_decide.p), the mapping from a decision to do something (e.g. turn left) makes use of the following rule (and method):

;;; Rule to map a decision to turn left onto the sim_agent toolkit action mechanism
define :rule act_r6 in sim_mE_act0
    [Self ?agt0]                 ;;; First condition allows reference to current agent
    [action 0 turn_left]         ;;; Second condition refers to internal decision to turn left
    ;
    [do sim_turn_left ?agt0]     ;;; Make use of the sim_agent "do" statement to sanction an action
    [STOP]
enddefine;

;;; Stop method for all agents (sim_M2 and subclasses)
define :method sim_stop(agt:sim_M2);
    ;;; Stopping consumes 1 (one) energy unit
    energy( agt ) - 1 -> energy( agt );
    ;;; If flag for debug messages is set then print out a statement
    if (agt.debuglevel > 0) then
        printf('Stopping an agent: %P\n', [% sim_name( agt ) %]);
    endif;
    ;;; Change agent velocity to zero (0)
    0 -> vel( agt );
    ;;; Change status of agent to 'Static' as opposed to 'Active', 'Dead' etc.
    'Static' -> agt.sim_status;
enddefine;
A.3 Define the (Poprulebase) rulesets
These are used for the internal processing of the different classes of agents, together with the rules for each ruleset. This involves defining the formats for the different kinds of information to be used in the internal databases, e.g. sensor information, beliefs about the environment, motivator structures, plan structures, management information, etc. (an illustrative rule is sketched below). The following appendices give further details.
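As a concrete illustration, the following is a minimal sketch of a perceptual rule in the style of the act_r6 rule shown in section A.2. The rule name, the proximal predicate and the distance threshold are hypothetical; the sketch also assumes Poprulebase's [WHERE ...] condition form and its default treatment of a plain list action as a database addition.

;;; Hypothetical perceptual rule: promote raw toolkit sense data into a
;;; coarser description for use by later rulesets.
define :rule percept_near_r1 in sim_perceptrules
    [new sense data ?obj ?dist]    ;;; raw predicate produced by the sensing method
    [WHERE dist < 50]              ;;; assumed proximity threshold
    ;
    [proximal ?obj ?dist]          ;;; higher-level description added to the database
enddefine;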
A.4 Specify the initial databases
This has to be done for each type of agent. The type of information used in the databases (typically predicates stored as lists, with the head of the list being the predicate) should relate to the pattern matching rules associated with the object (agent). The following appendices give examples of the predicates used for the energy sources and for the minski and paraminski classes of agents. An example of how the database for the minski class of agents is initialised (in its version of the SIM_AGENT method sim_run_agent) is given below:

;;; Method to run the MINSKI class of agents
define :method sim_run_agent(agt:sim_minski, objects);
    ;;; Do not clear the paraminskis' databases: they are already set up
    ;;; with their own predicates in the method for that class of agents
    unless (issim_paraminski1(agt)) then
        ;;; Clear old data out of the database
        ;;; - i.e. minskis carry no data forward to the next cycle
        sim_clear_database( sim_data(agt) );
    endunless;
    ;;; Add predicates special to minskis to its internal database
    prb_add_to_db( [Act ^^(agt.act)], sim_data(agt) );
    prb_add_to_db( [Energy ^(agt.energy)], sim_data(agt) );
    prb_add_to_db( [E_lower ^(agt.ethreshold)], sim_data(agt) );
    prb_add_to_db( [Leave ^(agt.leave)], sim_data(agt) );
    prb_add_to_db( [Orientation ^(agt.orientation)], sim_data(agt) );
    prb_add_to_db( [Position ^(agt.posx) ^(agt.posy)], sim_data(agt) );
    prb_add_to_db( [Self ^agt], sim_data(agt) );
    prb_add_to_db( [Status ^(agt.sim_status)], sim_data(agt) );
    prb_add_to_db( [Velocity ^(agt.vel)], sim_data(agt) );
    ;;; run_senses
    ;;; If there is any sensory data, add it to the database
    unless (senses(agt) == []) then
        sim_add_list_to_db( senses(agt), sim_data(agt) );
    endunless;
    ;;; Move up the method inheritance hierarchy for this method
    call_next_method(agt, objects);
    ;;; Now do any post hoc processing if required
enddefine;
A.5 Associate rulesets with each type of agent
The user needs to define which rulesets are associated with each class of agent, and in what order (in each time slice) they are used. This is given in the class definition for sim_minski above, using the function definition sim_active_rules. It is also possible to define the relative processing speeds associated with each ruleset, by allowing more rules to be used in certain rulesets (if any match) and by the type of pattern matching that is allowed. The default pattern matching behaviour is one rule per ruleset per time slice. By changing the value of prb_all_rules to true, all possible rule matches per database-to-ruleset instantiation can be used. By increasing sim_cycle_limit for any particular ruleset from the default of one, a ruleset can be run for more than one cycle within each time slice.
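For example, a minimal sketch using the prb_all_rules variable as named above (the exact spelling of this Poprulebase control variable should be checked against the toolkit documentation):

;;; Allow every matching rule in a ruleset to fire within a time slice,
;;; rather than the default of one rule per ruleset per slice.
true -> prb_all_rules;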
A.6 Create top level functions
These functions are those that are used by the user in running the agents. They should be responsible for creating the right number of all the required initial instances of the agent classes and putting them into a list to be given to the scheduler. They should also create any other data structures required, and the procedures to access and update them (e.g. a map of the world, if the world itself is not an object). For instance, in the current experiments (see dnd/demos/minskis/README) all objects, agents and other data structures are held in slots on a global variable (an instance of the environmental class of object). This object acts like a blackboard in that it mirrors the current state of the present simulation, and all information related to any simulation is kept there. High level functions (and methods) that run the toolkit scheduler or provide graphical or text information access this object for all their data structures. Changes in the state of the simulation are written to it; for instance, it contains a list of all agents and an initially empty list of deadthings. As the simulation is run and agents die, they are removed from the list in the first slot and added to the list in the second slot. A general schema for this type of function is given immediately below:

define top_level_function( agentspecs, NumberOfCycles);
    Set up agents etc. according to specifications (e.g. number of them) as given by agentspecs
    Do any explicit switching or initialisation of the agents etc.
    IF graphics required THEN perform the necessary startup graphics
    Bundle all agents and objects together into one list (TheLot)
    Call sim_scheduler with TheLot and NumberOfCycles, e.g.
        sim_scheduler(TheLot, NumberOfCycles);
    Any tidying up/output after running it all
enddefine;    ;;; End of the top level function
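A concrete rendering of this schema is sketched below. It is a minimal sketch only: the run_creche name and the make_minskis and make_objects helpers are hypothetical, and only the sim_scheduler call is taken from the toolkit usage shown above.

define run_creche(num_minskis, num_cycles);
    lvars agents, objects, TheLot;
    ;;; make_minskis and make_objects are hypothetical helpers that would
    ;;; construct the required agent and object instances for the experiment
    make_minskis(num_minskis) -> agents;
    make_objects() -> objects;
    ;;; Bundle all agents and objects together into one list for the scheduler
    [^^agents ^^objects] -> TheLot;
    ;;; Run the SIM_AGENT scheduler for the requested number of cycles
    sim_scheduler(TheLot, num_cycles);
enddefine;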
Other high-level functions that are useful are those that provide text and graphical information (through accessing objects and agents held on the environmental object). These can be runtime or post-hoc explanatory functions. The files sim_exp.p and sim_graphics.p provide numerous examples. Other important functions for changing the behaviour of the toolkit for individual classes of agent are: sim_run_agent, an example of which is given above for the sim_minski class of agents; sim_agent_ruleset_trace, which is responsible for the output of messages when any ruleset associated with an agent is run; sim_agent_endrun_trace, which outputs messages at the end of the call to the scheduler for each agent; sim_scheduler_pausing_trace, which outputs messages at the end of each scheduling cycle for each agent; and sim_agent_actions_out_trace, which outputs text messages for any action performed on or by any agent.
B Definition of the energy source

Energy sources are a relatively simple form of agency, defined as a subclass of the SIM_AGENT toolkit objects, and take the following important slots:
Database ::= Set of structured propositions. Data for pattern matching rules.
Dynamic ::= True OR False. Whether it is moving.
Energy level ::= numeric value. Its current energy level.
Magnetic ::= True OR False. Whether it can damage other agents' processing.
Name ::= text identifier. Gives it a name.
Orientation ::= North OR South OR East OR West. Its direction (when moving).
Position ::= numeric pair for coordinate position. Where it is within the environment.
Recycle ::= numeric value. Energy level threshold for replenishment.
Status ::= Active OR Passive. Symbol related to status.
Velocity ::= positive numeric value. How fast it moves (direction given by Orientation).
Propositions allowed in the database are:
Dynamic(truth-value): establishes whether the energy source is dynamic.
Elevel(integer): predicate giving the current energy level.
Ethresh(integer): predicate defining the energy level for recycling.
Magnetic(truth-value): establishes whether the energy source is dangerous.
Position(x-coordinate, y-coordinate): current position.
Proximity(distance): predicate defining the distance within which the source is dangerous.
Self(object-identifier): predicate enabling ease of reference to its own object.
Sense(object-identifier, distance): sensory data giving an object and its distance.
Target(object-identifier, distance, x-coordinate, y-coordinate, direction): identifies the sought object and navigation information.
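These propositions are simply lists in the energy source's Poprulebase database. A minimal sketch of how a few of them might be seeded, reusing the prb_add_to_db call shown in appendix A, is given below; the variable esrc (naming an energy source instance) and the specific values are assumptions.

;;; Hypothetical initialisation of part of an energy source's database.
;;; esrc is assumed to be bound to an energy source instance.
prb_add_to_db([Elevel 200], sim_data(esrc));     ;;; current energy level
prb_add_to_db([Ethresh 50], sim_data(esrc));     ;;; recycling threshold
prb_add_to_db([Dynamic false], sim_data(esrc));  ;;; not currently moving
prb_add_to_db([Position 10 25], sim_data(esrc)); ;;; coordinate position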
The pattern matching rules defining the behaviour of the energy source (once its energy level sinks below a given threshold) are clustered into two rulesets: one (ES movement rules) covering when the energy source should move and in what direction, and the second (ES recycle rules) detailing when to take actions such as leaving the environment, destroying a behaviour node of another agent in close proximity, or destroying a colliding agent. The pattern matching rules for determining whether to move and for controlling the direction of movement are:

IF Elevel(val1) AND Ethresh(val2) AND (val1 < val2) THEN Magnetic( true ) AND Dynamic( true ).

IF Sense( recycle door, distance ) AND NOT(Target( recycle door, , , )) THEN Target( recycle door, distance, xpos, ypos).

IF Target( recycle door, distance, xpos, ypos) AND Position( xval, yval) AND Near( xval, xpos) AND Near( yval, ypos) THEN Target( recycle door, distance, xpos, ypos, Reached).

IF Target( recycle door, distance, xpos, ypos) AND Position( , yval) AND (yval > ypos) THEN Target( recycle door, distance, xpos, ypos, South).

IF Target( recycle door, distance, xpos, ypos) AND Position( xval, ) AND (xval > xpos) THEN Target( recycle door, distance, xpos, ypos, West).

IF Target( recycle door, distance, xpos, ypos) AND Position( xval, ) AND (xval < xpos) THEN Target( recycle door, distance, xpos, ypos, East).

IF Age( val1 ) AND Leave Age( val2 ) AND (val1 > val2) AND Sense( Exit door, dist) AND Near( dist ) THEN LEAVE.

IF Age( val1 ) AND Leave Age( val2 ) AND (val1 > val2) AND Sense( Exit door, dist) AND NOT(Near( dist )) THEN MOVETOWARDS( Exit Door).

IF Age( val1 ) AND Leave Age( val2 ) AND (val1 > val2) AND NOT(Sense( Exit door, )) THEN MOVETOWARDS( Exit Door).
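To relate these abstract rules to the implementation style of appendix A, a minimal sketch of how the first movement rule might be rendered in Poprulebase is given below. The rule and ruleset names are assumed (the paper refers to the ruleset only as "ES movement rules"), and the sketch assumes the [WHERE ...] condition form and the default treatment of a plain list action as a database addition.

;;; Hypothetical rendering of the first movement rule: once the energy level
;;; drops below the recycling threshold, mark the source as magnetic and dynamic.
;;; (Removal of any previous Magnetic/Dynamic entries is omitted for brevity.)
define :rule es_go_magnetic in ES_movement_rules
    [Elevel ?val1]
    [Ethresh ?val2]
    [WHERE val1 < val2]
    ;
    [Magnetic true]
    [Dynamic true]
enddefine;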
D Definition of the paraminski agents

Paraminskis subsume the minski agents and extend their characteristics in a number of ways. The relevant files include those containing code for the minski agents, plus sim_l3_rules.p and sim_paraminskis.p. Their perceptual processing is slightly extended, and their behaviour set includes the explicit consideration of motivational states; this is reflected in the following descriptions. The agent definition is extended over the minski agent definition to include the following important slots:
Character ::= Active OR Emotional. Defines the subtype of the paraminski.
Estate ::= Active OR Calm OR Disturbed OR Vigorous OR Frustrated OR Protective. Symbol value defining the emotional state of the agent.
Goals ::= List of currently active goals.
Motivators ::= The currently active goal(s); can be empty.
Partner ::= True OR False. Boolean value set to true if partnered with another agent.
Senses ::= [Vision, Hearing] OR [Vision, Hearing, Magnetic]. List of perceptual modalities open to the agent.
Status ::= Active OR Passive OR Inert. Symbol related to status.
The extra level of behaviours is specified using pattern matching rules that generate goals. Three (motivational) behaviours are allowed:

hunger:

IF Energy( val1 ) AND Ethreshold( val2 ) AND (val1 < val2) THEN Hungry( TRUE ).

IF Hungry( TRUE ) AND Age( age ) AND NOT(Goal( hunger, age )) THEN generate goal(hunger, age).

flee:

IF Sense( magnetic, energy source) AND Dynamic( energy source ) AND Age( age ) AND NOT(Goal( flee, energy source, age)) THEN generate goal( flee, energy source, age).

IF Emotion( calm ) AND PerceptualSpace( overcrowded, objs ) AND Partner( FALSE) AND Age( age ) AND NOT(Goal( flee, objs, age)) THEN generate goal( flee, objs, age).

and pursue:

IF Emotion( vigorous ) AND Sense( agent ) AND Desirable( agent ) AND Partner( FALSE) AND Age( age ) AND NOT(Goal( pursue, agent, age)) THEN generate goal(pursue, agent, age).

IF Emotion( frustrated ) AND Sense( agent ) AND Desirable( agent ) AND Partner( FALSE) AND Age( age ) AND NOT(Goal( pursue, agent, age)) THEN generate goal(pursue, agent, age).
Goals, generated by these behaviours, are defined using objects with the following slots:

Originated ::= Which cycle the goal came into existence.
Goal type ::= Hunger OR Flee OR Pursue.
Satisfied ::= True OR False.
Importance ::= Low OR Medium OR High.
Insistence ::= Low OR Medium OR High.
Subgoals ::= Rulesets defining actions to be taken.
Actors ::= List of agents and objects involved.
Commitment ::= Low OR Medium OR High.
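These goal objects could be declared with the same objectclass syntax used for the agent classes in appendix A. The following is a minimal sketch only: the class and slot names are assumed, since the paper does not show the actual goal class definition.

;;; Hypothetical declaration of the goal objects described above.
define :class sim_goal;
    slot originated = 0;        ;;; scheduler cycle on which the goal arose
    slot goal_type;             ;;; 'Hunger', 'Flee' or 'Pursue'
    slot satisfied = false;     ;;; true once the goal has been achieved
    slot importance = 'Low';    ;;; 'Low', 'Medium' or 'High'
    slot insistence = 'Low';    ;;; 'Low', 'Medium' or 'High'
    slot subgoals = [];         ;;; rulesets defining actions to be taken
    slot actors = [];           ;;; agents and objects involved
    slot commitment = 'Low';    ;;; 'Low', 'Medium' or 'High'
enddefine;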
The impoverished nature of the importance, insistence and commitment slots (they can only take one of three token values) provides a simple means of choosing between goals where necessary. The subgoals slot defines pattern matching rules (very similar in flavour to the type used by Nilsson [13]) that specify precompiled plans for achieving goals. These are listed below for the four types of motivational goals (the flee behaviour has two separate forms: one related to fleeing the moving energy source, the other related to fleeing an overcrowded perceptual space).
Plans for the hunger goals:
IF Energy( val1 ) AND Ethreshold( val2 ) AND (val1 > val2+15) AND Goal( hunger, age ) THEN SATISFIED( Goal( hunger, age ) ).

IF Energy( val1 ) AND Ethreshold( val2 ) AND (val1 < val2+15) AND Proximate( energy source) THEN Increase( Energy( ) ).

IF Energy( val1 ) AND Ethreshold( val2 ) AND (val1 < val2+15) AND Sense( energy source) AND NOT(Proximate( energy source)) THEN Approach( energy source ).

IF Energy( val1 ) AND Ethreshold( val2 ) AND (val1 < val2+15) AND NOT(Sense( energy source)) THEN WANDER.

Plans for the flee (the dynamic energy source) goals:
IF Goal( flee, energy source, age ) AND NOT(Sense( magnetic, energy source)) THEN SATISFIED( Goal( flee, energy source, age ) ).

IF Goal( flee, energy source, age ) AND Sense( magnetic, energy source) AND NOT(Near( energy source)) THEN SATISFIED( Goal( flee, energy source, age) ).

IF Goal( flee, energy source, age ) AND Sense( magnetic, energy source) AND Near( energy source) THEN FLEE( energy source ).

Plans for the flee (overcrowded perceptual space) goals:
IF Goal( flee, objs, age ) AND NOT(PerceptualSpace( overcrowded, )) THEN SATISFIED( Goal( flee, objs, age ) ).

IF Goal( flee, objs, age ) AND PerceptualSpace( overcrowded, ) THEN FLEE.

Plans for the pursue (a desirable agent) goals:
IF Goal( pursue, agent, age) AND Sense( auditory, agent) AND Near( agent ) AND Accept partnership( agent ) THEN SATISFIED( Goal( pursue, agent, age ) ).

IF Goal( pursue, agent, age) AND Sense( auditory, agent) AND Near( agent ) AND Reject partnership( agent ) THEN UNSATISFIABLE( Goal( pursue, agent, age ) ).

IF Goal( pursue, agent, age) AND Sense( auditory, agent) AND NOT(Near( agent )) THEN MOVETOWARDS( agent ).

IF Goal( pursue, agent, age) AND NOT(Sense( auditory, agent)) THEN WANDER.

The pursue goal is the only one that can be unsatisfactorily resolved. The hunger and flee goals can be satisfied or postponed, but the agent can never resolve them (and remove them from its goal database) if they cannot be satisfied. The pursue goal, if the desired attachment is rejected, is removed from the goal database but is not satisfied; i.e. it is unsatisfactorily resolved.
References

[1] J. Bates, A. B. Loyall, and W. S. Reilly. Broad agents. SIGART Bulletin, 2(4), pages 38-40, August 1991.

[2] L. P. Beaudoin. Goal processing in autonomous agents. PhD thesis, School of Computer Science, The University of Birmingham, 1994.

[3] L. P. Beaudoin and A. Sloman. A study of motive processing and attention. In A. Sloman, D. Hogg, G. Humphreys, D. Partridge, and A. Ramsay, editors, Prospects for Artificial Intelligence, pages 229-238. IOS Press, Amsterdam, 1993.

[4] R. A. Brooks. How to build complete creatures rather than isolated cognitive simulators. In K. VanLehn, editor, Architectures for Intelligence, pages 225-239. LEA Pubs, Hove and London, 1991.

[5] R. A. Brooks. Intelligence without representation. Artificial Intelligence, 47:139-159, 1991.

[6] D. N. Davis, A. Sloman and R. Poli. Simulating agents and their environments. AISB Quarterly, October 1995.

[7] D. N. Davis. Towards a formalism for cognitive agents using the minder scenario. Cognitive Science Report CRSP12-95, School of Computer Science, University of Birmingham, November 1995.

[8] I. A. Ferguson. Integrated control and coordinated behaviour: a case for agent models. In Intelligent Agents: ECAI-94 Workshop on Agent Theories, Architectures and Languages, pages 203-218. Springer-Verlag, 1995.

[9] M. P. Georgeff and A. L. Lansky. Reactive reasoning and planning. In Proceedings of the Sixth National Conference on Artificial Intelligence, volume 2, pages 677-682, Seattle, WA. AAAI, 1987.

[10] B. Hayes-Roth. Intelligent control. Artificial Intelligence, 59:213-220, 1993.

[11] M. L. Minsky. The Society of Mind. William Heinemann Ltd., London, 1987.

[12] A. Newell. Unified Theories of Cognition. Harvard University Press, 1990.

[13] N. J. Nilsson. Teleo-reactive programs for agent control. Journal of Artificial Intelligence Research, 1:139-158, 1994.

[14] T. J. Norman and D. Long. Goal creation in motivated agents. In Intelligent Agents: ECAI-94 Workshop on Agent Theories, Architectures and Languages, pages 277-290. Springer-Verlag, 1995.

[15] H. A. Simon. Motivational and emotional controls of cognition. Reprinted in Models of Thought, pages 29-38. Yale University Press, 1979.

[16] A. Sloman. Robot nursemaid scenario. Grant proposal, 1986.

[17] A. Sloman. The mind as a control system. In C. Hookway and D. Peterson, editors, Philosophy and the Cognitive Sciences, pages 69-110. Cambridge University Press, 1993.

[18] A. Sloman, L. Beaudoin and I. Wright. Computational modeling of motive-management processes. In N. Frijda, editor, Proceedings of the Conference of the International Society for Research in Emotions, Cambridge, July 1994, pages 344-348. ISRE Publications.