
Multi-agent collaboration in time-constrained domains*

Nicholas V. Findler and Uttam K. Sengupta**
Department of Computer Science, Arizona State University, Tempe, Arizona 85287-5406, USA

Timeliness is usually an indispensable attribute of planning and problem solving for resource allocation in command, control and communication (C3) systems. The success of such a system is judged on its ability to respond to scheduled and unscheduled tasks within a permissible time period. The response is based on a plan that covers the following activities: resource allocation, plan execution and monitoring, and dynamic plan mending, if necessary. Decision making for resource selection can become very time consuming when there are many resources and the number of constraints is large. In a changing environment of multiple agents, restrictive organizational structures and strict communication protocols may cause intolerable further delays. Traditional approaches to planning in deterministic environments require a predictable amount of time to produce and execute plans. However, given more time, such systems usually cannot improve on the plans. In this paper we describe a multi-agent resource scheduler which uses a prioritized rule base to model decision making under the constraints of time. We also discuss dynamic scoping as a negotiation technique for inter-agent cooperation and constrained lattice-like communications as an optimized message routing strategy. Finally, we present some empirical results from a sequence of experiments.

Keywords: multi-agent planning, negotiation, time-critical planning, prioritized rule-based planner, dynamic scoping, envelope of effectiveness, constrained lattice-like message routing.

1. INTRODUCTION

This paper describes our effort in modeling the behavior of a set of semiautonomous, collaborating intelligent agents in domains with medium time-criticality, in which the response to given events in the environment should be produced on the order of minutes to hours. A large programming system, SENTINEL, has

* The opinions and assertions contained in this paper are those of the authors and do not necessarily reflect the views of the U.S. Coast Guard.
** Now with the Motorola Government and Systems Technology Group, MSR 3208, 8220 East Roosevelt Street, Scottsdale, Arizona 85282, USA.


been developed which simulates the environment of a hierarchy of decision making entities (nodes). A node has a given area of jurisdiction and is in charge of a given set of resources. These resources are of different types and in various states of readiness and availability. The tasks are usually non-stationary and may move within a single area of jurisdiction or migrate across the boundaries of several jurisdictions. The latter may require interaction and resource sharing among the nodes. Planning for resource allocation and scheduling has to be undertaken in this environment, which is characterized by resource attributes, dynamic environmental effects, and legal, fiscal, organizational and other constraints.

SENTINEL establishes the foundations of a domain-independent, distributed decision-support system. It collects data on resource usage, success rates and other measures related to general performance metrics. It can be used for strategic planning at all levels of the decision-making hierarchy as well as for tactical and reactive planning.

We have designed and implemented a multi-faceted testbed for setting up a variety of experiments in a high-level manner, testing and evaluating different forms of inter-agent cooperation and communication, organizational structures of agents, planning strategies, resource allocation and task distribution methods, etc. In a more general and domain-independent sense, we can study in the testbed how an individual agent can
• cooperate with a selected set of others to achieve a common set of objectives,
• be interconnected with others for optimum resource utilization and effective goal accomplishment (network architecture),
• share appropriately distributed knowledge and handle uncertainty,
• perform as a function of the amount of knowledge and meta-knowledge available to it,
• reconfigure the network in response to a dynamically changing environment.
We introduce in this paper the concept of dynamic scoping, an effective and efficient protocol for negotiation in multi-agent problem solving. When an agent fails to find a resource in its own pool, it initiates communication with its geographically nearest neighbors and requests assistance for the task at hand. If these first-level neighbors fail to locate the needed resources, they in turn communicate with their nearest neighbors. This expansion of scope continues until either a resource is found or a preset limit on the expansion level (the envelope of effectiveness) is reached. The resources are returned to their actual "owners" after the tasks at hand are accomplished.


Time-stressed communication increases the complexity of inter-agent communication and adds a substantial overhead to it in view of the necessary coherence between the agents' worlds. We compare and contrast different message routing strategies among a set of agents positioned in a usual hierarchical structure. Typically, many organizations (military, business, government, etc.) require the message routing to conform to the rigid order of the hierarchy, for reasons such as an unambiguous assignment of responsibility. However, in some domains, a constrained, lattice-like strategy, described later, would allow concerned agents to bypass the hierarchy and interact directly without ignoring the required acts of control and information transmission.

The constraints of time represent an important aspect of resource scheduling and task allocation. To facilitate this process and to prioritize the operating attributes of the environment, we use a utility function composed of two attributes, importance and urgency. The former is constant over time whereas the latter is inversely proportional to the difference between current time and the deadline after which any solution becomes meaningless.

2. ON DISTRIBUTED MULTI-AGENT PLANNING PROBLEM SOLVING

There are problem-solving tasks whose size and certain other characteristics do not allow them to be processed effectively and efficiently by a single computer system.
Such tasks are characterized by one or several of the following properties 1:
• spatially distributed input/output (or sensor/effector) operations,
• a need for extensive communication over long distances,
• time-stressed demands for solution,
• a large degree of functional specialization between quasi-simultaneous processes,
• a need for reliable computation under uncertainty, that is, when relying on incomplete and inconsistent data and knowledge,
• a need for graceful degradation: the results are acceptable (although possibly of lower quality and obtained more slowly) even when some of the computational and communicational facilities have become disabled,
• a limited amount of shared resources must work on common objectives, without competition and conflict of interests.

Advances in computer and communication technology have made studies in Distributed Artificial Intelligence (DAI) possible. The area of our concern, Distributed Planning and Problem Solving Systems is the most active one in


DAI. It is considered as the combination of AI and Distributed Processing methodology. Distributed Processing is characterized by the following:
• several dissimilar tasks are being solved simultaneously by a network of processors,
• the management of common access by the individual processors to a set of resources is the main reason for the interaction between processors,
• resource distribution and conflicts between tasks are hidden from the user,
• from the user's point of view, each task is performed by a system dedicated to that task only.

The underlying philosophy of such networks is cooperation. The allocation of network components to problem solving processes is explicit and deterministic but may change as the environment changes. The interactions between the network components and the decisions made by them can be known to the user if necessary. Each node of the network possesses enough problem solving knowledge to apply its own expertise to its task and to communicate with other nodes. The problem solving strategy of every node is in harmony with that of the others.

3. THE SENTINEL SYSTEM

The primary responsibility of SENTINEL is to perform distributed control over allocating moving resources to moving tasks. Our main focus here is on the role of organizational structures in the overall decision making and the feasibility of resource allocation under the constraints of time.

3.1 Inter-agent communication and message routing

Agents have to communicate with each other at different levels: requesting and providing information, asking for and offering help, ordering/suggesting/accepting solution methods to sub-problems, etc. Although it is true in general that computation is cheaper, less error prone and faster than communication, agents cannot function in a cooperative mode without some basic communication going on. Further, the total knowledge available about the environment and the individual agents' capabilities is too large to be stored by every agent, and the necessary continual updating of such information about the changes only aggravates the situation. A potentially satisfactory but very difficult solution is to assign some well-defined, general and basic "knowledge package" to every agent, which is to be augmented by some node-specific information needed for its operation. Only empirical ad hoc solutions to this problem have been


found for given domains with given characteristics, which of course is neither an effective nor an aesthetically pleasing approach.

Let us consider a typical Command, Control and Communication (C3) environment. A tree can represent a hierarchy of decision-making entities (see Figure 1). Information (e.g., about changes in the environment) usually flows from the lower-level agents upward to the higher-level ones whereas control instructions (e.g., to perform planned actions) follow the opposite direction. On their way, the intervening nodes may modify, augment or reduce the contents of the messages, because higher-level agents need more abstract and summarized information, and increasingly detailed commands need to be given as to how to achieve certain objectives.

[Figure 1: tree diagram. Headquarters at the root; Region 1, Region 2, ... below it; then District 1.1, District 2.1, ...; Group 1.1.1, Group 2.1.1, ...; and Station 1.1.1.1, Station 2.1.1.1, ... at the lowest level. Arrows are labeled "control" and "information".]

Fig. 1 An abstracted version of the U.S. Coast Guard command and control hierarchy

Although such a hierarchical organizational structure is effective and can supply up-to-date information and control on a need-to-know basis, the inflexible and lengthy pathways render the communication processes inefficient and error-prone. We have introduced the notion of a constrained lattice-like communication structure which permits direct interaction between functionally related agents at any level. The 'constraints' refer to domain- and application-specific requirements that the routing strategy must abide by. The two organizational structures co-exist; for example, in our domain, a transfer of temporary control over resources can be negotiated between the relevant agents directly while higher-level authorities will learn about such decisions and can also modify or completely reject their implementation.


3.2 Inter-agent cooperation and resource sharing

We now present the concept of dynamic scoping as a mechanism for inter-agent negotiation in the course of resource sharing (see Figure 2). When an agent does not have the resources of the appropriate type and in sufficient number to take care of the task within the desired time period, it contacts adjacent nodes and gives them the specification of the task at hand. (Adjacency in our domain refers to neighbor locations along the one-dimensional coast line. In other domains, two- or three-dimensional adjacency must be considered; for example, in ground-based air defense and in air-to-air combat, respectively.) Several possible scenarios may then arise:
(i) An adjacent node has all or some of the needed resources available for the new task. It transfers the control of these to the requesting node on a temporary basis.
(ii) The node has the resources but they are engaged in tasks that have a lower priority than the task at hand. These resources are pre-empted, again on a temporary basis, and loaned to the requesting node.
(iii) The adjacent nodes jointly cannot provide all the needed resources. They must then communicate with their adjacent nodes, farther away from the first node with the original task, and ask for their help.
This process goes on until enough resources are re-assigned to the requesting agent. In turn, as tasks become accomplished, resources from other nodes return to their original "owner", hence the name dynamic scoping.

[Figure 2: a coast line divided into adjacent areas of jurisdiction belonging to Node N-2, Node N-1, Node N, Node N+1 and Node N+2, with Task X marked.]

Fig. 2 A hypothetical coast line to illustrate the concept of dynamic scoping in the one-dimensional case.
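The scenarios above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the SENTINEL implementation: the names `Node`, `lend`, `dynamic_scoping` and `make_coast` are hypothetical, a single resource type is assumed, each busy resource is represented only by the priority of the task holding it, and `max_level` stands in for the preset limit on the expansion of scope mentioned in the Introduction.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    free: int = 0                              # idle resources (one type assumed)
    busy: list = field(default_factory=list)   # priority of the task holding each busy resource


def lend(node, wanted, task_priority):
    """Take up to `wanted` resources from `node`.

    Idle resources go first (scenario i); then resources are pre-empted
    from tasks of lower priority than the new task (scenario ii).
    """
    idle = min(node.free, wanted)
    node.free -= idle
    wanted -= idle
    lower = sorted(p for p in node.busy if p < task_priority)[:wanted]
    for p in lower:
        node.busy.remove(p)
    return idle + len(lower)


def dynamic_scoping(nodes, origin, needed, task_priority, max_level):
    """Widen the scope around nodes[origin] one level at a time (scenario iii).

    Returns (loans, still_missing, level_reached), where `loans` maps a
    node name to the number of resources it contributed.
    """
    loans = {}
    got = lend(nodes[origin], needed, task_priority)   # own pool first
    if got:
        loans[nodes[origin].name] = got
    remaining = needed - got
    level = 0
    while remaining > 0 and level < max_level:
        level += 1
        for idx in (origin - level, origin + level):   # neighbors on both sides of the coast
            if 0 <= idx < len(nodes) and remaining > 0:
                got = lend(nodes[idx], remaining, task_priority)
                if got:
                    loans[nodes[idx].name] = got
                    remaining -= got
    return loans, remaining, level


def make_coast():
    """Hypothetical five-node coast line in the spirit of Figure 2."""
    return [Node("N-2", free=2), Node("N-1", busy=[1]), Node("N", free=1),
            Node("N+1"), Node("N+2", free=1)]
```

With this made-up coast line, `dynamic_scoping(make_coast(), origin=2, needed=4, task_priority=5, max_level=2)` returns `({'N': 1, 'N-1': 1, 'N-2': 2}, 0, 2)`: Node N uses its own idle resource, Node N-1 gives up a resource held by a lower-priority task, and the second expansion level reaches Node N-2. With `max_level=1` the same request ends two resources short. Returning the loaned resources to their owners once the task is done is left out of the sketch.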


We also define the level of expansion to refer to the number of nodes involved in the resource assignment process. Level-1 means that a node asks and may get resources from the directly adjacent nodes "on loan", whereas Level-2 means that additional, once-removed nodes' resources are also involved in a multi-phase dynamic scoping process. This process goes on iteratively until either all needed resources have been acquired or a so-called envelope of effectiveness has been reached. The latter is defined by the fact that the time and expense necessary for reaching the task from it can no longer be justified by the benefit obtained from the accomplished objective; in other words, when the trade-off between the overall cost and the expected success of the operation is no longer favorable.

We note that a central control mechanism has, in principle, the advantage over a distributed regime in that its resource utilization is more efficient as it can assign its resources in a more balanced manner. It often suffers, however, from a lack of robustness, high vulnerability, sluggish responsiveness, inadequate computational power, obsolete knowledge base and unsatisfactory compromises. We show that dynamic scoping extends the efficiency of centrally controlled systems to the distributed ones without adversely affecting the superior qualities of the latter.

3.3 Resource scheduling under time constraints

Computation in the world of limited resources is often associated with cost/benefit trade-offs. A trade-off is defined as a relationship between two attributes of utility (for example, the immediacy and the precision of a computational result), each having a positive influence on the perceived total value of computer performance while adversely affecting the level attainable by the other 2.
Based on such a need, we define a measure that allows the proper ordering of operating attributes and constraints used in decision making,

PRIORITY = IMPORTANCE * URGENCY

where IMPORTANCE is a measure which describes the relative static importance at a given knowledge level of an attribute in the decision making process, and URGENCY characterizes its gradually changing (usually increasing) relative importance on the time scale. The constraints are arranged according to the hierarchy of priorities so that more and more details are taken into account with each time-slice and, correspondingly, more and more knowledge is used in the inference mechanism. A time-slice is the minimum time required for performing a single unit of meaningful decision making (for example, a single inference cycle). The ordering of constraints on the basis of their priority also guarantees that the result of the planning process is as good as time has permitted "so far".

4. A SAMPLE OF RELATED WORK

The task of the Phoenix system is to simulate the control of forest fires by deploying bulldozers, crews, airplanes and other objects 3. In response to a situation, such as a new fire, an appropriate plan skeleton is retrieved from the plan library and placed in the timeline, a temporal Blackboard. State memory stores information, such as weather, resource conditions and sensory input, that helps the cognitive agent select appropriate plans and instantiate the variables of the chosen plan for the current situation. At any time during plan execution, sensory data can trigger reflexive actions. The reaction happens very fast relative to the cycle time of the cognitive component.

In a sense, dynamic scoping appears similar to CNet concerning negotiation for resources from neighboring nodes 4. However, dynamic scoping does not use a "free-for-all" bidding process: requests are based on geographical proximity, and the requesting node resolves any conflicts and over-commitment of resources. Multistage negotiations 5,6 have extended the basic CNet protocol by incorporating an iterative exchange of agents' inferences to evaluate the impact of the agents' local plans on the global plan.

Durfee and Montgomery 7 provide a hierarchical protocol for inter-agent negotiation. For the chosen problem domain, the authors provide a mechanism for a hierarchy of agents to plan actions incrementally with the highest ranked agent planning first. Subordinate agents defer to their superior's plans and make local adjustments. Endless loops of refinement are not expected due to the nature of this hierarchy. In contrast to SENTINEL, negotiation and detailed action plans are produced up front.
In MultiFireboss Phoenix 8, a three-phase negotiation is used by the fireboss in acquiring resources from other neighboring firebosses. Temporal conflicts on resource usage are resolved by delaying the handling of lower priority tasks. This strategy is similar to the one used by SENTINEL where higher priority tasks can pre-empt resources from lower ones.

We note that a central control mechanism has, in principle, the advantage over a distributed regime in that its resource utilization is more efficient as it can assign its resources in a more balanced manner 9. One drawback of this conservative scheme is that the decision making process becomes relatively more time-consuming, especially in emergency cases. A complete relaxation of this strict organizational structure may be counter-productive as the agents' behavior would be difficult to predict. The agents' goals could become unfocused, the communication overheads may become too high, and destructive interference could degrade system response times. We feel that dynamic scoping and constrained lattice-like message routing strategies extend the efficiency of centrally controlled systems to distributed ones without adversely affecting the superior qualities of the latter.

Dean and Wellman 10 have provided the conceptual framework for a class of reactive systems. These systems use algorithms that are characterized by linear improvement of solution quality with time. In other words, such algorithms can be suspended, the current answers checked "anytime", and the procedure continued (or resumed) from where it was interrupted. In SENTINEL, we have taken this concept and prioritized the attributes of the operating environment in the order of their utility. We have then used this in a system with an "anytime" (instead of a suspend/resume) termination feature. We have also provided experimental results of our approaches. The model of time-critical planning suggested here is adequate for unscheduled tasks of low-to-medium priority. However, for high priority tasks (such as a computer-controlled intensive care unit in a hospital), specialized plans need to be pre-compiled and stored in the system for fast retrieval and utilization.

Some reactive planning systems use strategies that guarantee the search process used for planning is swapped out so that other processes are given a chance to react to real-time inputs 11. In one such system, Kaelbling presents an architecture and a programming language, called REX, for intelligent reactive systems, such as dynamic path planning for robots. Other systems use the strategy of delaying commitment to a particular plan till the very end or revise an initial plan reactively during execution 12,13. A partial planning mechanism may be extended with monitoring and execution information. SENTINEL uses some domain-specific plan revision techniques for certain instances of plan failure during execution.

Sycara et al.
14 present the case for a multi-agent resource allocation system in time-constrained applications, such as job-shop scheduling. The authors suggest a heterarchical arrangement of the agents with the use of sophisticated local control for achieving overall system goals. To handle time-critical planning, Michalski and Winston 15 proposed a variable precision system that uses "censored" production rules. The censors were categorized as low-likelihood assertions. If there is insufficient time or other resources to determine the censors' truth values, the censors are ignored and the decision is still considered true with high likelihood. Algorithms employing variable look-ahead and fast evaluation functions also fall in the category of "anytime" methods 16.


Finally, we note our work on Air Traffic Control 17,18,19. The versatile testbed made it possible to experiment with different architectures and solution strategies. The shallow look-ahead simulation of variable depth used for plan generation was interleaved with plan execution and monitoring.

5. DOMAIN DESCRIPTION

Some of the C3 operations of the U.S. Coast Guard (USCG) provide an ideal environment to test the SENTINEL system. The two main responsibilities of the USCG are to perform Search and Rescue (SAR) and Enforcement of Laws and Treaties (ELT) missions. The USCG divides its maritime jurisdiction into a hierarchically organized set of subdivisions called areas, districts, groups and stations (see Figure 1), which follows the allocation of responsibilities.

SAR and ELT cases represent unscheduled tasks of medium-to-high priority and require reactive resource scheduling. The role of a decision maker at a C3 center is to determine how to allocate the resources under its jurisdiction so that all tasks are attended to as soon as possible and in the order of their merit. (Here we have used the generic term merit to reference importance, urgency, practicality, etc.) There are various constraints to be satisfied due to scheduled and unscheduled vehicle maintenance, crew endurance, different degrees of resource and fuel availability, weather and sea conditions, vehicle capabilities, etc. All this leads to an information overload for human decision-making in a time-stressed environment.

ELT tasks become known when a vessel is observed that either behaves "suspiciously" or is on a lookout-list (generated by intelligence reports). ELT cases include drug and migrant traffic, and violations of fishing and other maritime treaties. SAR cases are generated by incoming distress signals and/or reports of overdue vessels. The types of USCG resources are boats of various types and sizes, and aircraft (both fixed-wing and helicopters).
Normally, most of these resources are engaged in patrolling both coastal waters and the high seas.

A plan consists of a resource or resource-mix selection and an ordered set of actions or mission steps (e.g., interdicting, boarding, searching, etc.) to be carried out at designated locations. After the selected resource reaches the interdiction point, it performs an on-site appraisal (this may involve boarding and examining the cargo with ELT cases, and checking the need for medical or other help with SAR cases). Most cases require either airlifting, towing or escorting to the nearest USCG facilities; these actions are collectively referred to as the hand-off phase. SENTINEL performs reactive planning by selecting the primary resources for interception and the secondary resources to complete the task of hand-off (escorting, towing, etc.).


6. OVERVIEW OF THE SENTINEL ARCHITECTURE

The SENTINEL system comprises two major components: the Scenario Generator and the Planning Agents, as shown in Figure 3. The Scenario Generator module comprises three sub-modules: the Simulation Model, the Task Analyzer and Distributor, and the Scenario Generator Message Handler. The Simulation Model reproduces the task environment and handles details of resource movement and performance, task generation and movement, and dynamic weather conditions (sea states and visibility). The Task Analyzer and Distributor accepts a scenario from the Simulation Model and performs an initial evaluation to select a suitable site for resource planning. The task and site selected are transmitted to a Planning Agent by the Scenario Generator Message Handler.

The Planning Agents model the activities of a node within the distributed domain. The Planning Agents are distributed geographically and functionally in the model to match the organizational structure of a real-life environment. Each Planning Agent contains three modules: the Planning Agent Message Handler, the Plan Generator, and the Communication and Coordination Module. The Planning Agent Message Handler deals with the routing of all messages, both incoming and outgoing. The Plan Generator is a rule-based planner that models the role of a decision-maker at a site, operating under the constraints of time. The Communication and Coordination Module is invoked for the temporary acquisition of resources from neighboring sites.

6.1 Modeling the Environment

The Simulation Model is a discrete-event simulator that models the task environment and the operational procedures of the USCG. It is an integral part of the Scenario Generator subsystem and interacts with the Planning Agent module to provide it with the environment in which the USCG resources carry out their tasks.
The Simulation Model handles the following facets of the task environment:
• Traffic generation and movement – Traffic includes legal as well as illegal sea-borne or air-borne traffic in the region. The vessel characteristics include size, type, suspicion level (if any), violation type (if any), flag, destination, location, and speed.
• Resources, their activities and movement – The USCG resources belong to sea-going and air-borne vessel categories. Some of the attributes used in modeling resources are type, operational capabilities (such as speed and endurance), and patrol routes. Dynamically changing attributes of the vessels (such as current mission, location, speed, route, fuel level and crew mission times) are also tracked. The ELT task components include patrolling, spotting, intercepting, boarding, searching, and towing.


• Sea and visibility state changes – Weather needs to be modeled as it affects wave height and visibility. Wave height influences boat speed, while visibility has an effect on air-borne vessels and on the handling of SAR tasks. These phenomena are modeled through Markov processes 20.

[Figure 3: block diagram. Box labels: Scenario Generator (containing the Simulation Model, the Task Analyzer and Distributor, and a Message Handler) and three Planning Agents (one expanded to show its Message Handler, Plan Generator, and Communication and Coordination Module). Arrow labels: Task; Task and Planning Agent Specifications; The "Best" Feasible Plan; Resource Allocation Plan; To Planning Agents; To Other Planning Agents.]

Fig. 3 A high-level view of the SENTINEL system modules.


• Accounting for unavailable resources – Scheduled maintenance operations on boats and aircraft take resources away from the ELT cases. SAR cases may also cause pre-emption of resources from ELT.

The primary function of the Task Analyzer and Distributor is to evaluate an incident reported by the Simulation Model and to select a suitable site for appropriate action. The task and the associated world information concerning a particular site are converted to a message. The Scenario Generator Message Handler then transmits this message directly to the site selected by the Task Analyzer and Distributor, and waits for a response containing details of resource plans, interdiction paths, etc.

The Simulation Model provides the environment in which the resource plans are executed. Once a decision is made (e.g., concerning an interdiction) on the Planning Agent side, the Simulation Model simulates and monitors the execution of the mission with the assigned resources and designated targets. It also collects and prints statistics (interdiction rate, time spent in missions, etc.). Global and detailed individual statistics (for example, by each vessel) are generated periodically during the course of a Simulation Model run.

6.2 Resource Selection and Allocation

The Planning Agent Message Handler receives task specifications as a message from the Scenario Generator or other Planning Agents. Information is maintained at each node about the Planning Agent hierarchy. This includes the Planning Agents' command level (e.g., Group, District, Area), the names of the hardware nodes on which they reside, designated communication channels and the types of resources available. From the incoming message, details of the suspect (e.g., its profile and current activities), current location and activities of the USCG vessels, and weather conditions are extracted. The knowledge-based Plan Generator analyzes the information and selects a suitable resource.
The responses generated by the Planning Agent are composed into appropriate messages and transmitted back to the Scenario Generator (or the requesting Planning Agents). The Plan Generator uses a three-phase planning process for resource selection. The first, requirement phase is used to select the resource types needed. In the second phase, resource types are mapped onto resource instances. However, due to a potentially large set of resource instances, a series of heuristics are used to filter out less likely cases. In the final, refinement phase, each of the resource instances filtered from the preceding stage undergo an in-depth quantitative evaluation. These measurements

14

include the exact distance of a given USCG resource from the task (invoking the Path Finder process), the current status of USCG resources, fuel levels and crew endurance. The Communication and Coordination Module is invoked to model interagent communication. When the site selected for resource planning fails to find a suitable resource plan, it attempts to "borrow" a resource from a neighboring site, based on dynamic scoping. The flexible experimental environment permits the choice of various inter-agent communication protocols. In particular, two schemes have been evaluated: one protocol adheres strictly to the hierarchical chain of command while the other uses a more flexible discipline, the constrained lattice-like routing strategy. Dynamic scoping is performed at the Group level of the Planning Agents. A Group, being unable to generate a resource plan for lack of sufficient resources, would communicate with its two nearest neighbors to acquire the needed resources from them on "loan". An example of such activity is given using Figure 4. Assume Group 4 interacts with its neighbors, Groups 3 and 5. In a strict hierarchical regimen, messages have to be routed along the proper chain of command — in our case, from Group 4 to Group 3 through their common District, District-8; and from Group 4 to Group 5 through District8, the Area operational center, and District-7. In the lattice-like scheme, such communications are direct and can bypass all intermediaries. (The nodes bypassed are, however, notified of the decisions made locally at the Group level to satisfy the information needs of the nodes traversed in the traditional manner of communication.) Area

District-8

1

2

3

District-7

4

Groups

5

6

7

8

9

Stations

Fig. 4 Configuration of C 3 sites used in the experiments.
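The difference between the two routing disciplines in the Group 4 example can be made concrete with a small sketch. The `PARENT` table below is a hypothetical fragment of the hierarchy containing only the nodes named in the text; the function names are ours, and the hop counts are illustrative rather than measured.

```python
# Hypothetical fragment of the Fig. 4 hierarchy: only nodes named in the text.
PARENT = {
    "District-8": "Area",
    "District-7": "Area",
    "Group-3": "District-8",
    "Group-4": "District-8",
    "Group-5": "District-7",
}


def chain_to_root(node):
    """Return [node, its superior, ..., root]."""
    chain = [node]
    while chain[-1] in PARENT:
        chain.append(PARENT[chain[-1]])
    return chain


def hierarchical_route(src, dst):
    """Route through the lowest common superior (strict chain of command)."""
    up = chain_to_root(src)
    down = chain_to_root(dst)
    common = next(n for n in up if n in down)
    return up[:up.index(common) + 1] + down[:down.index(common)][::-1]


def lattice_route(src, dst):
    """Send directly; return the route and the bypassed nodes to notify."""
    bypassed = hierarchical_route(src, dst)[1:-1]
    return [src, dst], bypassed
```

With this fragment, a hierarchical message from Group 4 to Group 5 passes through three intermediaries, whereas the lattice-like message is delivered in one hop and the three bypassed nodes are only notified.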

6.3 Modeling Time–Criticality

The Plan Generator is able to model planning under time constraints. To facilitate this, the planning process must be interruptible at any relevant point. We consider "interrupts" in terms of time slices allocated to the various stages of the planning process. This allocation is governed by the task's PRIORITY, which is composed of its IMPORTANCE and URGENCY. IMPORTANCE is static and is based on the task's suspicion type and level (i.e., its status on the look-out list). URGENCY varies over time and is determined by the proximity of the suspect to its estimated final destination. This proximity is expressed as the difference between the suspect's estimated time of arrival (ETA) and the current time, which in turn determines the URGENCY value.

7. THE EXPERIMENTAL SETUP

7.1 Test Environment

The USCG 7th and 8th Districts were selected as the test environment. These comprise the waters around Florida, Alabama, Mississippi, Louisiana, Texas, Puerto Rico, the Virgin Islands and the Gulf of Mexico. There are several USCG ports and air stations scattered all along the coastline and on the islands (25 in our model). Over fifty resources (boats and aircraft) are stationed at these and participate in the model. Suspect traffic originates south of the three choke points (the sea lanes between Mexico and Cuba, between Cuba and Haiti, and between the Dominican Republic and Puerto Rico), and heads towards U.S. coastal waters or other "customary" high-sea transfer points. This operational environment is affected by changes in the sea and visibility states. The weather attributes influence the launching of resources for patrolling purposes and their assignment to both SAR and ELT tasks.

We have modeled the command and control activities of Group, District and Area commanders only. Although the system is set up with Stations under the command of Groups, the Groups also perform all decision making for the Stations in order to reduce complexity.
7.2 Scenario Generation and Experimental Design

In generating suspect traffic, the following stochastic elements were considered: (1) the inter-arrival time, defined as the interval between the generation of successive suspects; (2) the suspect profile, defined as the combination of suspect attribute values; and (3) the path followed by the suspect. These were obtained from a Summary Enforcement Reports Database. Visibility and sea states are modeled as Markov chains, using distribution data from the National Oceanic and Atmospheric Administration. Data for the resources were gleaned from Jane's Fighting Ships and from various other sources.

The primary performance metric for plan success is the interdiction and boarding rate for the total suspect traffic generated over a sufficiently long time period. Important secondary measures are resource utilization, fuel consumption and equitable load distribution on resources. The statistics generated by SENTINEL can also be used for strategic planning (e.g., resource purchase and positioning, boat schedules and patrol routes) and for measuring the utilization of different resources.

The Scenario Generator and the related Simulation Model were run for a period of 100 simulated days. All the random streams used to model the stochastic elements mentioned above, and all other relevant input parameters, were kept identical for the entire suite of experiments. In other words, we were able to replicate the simulated environment consistently for all the experiments. All results and claims made in this study are based upon these experimental results. The experiments performed fall into the following categories:

• Level-1 dynamic scoping in a Hierarchical Message Routing Environment.
• Level-2 dynamic scoping in a Hierarchical Message Routing Environment.
• Level-1 dynamic scoping in a Lattice-like Message Routing Environment.
• Level-2 dynamic scoping in a Lattice-like Message Routing Environment.
• Level-2 dynamic scoping in a Lattice-like Message Routing Environment and all rules fired for resource selection.
• Level-2 dynamic scoping in a Lattice-like Message Routing Environment and controlled firings of the rules for resource selection.

8. RESULTS AND ANALYSIS

8.1 Dynamic Scoping

The cumulative results for Level-1 and Level-2 dynamic scoping over a 100-day simulation run are shown in Table 1. Figure 5 depicts the number of suspects boarded for these two experiments.
It is clear that Level-2 fares better than Level-1. The results shown here are consistent with the description of dynamic scoping and its expected effect. The success rate of law enforcement activity at the various sites using dynamic scoping is given in Figure 6. Success rate is the ratio between the number of successful expansions (temporary acquisitions of needed resources) and the total number of attempted expansions. Some of the sites are more active than others, which can be explained by the higher density of traffic in these regions.

                    Level-1 Cases                Level-2 Cases                Boarding/Spotting
Days  Traffic  SAR  Boarded  Spotted  Abandoned  Boarded  Spotted  Abandoned  Level-1  Level-2
   0        0    0        0        0          0        0        0          0    0.000    0.000
   5      178    1       39       87          1       46       87          1    0.448    0.529
  10      334    3       61      137          4       72      136          3    0.445    0.529
  15      527    4       91      195          4      119      224          4    0.467    0.531
  20      710    5      115      288          9      145      310         17    0.399    0.468
  25      872    6      148      356         11      172      381         18    0.416    0.451
  30     1045    8      190      445         13      231      482         21    0.427    0.479
  35     1231    9      209      455         16      239      482         21    0.459    0.496
  40     1417   10      250      565         16      259      524         21    0.442    0.494
  45     1578   13      250      567         16      273      537         21    0.441    0.508
  50     1743   13      274      643         16      310      630         22    0.426    0.492
  55     1922   15      281      643         16      310      631         22    0.437    0.491
  60     2083   16      311      715         17      320      660         28    0.435    0.485
  65     2255   18      328      759         17      349      799         29    0.432    0.437
  70     2427   19      340      783         18      372      830         29    0.434    0.448
  75     2605   21      358      812         18      373      830         30    0.441    0.449
  80     2767   24      362      814         18      373      830         30    0.445    0.449
  85     2913   25      386      913         19      390      943         43    0.423    0.413
  90     3127   25      404      933         19      451     1069         43    0.433    0.422
  95     3281   28      433     1051         19      453     1069         43    0.412    0.424
 100     3457   28      451     1079         19      469     1089         43    0.418    0.431

Table 1 Data related to dynamic scoping from a 100-day simulation run.

All the sites exhibit an improvement with Level-2 over Level-1 expansion, except Sites 3 and 5. This is due to the following reasons:

• Site-4 has significantly more tasks during the period of interest than its neighbors 3, 5 and 6. The latter sites cannot, therefore, expect Site-4 to loan resources to them. Consequently, Site-3 and Site-5 suffer with Level-1 scoping, and Site-6 suffers with Level-2 scoping.
• Site-3, being on the "boundary", gets only half the level of effectiveness with both Level-1 and Level-2 scoping. For that reason, Site-5 cannot rely on Site-3 for Level-2 scoping, either.
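The boundary effect described above follows directly from how the scope of a site grows with the expansion level. The sketch below assumes, for illustration only, that sites form a linear chain of neighbors and that Level-k scoping reaches sites up to k hops away; the chain 3-4-5-6 is inferred from the discussion of Sites 3 to 6, and `scope` and `try_expansion` are hypothetical names.

```python
def scope(chain, site, level):
    """Sites within `level` hops of `site` along a linear chain of neighbors."""
    i = chain.index(site)
    lo, hi = max(0, i - level), min(len(chain) - 1, i + level)
    return [s for s in chain[lo:hi + 1] if s != site]


def try_expansion(chain, site, level, spare):
    """Return the nearest site in scope that has a spare resource, else None."""
    i = chain.index(site)
    for lender in sorted(scope(chain, site, level),
                         key=lambda s: abs(chain.index(s) - i)):
        if spare.get(lender, 0) > 0:
            return lender
    return None
```

For the chain `[3, 4, 5, 6]`, the interior Site-4 has two Level-1 neighbors, while the boundary Site-3 has only one, which is the "half the level of effectiveness" noted in the text. A busy neighbor with no spare resources (such as Site-4) contributes nothing to the scope of the sites around it.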

[Figure 5: plot of Suspects Boarded (0 to 500) against Days (0 to 100) for Level-1 and Level-2.]

Fig. 5 Number of suspects boarded with Level-1 and Level-2 dynamic scoping.

This explains the anomalous behavior of Sites 3 and 5, respectively. It should be noted that the peculiar behavior of a boundary node is independent of the number of agents in the system.

In Figure 7, the failure rates with Level-1 and Level-2 dynamic scoping are depicted. Failure rate is the ratio between the number of suspects abandoned due to the suspects' evasive tactics and/or bad weather conditions, on the one hand, and the total number of suspects spotted, on the other. When resources are acquired for interdiction purposes from a distant site, the suspect is usually out of the radar spotting range of the resource. As a result, the resource is unable to continue with the interdiction when the suspect uses evasive tactics. So, the farther we go to acquire a resource, the more likely it is that the case will be abandoned.

8.2 Comparative Analysis of Message Routing Protocols

Table 2 captures the results for Level-2 dynamic scoping under each of the two message routing protocols. Figure 8 depicts the relative performance of the two communication routing strategies, the constrained lattice-like and the hierarchical scheme. The differences between the two are more marked in Level-2 expansion. Such a result is expected, as communications have to be carried out between sites that are farther away.

8.3 Time–Critical Planning

As noted before, we have modeled time-criticality by assigning a PRIORITY to every incoming task. It is calculated by considering task URGENCY and IMPORTANCE. The URGENCY factor is based upon the degree of "time-stress" of the task, while the IMPORTANCE depends on its nature. We have continuously updated the URGENCY measure of incoming tasks to account for the real-life role of the changes in the environment. The effects of the changing PRIORITY values are studied in experiments carried out over 100 simulated days. Two sets of experiments were designed to test the effects of the prioritized rule base and the controlled inference mechanism. In the first set, all the rules were allowed to fire, i.e., the planning system was given an indefinitely long period of time to complete the planning process. In the second set of experiments, each suspect was given a PRIORITY based upon its IMPORTANCE (suspicion type and level) and URGENCY (time needed to reach the task, otherwise the suspect escapes). The results of the two experiments are shown in Table 3.
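A minimal sketch of the two ingredients may help fix ideas: a PRIORITY computed from IMPORTANCE and URGENCY, and a prioritized rule base whose firing can be cut off by a time budget. The text does not give the actual formulas, so the linear urgency scale, the 48-hour horizon, the weighted blend and the abstract cost units below are all assumptions, as are the function names. How a PRIORITY value is translated into a time budget is likewise left open here.

```python
def urgency(eta, now, horizon=48.0):
    """Map time remaining until ETA (hours) to [0, 1]; 1 is most urgent."""
    remaining = max(eta - now, 0.0)
    return 1.0 - min(remaining / horizon, 1.0)


def priority(importance, urg, w_importance=0.5):
    """Hypothetical weighted blend of IMPORTANCE and URGENCY, both in [0, 1]."""
    return w_importance * importance + (1.0 - w_importance) * urg


def fire_rules(rules, candidates, budget):
    """Fire rules in priority order until the time budget would be exceeded.

    `rules` is a list of (cost, rule) pairs, most important first. Each rule
    maps a candidate list to a refined candidate list, so planning can be
    interrupted after any rule and still return a usable answer.
    """
    spent = 0.0
    for cost, rule in rules:
        if spent + cost > budget:
            break  # interrupted: keep the best answer found so far
        candidates = rule(candidates)
        spent += cost
    return candidates, spent
```

Calling `fire_rules` with `budget=float("inf")` corresponds to the first set of experiments, in which all rules fire; a finite budget corresponds to controlled firing, which returns sooner but may leave less suitable candidates in the result.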

[Figure 6: plot of Success Rate (0.0 to 1.2) by Site (3, 4, 5, 6, 8, 9, 11) for Level-1 and Level-2.]

Fig. 6 A comparison of the number of successful dynamic scopings from the various sites for the two levels.

[Figure 7: plot of Failure Rate (0.00 to 0.12) against Days (0 to 120) for Level-1 and Level-2.]

Fig. 7 Failure rate versus time: interdictions started but abandoned due to suspect evasion and weather conditions for Level-1 and Level-2 dynamic scoping. Failures are more frequent when resources are acquired from a site that is farther away. The data used in this graph are from Table 1.

                         Lattice-Like Time   Hierarchical Time
Site  Attempts  Success     Total   Average     Total   Average
   3       164      103     69.19     0.628     77.53     0.753
   4       262       87     85.70     0.332    111.27     1.279
   5        88       83    106.13     0.943    126.96     1.530
   6       138      131    207.87     0.949    264.92     2.022
   8       836      771    971.17     0.922   1262.69     1.638
   9       306      303    525.19     0.990    566.08     1.868
  11       130      114    109.50     0.877    129.83     1.139

Table 2 Inter-site communications for Level-2 dynamic scoping. Time is in seconds.
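The relationships among the columns of Table 2 can be checked with a few lines of arithmetic. The sketch below is our own consistency check, not part of the original study. Dividing each Hierarchical total by the number of successful expansions reproduces the printed Hierarchical averages. The values printed under the Lattice-Like Average heading, however, coincide to three decimals with Success divided by Attempts, so they may be success ratios rather than times; applying the same per-success division to the Lattice-Like totals is therefore an assumption about how that column was meant to be defined.

```python
# Values transcribed from Table 2 (times in seconds).
SITES = [3, 4, 5, 6, 8, 9, 11]
ATTEMPTS = [164, 262, 88, 138, 836, 306, 130]
SUCCESS = [103, 87, 83, 131, 771, 303, 114]
LATTICE_TOTAL = [69.19, 85.70, 106.13, 207.87, 971.17, 525.19, 109.50]
HIER_TOTAL = [77.53, 111.27, 126.96, 264.92, 1262.69, 566.08, 129.83]


def per_success(totals):
    """Total communication time divided by the number of successes."""
    return [round(t / s, 3) for t, s in zip(totals, SUCCESS)]


# Success ratio as defined for Figure 6: successes over attempts.
success_rate = [round(s / a, 3) for s, a in zip(SUCCESS, ATTEMPTS)]

# Matches the printed Hierarchical "Average" column.
hier_avg = per_success(HIER_TOTAL)

# Assumed definition: the same ratio applied to the Lattice-Like totals.
lattice_avg = per_success(LATTICE_TOTAL)
```

Whatever the intended definition of the average, the total communication time is lower under the lattice-like scheme at every site listed.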

[Figure 8: plot of Average Time (0 to 3) by Site (3, 4, 5, 6, 8, 9, 11) for the Lattice-Like and Hierarchical schemes.]

Fig. 8 A comparison of the strictly hierarchical and the constrained lattice-like organizational structures. This comparison is based on the average time (in seconds) taken for a successful Level-2 expansion.

The average time taken for a single boarding is presented in Figure 9. It is clear that when the planner is given an indefinitely long period of time (i.e., all the rules are allowed to fire), the average time taken for a single boarding is higher than when the time-critical inference mechanism is used. However, this improvement in decision time comes at the expense of poorer decisions with regard to resource selection: more cases have to be abandoned due to the evasive tactics of the suspect and/or when weather conditions become prohibitively bad. Figure 10 shows the difference in failure rates between the two series of experiments. Failure rate is the ratio between the number of suspects abandoned (lost) and the total number of suspects spotted. The difference points to the fact that a time-critical inference mechanism reduces the decision-making time necessary, but the resources selected may not always be suitable. The difference in the total number of cases boarded over a 100-day run is small (7 in about 500), which indicates the relative success of the time-critical mechanism.

               Experiment #1             Experiment #2             Time to Board   Failure Rate
Days  Traffic  Spot  Board  Lost   Time  Spot  Board  Lost   Time  Expt#1  Expt#2  Expt#1  Expt#2
   0        0     0      0     0   0.00     0      0     0   0.00    0.00    0.00    0.00    0.00
   5      178    80     33     1   0.69    80     33     1   0.27   1.255   0.491   0.013   0.013
  10      334   120     60     2   0.94   120     61     2   0.52   0.940   0.511   0.017   0.017
  15      527   190     86     3   1.85   191     88     3   1.41   1.291   0.961   0.016   0.016
  20      710   260    104     3   2.66   263    123    13   2.24   1.535   1.093   0.012   0.049
  25      872   316    132     6   3.08   356    168    16    3.2   1.400   1.143   0.019   0.045
  30     1045   412    173     9    3.9   399    199    16   3.94   1.353   1.188   0.022   0.040
  35     1231   475    208    11    4.8   491    220    17   4.38   1.385   1.195   0.023   0.035
  40     1417   503    235    12   5.64   527    239    17   5.14   1.440   1.290   0.024   0.032
  45     1578   547    246    14   6.04   573    259    22   5.51   1.473   1.276   0.026   0.038
  50     1743   557    253    15   6.79   631    288    27   6.35   1.610   1.323   0.027   0.043
  55     1922   620    298    16   7.73   637    294    27   7.06   1.556   1.441   0.026   0.042
  60     2083   654    299    16   8.11   653    311    27   7.43   1.627   1.433   0.024   0.041
  65     2255   744    347    19   9.11   771    335    28   8.35   1.575   1.496   0.026   0.036
  70     2427   799    375    20  10.03   845    378    32   9.31   1.605   1.478   0.025   0.038
  75     2605   902    428    25  11.04   934    417    36  10.32   1.548   1.485   0.028   0.039
  80     2767   951    444    26  11.86  1037    450    37  11.35   1.603   1.513   0.027   0.036
  85     2913  1046    464    38  12.83  1140    490    50  12.95   1.659   1.586   0.036   0.044
  90     3127  1111    498    39  13.85  1158    501    50  13.36   1.669   1.600   0.035   0.043
  95     3281  1133    515    39  14.74  1160    507    51  14.15   1.717   1.675   0.034   0.044
 100     3457  1220    534    40  15.73  1190    527    51  14.99   1.767   1.707   0.033   0.043

Table 3 Data related to time-critical planning from a 100-day simulation run. Lost indicates that the pursuit was abandoned due to suspect evasion or inclement weather.
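The two derived columns of Table 3 can be recomputed from the raw counts, which makes the trade-off between decision time and failure rate easy to inspect. In the sketch below, three rows per experiment are transcribed from the table. The failure rate follows the definition in the text (cases lost over cases spotted). The factor of 60 in `time_to_board` is inferred from the data: it reproduces the printed Time to Board values, which would be consistent with a cumulative Time in hours and the per-boarding minutes reported in Fig. 9, but the table itself does not state the unit of the Time column.

```python
# (day, spotted, boarded, lost, time) rows transcribed from Table 3.
ALL_RULES = [(5, 80, 33, 1, 0.69), (50, 557, 253, 15, 6.79),
             (100, 1220, 534, 40, 15.73)]
CONTROLLED = [(5, 80, 33, 1, 0.27), (50, 631, 288, 27, 6.35),
              (100, 1190, 527, 51, 14.99)]


def time_to_board(time, boarded):
    """Cumulative Time per boarding; the factor 60 is inferred, not stated."""
    return 60.0 * time / boarded


def failure_rate(lost, spotted):
    """Cases lost divided by cases spotted."""
    return lost / spotted


def summarize(rows):
    """Map each day to (time to board, failure rate)."""
    return {day: (time_to_board(t, board), failure_rate(lost, spot))
            for day, spot, board, lost, t in rows}
```

At day 100 the two experiments differ by 7 boardings (534 versus 527), while the controlled firing of rules shows the lower time per boarding and the higher failure rate.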

[Figure 9: plot of Unit Boarding Time (0 to 2) against Days (0 to 120) for All Rules Fired and Controlled Firings.]

Fig. 9 A comparison of the average time taken (in minutes) for a single boarding in the two sets of experiments. The controlled firing of rules leads to quicker decision making. The results were accumulated over 100 simulated days.

[Figure 10: plot of Failure Rate (0.00 to 0.05) against Days (0 to 120) for All Rules Fired and Controlled Firings.]

Fig. 10 A comparison of the failure rates (cases lost/cases spotted) for the two sets of experiments. The results were accumulated over 100 simulated days.

9. CONCLUSIONS AND FUTURE WORK

Dynamic scoping has been shown to be an effective negotiation mechanism for inter-agent cooperation and to lead to increased resource utilization, as measured by a higher number of suspect interdictions. The simulation results also confirmed that the number of interdictions went up when dynamic scoping was extended from Level-1 to Level-2 expansion. Similarly, the lattice-like message routing strategy fared better than the strictly hierarchical arrangement. The results from the studies of time-critical planning indicate the effectiveness of the prioritized rule base and the controlled inference mechanism.

The implementation of the time-critical planning mechanism in the SENTINEL system was not completely satisfactory. The Simulation Model and the Planning Agents operate in lock-step fashion: the Simulation Model momentarily stops for the Planning Agent to provide a resource and then resumes its operation. This implies that the time taken for the actual planning process is ignored. In real life, the world does not stop during the planning process (even if it is only of a very short duration). We intend to fix this problem by slowing the simulation clock to the real-time clock during the planning/scheduling phase. Similarly, the communications between the Simulation Model and the Planning Agents, and among the Planning Agents, should be asynchronous. Such a setup is also ideal for modeling the pre-emption of an ongoing planning process. Also, the Planning Agents should take into account the changes in the environment during the planning phases and the time taken for planning [21].

The concept of the envelope of effectiveness needs to be investigated further. When resources are acquired for interdiction purposes from a distant site (as the result of dynamic scoping), the suspect is usually out of the radar spotting range of the resource. Consequently, the Coast Guard resource is unable to continue the interdiction process when the suspect uses evasive tactics. So, the farther away the site from which the resource is obtained, the higher the likelihood of the case being abandoned. This phenomenon needs to be investigated and analyzed both quantitatively and qualitatively.

Similarly, the tradeoffs of dynamic scoping need to be evaluated further. A quantitative analysis of costs and global success rates would help in deciding whether to pre-empt a local resource or to pre-empt a resource from a neighboring site.

In closing, a final test of SENTINEL would be its use as a distributed decision-support system in an operational environment. In such an arrangement, the simulation model would be replaced by the events of the real world, and real-time inputs would be provided automatically and by humans.

Acknowledgments

This research has been supported in part by the U. S. Coast Guard, Contract Nos. DTCG-88-C-80630 and DTCG39-90-C-E9222, and by Digital Equipment Corporation, External Research Grant Nos. 158 and 774. We are indebted to CACI Products Company for providing SIMSCRIPT II.5. We would like to thank all the members of the Artificial Intelligence Laboratory, particularly Cem Bozsahin, Raphael Malyankar, Qing Ge, and Glen Reece, for their help and support. Ken Klesczewski (USCG R & D Center) was instrumental in developing the first version of the simulation model and provided data for the later versions.

REFERENCES

1. Findler, N. V., Contributions to a Computer-Based Theory of Strategies. Springer-Verlag, Berlin, Heidelberg, New York, 1990.
2. Horvitz, E. J., Reasoning about beliefs and actions under computational resource constraints, in: Uncertainty in Artificial Intelligence 3, eds. L. N. Kanal, T. S. Levitt and J. F. Lemmer, Elsevier Science Publishers B. V. (North-Holland), Amsterdam, The Netherlands, 1989, pp. 301-324.
3. Cohen, P. R., Greenberg, M. L., Hart, D. M. and Howe, A. E., Trial by fire: Understanding the design requirements for agents in complex environments, AI Magazine, Fall 1989, pp. 32-48.
4. Davis, R. and Smith, R. G., Negotiation as a metaphor for distributed problem solving, Artificial Intelligence, 20(1), 1983, pp. 63-109.
5. Lesser, V. R. and Corkill, D. D., Functionally accurate, cooperative distributed systems, IEEE Trans. on Systems, Man and Cybernetics, SMC-11(1), 1981, pp. 81-96.
6. Conry, S. E., Meyer, R. A., and Lesser, V. R., Multistage negotiation in distributed computing, COINS Technical Report 86-67, 1986.
7. Durfee, E. and Montgomery, T. A., A hierarchical protocol for coordinating multiagent behaviors, Proc. of the 1990 National Conf. on Artificial Intelligence, 1990, pp. 86-93.
8. Moehlman, T. and Lesser, V., Cooperative planning and decentralized negotiation in multi-Fireboss Phoenix, DARPA Workshop on Innovative Approaches to Planning, Scheduling and Control, 1990, pp. 144-159.
9. King, J. L., Centralized versus decentralized computing: Organizational considerations and management options, ACM Computing Surveys, 15, No. 4, 1983, pp. 319-349.
10. Dean, T. and Wellman, M. P., Planning and Control. Morgan Kaufmann Publishers, San Mateo, CA, 1991.
11. Kaelbling, L. P., An architecture for intelligent reactive systems, in Proc. of the National Conf. on Artificial Intelligence, 1986.
12. Georgeff, M. P. and Lansky, A. L., Reactive reasoning and planning, Proc. of the National Conf. on Artificial Intelligence, 1987, pp. 677-682.
13. Ambros-Ingerson, J. A. and Steel, S., Integrating planning, execution and monitoring, in Proc. of the National Conf. on Artificial Intelligence, 1988, pp. 83-88.
14. Sycara, K., Roth, S., Sadeh, N. and Fox, M., Managing resource allocation in multi-agent time-constrained domains, DARPA Workshop on Innovative Approaches to Planning, Scheduling and Control, 1990, pp. 240-250.
15. Michalski, R. S. and Winston, P. H., Variable precision logic, Artificial Intelligence, 1986, 29, pp. 121-146.
16. Boddy, M. and Dean, T., Solving time-dependent planning problems, Proc. IJCAI-89, 1989, pp. 979-984.
17. Findler, N. V. and Lo, R., An examination of distributed planning in the world of air traffic control, Journal of Distributed and Parallel Processing, 1986, 3, pp. 411-431.
18. Findler, N. V. and Lo, R., Distributed air traffic control - Part I: Theoretical studies, Journal of Transportation Engineering, 1993, 119, pp. 681-692.
19. Findler, N. V. and Lo, R., Distributed air traffic control - Part II: Explorations in a testbed, Journal of Transportation Engineering, 1993, 119, pp. 693-704.
20. Sengupta, U. K., Bozsahin, H. C., Klesczewski, K., Findler, N. V. and Smith, J., PGMULES: An approach to supplementing simulation models with knowledge-based planning systems, Workshop on Simulation and Artificial Intelligence, National Conference on Artificial Intelligence, Boston, MA, 1990.
21. Pashupathy, A., Analytical Studies on Reactive Resource Allocation, Unpublished M. S. Thesis, Department of Computer Science and Engineering, Arizona State University, 1993.