Can a Composite Agent Be Used to Implement a Recognition-Primed Decision Model?

John A. Sokolowski
Virginia Modeling, Analysis & Simulation Center
Old Dominion University
7000 College Dr.
Suffolk, VA 23435
757-686-6215
[email protected]

Keywords: Decision making, multi-agent system simulation, naturalistic decision theory

ABSTRACT: Legacy constructive military simulation systems such as the Corps Battle Simulation (CBS) and the Joint Theater Level Simulation (JTLS) have simple models of military commanders and their decision-making process. It would be useful to the military community to advance the robustness and the realism of these decision models. To improve human decisions in a simulation we need better models of human decision-making. The Recognition-Primed Decision (RPD) model was developed to explain the mental process that decision makers, especially experienced ones such as senior military commanders, go through to arrive at a decision. Several efforts are ongoing to try to produce a computational model of RPD. These efforts are centered on rule-based, neural network, and fuzzy logic approaches. This paper analyzes a different approach, using a composite agent, to implement the RPD model. A composite agent uses multi-agent system simulation technology to implement various cognitive processes of a single entity or agent. It is this author's contention that a composite agent's decision-making method closely matches that described by the RPD model. This close match is expected to produce a better implementation of the RPD model.

1. Introduction

Legacy constructive military simulation systems such as the Corps Battle Simulation (CBS) and the Joint Theater Level Simulation (JTLS) have simple models of military commanders and their decision-making process. These models do not provide a realistic decision representation or decision result. Consequently, the military's ability to train, to perform analysis, and to conduct warfighting experiments with modeling and simulation has suffered. In their review of human behavior representation in military simulations, Pew and Mavor summed up the existing decision models this way [1]: "First, the decision process is too stereotypical, predictable, rigid, and doctrine limited, so it fails to provide a realistic characterization of the variability, flexibility, and adaptability exhibited by a single entity across many episodes. Variability, flexibility, and adaptability are essential for effective decision making in a military environment.... Second, the decision process in previous models is too uniform, homogeneous, and invariable, so it fails to incorporate the role of such factors as stress, fatigue, experience, aggressiveness, impulsiveness, and attitudes toward risk, which vary widely across entities."

The military community has recognized these shortcomings [2,3] and has made it a priority to improve human behavior representation, as spelled out in the Department of Defense (DoD) Modeling and Simulation Master Plan [4]. One way to achieve a better model is to accurately portray how humans make decisions. That is not to say that another form of decision modeling, one that mathematically produces a decision irrespective of the actual human decision process, is not feasible. However, the focus of this paper is on modeling the actual process.

It is generally accepted that there are two schools of thought that describe how humans make decisions. The first school is encompassed by what is termed classical decision theory. This theory implies that all humans make decisions in a rational manner; they follow prescribed logic to always maximize the payoff of their decision, moderated by values for risk or uncertainty. To account for variability of decisions among humans, von Neumann [5] introduced utility theory, a subset of classical decision theory. Utility theory added a mathematical function, unique to each decision maker and to each decision, to account for a person's tolerance for risk. Thus, when confronted with the same problem, two individuals may make different decisions based on a subjective value of how much risk each is willing to take. However, these decisions are still calculated in a mathematically logical way to maximize payoff.
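To make this concrete, the following minimal Python sketch shows how a utility function can separate two decision makers facing the same gamble. The exponential utility function and the payoff values are illustrative assumptions, not drawn from [5]:

```python
import math

def expected_utility(outcomes, risk_tolerance):
    """Expected utility of a lottery given as (probability, payoff) pairs.

    Uses an exponential utility u(x) = 1 - exp(-x / risk_tolerance), an
    illustrative choice of risk-averse utility function, not one specified
    in the literature cited here.
    """
    return sum(p * (1.0 - math.exp(-x / risk_tolerance)) for p, x in outcomes)

# A risky gamble versus a sure payoff of 40.
gamble = [(0.5, 100.0), (0.5, 0.0)]
sure_thing = [(1.0, 40.0)]

for tolerance in (20.0, 200.0):  # risk-averse vs. nearly risk-neutral
    eu_gamble = expected_utility(gamble, tolerance)
    eu_sure = expected_utility(sure_thing, tolerance)
    choice = "gamble" if eu_gamble > eu_sure else "sure thing"
    print(f"tolerance={tolerance:6.1f}: EU(gamble)={eu_gamble:.3f} "
          f"EU(sure)={eu_sure:.3f} -> choose the {choice}")
```

Both decision makers compute a mathematically rational maximum; only the subjective risk parameter differs, so the risk-averse one takes the sure payoff while the risk-tolerant one takes the gamble.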

As classical decision theory progressed, Kahneman and Tversky noticed that humans did not always make rational decisions [6]. Certain factors caused people to follow courses of action (COA) that did not maximize the expected payoff of their decisions. In part, Tversky and Kahneman explained this behavior as humans having certain biases that tilted their decision preference away from the optimal one [7].

In another set of studies, Klein noticed that humans rarely performed mathematical calculations to arrive at a decision [8]. In all but the simplest problems, one does not know the probability of the outcome for a given choice. Most likely a decision maker does not even know all the possible decision choices. Still, humans make decisions all the time. To arrive at a decision, people rely on their understanding of the problem and their past experiences dealing with similar situations. Researchers have observed many of these decision situations and have tried to understand how an individual actually arrives at a decision. This approach is the second school of thought, generally referred to as the descriptive or behavioral school of decision theory.

Recognition-primed decision making (RPD) is a specific model that embodies descriptive decision theory, i.e., it actually tries to describe the human decision process. Because of the shortcomings of classical decision theory, this author believes that RPD is a better model for capturing a military commander's decision process and instantiating it in a military simulation. It would be advantageous to the military community to produce a computational form of the RPD model to improve upon the existing decision models. This paper explores the use of a composite agent to implement the RPD model in a computational form.

The remainder of this paper is organized as follows. Section 2 describes the RPD model, establishes its validity in the military domain, and surveys the ongoing efforts to implement RPD in a computational form. Section 3 proposes a different method to accomplish RPD implementation and reviews the characteristics of the proposed method.

2. Recognition-Primed Decision Model

RPD is a specific implementation of a descriptive decision theory called Naturalistic Decision Making (NDM). NDM has been formally defined as "...the way people use their experience to make decisions in field settings" [9]. NDM is typically employed by experienced decision makers (those who are knowledgeable in a decision domain) over classical decision methods (where multiple COAs are considered) when decisions are characterized by: ill-structured problems; uncertain dynamic environments; shifting, ill-defined, or competing goals; action/feedback loops; time stress; high stakes; multiple players; and organizational goals and norms [10].

Klein observed decision makers in their natural environments and saw that in the majority of cases, they did not use classical decision theory processes to arrive at their decisions [8]. Instead, they spent a significant amount of time analyzing the decision situation to gain a thorough understanding of it; they explored a single option rather than comparing multiple COAs; and they looked for a satisfactory solution to the problem rather than an optimum solution. As a result, Klein developed the RPD model, which is set apart from classical decision theory in seven ways [11]:

1. RPD focuses on situational assessment rather than on comparing several decision options.
2. RPD describes how people use their experience to arrive at a decision.
3. RPD asserts that an experienced decision maker can identify a satisfactory COA as the first one he considers rather than treating option generation as a random process.
4. RPD relies on satisficing rather than optimizing: finding the first COA that works rather than the optimal one.
5. RPD focuses on sequential evaluation of COAs rather than on the simultaneous comparison of several options.
6. RPD asserts that experienced decision makers use mental simulation to assess a COA rather than comparing the strengths and weaknesses of several COAs.
7. RPD allows the decision maker to be continually prepared to initiate action by committing to a COA being evaluated rather than waiting until all COAs are compared.

[Figure 2.1 appears here: a flowchart of the RPD process. Its nodes trace the loop: experience the situation in a changing context; ask "Is the situation familiar?" (if not, seek more information and reassess the situation); recognize the situation, producing goals, cues, expectancies, and actions 1...n; ask "Are expectancies violated?"; imagine an action; ask "Will it work?" (modify the action if "yes, but," implement it if "yes," or consider the next action if "no").]

Figure 2.1 RPD Model Flowchart [8]


Figure 2.1 depicts the RPD process. It begins as the decision maker experiences the decision situation and draws on his or her experience to recognize it. If the decision maker does not recognize the situation, he or she may seek further information until situational recognition takes place. Once recognition occurs, it produces four byproducts. The decision maker becomes aware of goals that he or she must attain as part of the decision. The decision maker will be aware of cues on which to focus so that information overload does not occur. He or she will know what to expect next as the situation unfolds and the decision is implemented. Finally, the decision maker will know from past experience what actions worked in this type of situation. Unlike in classical decision theory, which showed a decision maker considering multiple COAs, the decision maker in RPD is only looking at one COA at a time. If the primary action under consideration does not quite solve the problem, then the decision maker may employ mental simulation to consider different aspects of the action to modify so that it can achieve a satisfactory result. At some point the decision maker may decide that the primary action will not work. He or she must then choose another action for consideration.
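The control flow of Figure 2.1 can also be sketched in code. The following Python fragment is a minimal illustration only; the Experience schema, the cue-subset matching rule, and the will_work stand-in for mental simulation are assumptions of this sketch rather than part of Klein's model:

```python
from dataclasses import dataclass

@dataclass
class Experience:
    """One stored experience: the cues that identify the situation and the
    actions that worked in it, ordered by typicality (hypothetical schema)."""
    cues: frozenset
    goals: tuple
    expectancies: tuple
    actions: tuple

def rpd_decide(observed_cues, experiences, will_work):
    """Minimal sketch of the Figure 2.1 loop.

    will_work(action) stands in for mental simulation; a real model would
    project the action forward and check it against the expectancies.
    """
    # Recognition: find a past experience whose cues match the situation.
    match = next((e for e in experiences if e.cues <= observed_cues), None)
    if match is None:
        return None  # unfamiliar situation: seek more information, reassess

    # Sequential, satisficing evaluation: consider one action at a time and
    # implement the first one that mental simulation accepts.
    for action in match.actions:
        if will_work(action):
            return action
    return None

# Usage with toy data: a commander recognizing a familiar defensive situation.
exp = Experience(cues=frozenset({"enemy advancing", "high ground nearby"}),
                 goals=("hold terrain",),
                 expectancies=("contact within hours",),
                 actions=("occupy high ground", "delay and withdraw"))
print(rpd_decide({"enemy advancing", "high ground nearby", "night"}, [exp],
                 will_work=lambda action: action == "occupy high ground"))
```

Note that the loop never ranks alternatives against one another; it returns the first action that survives evaluation, which is the satisficing behavior listed in points 4 and 5 above.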

Clearly, senior military commanders are considered experts in their field. They often face problems characterized by the eight factors that predict when a decision maker would use RPD. Indeed, several researchers have shown that military commanders employ RPD in the majority of their military decisions [12,13,14,15,16]. For example, in one study, Kaempf et al. [13] investigated command and control decisions made by the antiair warfare (AAW) team on a U.S. Navy AEGIS cruiser. They found that an RPD strategy was used in 95% of the decision situations the AAW team faced.

From the above, one can see that RPD appears to be an appropriate method for modeling a military commander's decision process. Indeed, several efforts are ongoing to implement RPD in a computational form. Warwick et al. [17] have implemented the recognition part of RPD by encoding a person's experience in a long-term memory (LTM) structure based on Hintzman's multiple-trace memory model [18]. LTM stores a trace of each previous experience. The trace consists of variables used to recognize the situation, the resulting expectancies, and the decision. When a new decision situation is presented, a similarity value is computed and is used to "recognize" closely related experiences stored in LTM. The decisions associated with these experiences form the basis for a new set of actions for the current situation.
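To illustrate the flavor of such a similarity computation, the sketch below follows the general scheme of Hintzman's model [18], in which traces and probes are feature vectors and similarity is a normalized dot product raised to the third power. The feature encoding and the toy data are assumptions for illustration; this is not Warwick et al.'s implementation:

```python
import numpy as np

def trace_activations(probe, traces):
    """Similarity-based recognition in the spirit of MINERVA 2 [18].

    Features are encoded as +1/-1/0 values (0 = unknown). Similarity is a
    normalized dot product; cubing it sharpens the contribution of close
    matches, so near-identical past episodes dominate recall.
    """
    n_features = np.count_nonzero(probe) or 1
    similarities = traces @ probe / n_features
    return similarities ** 3  # activation of each stored trace

# Toy long-term memory: each row is a past decision situation; the feature
# encoding itself is an assumption made for this illustration.
ltm = np.array([[ 1,  1, -1,  0],   # past situation A
                [ 1, -1,  1,  1],   # past situation B
                [-1,  1,  1, -1]])  # past situation C
probe = np.array([1, 1, -1, 0])     # the current, partially known situation

activations = trace_activations(probe, ltm)
best = int(np.argmax(activations))
print(f"activations={np.round(activations, 3)}, most similar trace={best}")
```

The decision stored with the most strongly activated trace (or a blend weighted by activation) would then seed the set of candidate actions for the new situation.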

Liang et al. [19] implemented part of RPD using a neural network structure to represent experience. They collected decision data from twelve subject matter experts and used it to train the neural network to solve a particular land warfare problem. Once trained, the network produced acceptable decisions for about 50% of the problems presented. Robichaud [20] improved on [19] by extending the neural network model with fuzzy variables and fuzzy inference rules. A fuzzy interpretation of the external environment was added since humans internalize their external environment in a personal way. Also, perfect battlefield information is rarely available and must often be interpreted for its meaning. This enhancement added another layer of humanism to this implementation of RPD.

Norling et al. [21] have begun an implementation of RPD using a form of multi-agent system (MAS) simulation called a Belief-Desire-Intent (BDI) agent. Belief represents the agent's situational recognition, desire represents the goals to be achieved, and intent represents the actions to be taken to achieve the goals. Their work has focused on the belief part of the BDI agent by looking at various methods to interpret subtle contexts among decision situations.
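For a sense of how the BDI pattern maps onto RPD, consider the minimal skeleton below. It is a generic illustration only; Norling et al. [21] work within an established BDI framework, and the class and plan structures here are assumptions of this sketch:

```python
class BDIAgent:
    """Generic belief-desire-intention skeleton (a sketch, not the
    implementation of [21])."""

    def __init__(self, plans):
        self.beliefs = set()  # the agent's recognized picture of the world
        self.plans = plans    # desire -> list of (context beliefs, actions)

    def perceive(self, percepts):
        self.beliefs |= set(percepts)

    def deliberate(self, desire):
        """Commit to the first plan whose context matches current beliefs,
        mirroring RPD's satisficing selection of a single COA."""
        for context, actions in self.plans.get(desire, []):
            if context <= self.beliefs:
                return list(actions)  # the adopted intention
        return []

agent = BDIAgent({"hold terrain": [({"enemy advancing"}, ("occupy ridge",)),
                                   (set(), ("patrol",))]})
agent.perceive({"enemy advancing", "night"})
print(agent.deliberate("hold terrain"))  # -> ['occupy ridge']
```

The belief set plays the role of situational recognition, which is exactly the part of the architecture Norling et al. report concentrating on.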

3. A Composite Agent Implementation of RPD

The previous section described the RPD model, its applicability to military decision-making, and several ongoing efforts to implement it in a computational form. This section will provide the author's thoughts on using a composite agent to implement the RPD model.

3.1 Composite Agent Design

Hiles et al. [22] developed the concept of a composite agent (CA) from their work on computer-generated autonomy at the Naval Postgraduate School. A CA is a MAS composed of both symbolic agents, called symbolic constructor agents (SCAs), and reactive agents (RAs). See Figure 3.1. CAs draw on the strengths of these two fundamentally different types of agents to produce a single meta-agent capable of complex behavior. The SCAs sense the external environment, Eouter, and convert it to an internal representation, Einner, in much the same way humans internalize their environment. SCAs also control and filter impressions of Eouter so that the CA is not overwhelmed by unimportant sensory inputs. RAs operate on the SCAs' internal representation of the environment to generate actions for the CA to perform. Each RA represents a specific behavior of the CA. Associated with an RA is a set of one or more goals that are necessary to further the associated behavior. These goals are what drive the CA toward a particular action or decision, much the same as humans are goal-driven.

[Figure 3.1 appears here: the structure of a composite agent.]

Figure 3.1 Composite Agent [22]

Humans rarely have a single goal to achieve. They typically face competing goals when trying to make a decision. For example, a military commander may have the goal of occupying a specific piece of land. Competing with that goal are ones like minimizing the number of casualties and possibly minimizing the damage to the land being occupied. It is the resolution of these competing goals that characterizes the human decision process. CAs have a goal management apparatus that performs a similar function. As described by Hiles et al. [22], "In human decision-making, goals are constantly shifting in priority, based on the person's context and state. Agents can mimic the flexibility and substitution skills of human decision-making through the use of a variable goal management apparatus within the RAs. It is from this goal apparatus where contextually appropriate, intelligent behavior emerges."

Associated with each goal is a set of actions to further that goal. These actions represent the typical decisions that are available to the decision maker based on his or her past experience. Through the interactions of the multiple RAs within the CA, a set of actions emerges as a "satisfactory decision" based on how well those actions are achieving the overall goal. Since a decision situation is most likely dynamic, the CA continues to monitor its cues and expectancies by way of the SCAs and will shift to a different set of actions if the decision situation changes context.

To summarize, SCAs sense the CA's external environment and convert it to an internal representation of the decision situation. RAs then act on this internal representation to choose a set of actions consistent with furthering the goals that the CA has selected. These goals are based on the characteristic behaviors of the CA. Through the interactions of the RAs while pursuing their goals, a complex pattern of behavior emerges that represents decision-making in a descriptive form.
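The division of labor just summarized can be sketched compactly. In the fragment below the class names mirror the SCA/RA terminology of [22], but the weighted-maximum goal arbitration is a simplifying assumption standing in for the CA's richer goal management apparatus:

```python
class SymbolicConstructorAgent:
    """Builds E_inner: senses E_outer and keeps only the cues relevant to
    the agent, filtering out unimportant sensory inputs."""
    def __init__(self, relevant_cues):
        self.relevant_cues = set(relevant_cues)

    def construct(self, e_outer):
        return {k: v for k, v in e_outer.items() if k in self.relevant_cues}

class ReactiveAgent:
    """One behavior: a weighted goal plus a rule proposing an action."""
    def __init__(self, goal, weight, action_for):
        self.goal, self.weight = goal, weight
        self.action_for = action_for  # maps E_inner to a proposed action

    def propose(self, e_inner):
        return (self.weight, self.action_for(e_inner))

class CompositeAgent:
    """Sketch of Figure 3.1: SCAs feed an inner environment to competing RAs."""
    def __init__(self, sca, reactive_agents):
        self.sca, self.ras = sca, reactive_agents

    def decide(self, e_outer):
        e_inner = self.sca.construct(e_outer)
        # Competing goals: here the highest-weighted proposal simply wins; a
        # real goal management apparatus would shift weights with context.
        return max(ra.propose(e_inner) for ra in self.ras)[1]

ca = CompositeAgent(
    SymbolicConstructorAgent({"enemy", "terrain"}),
    [ReactiveAgent("seize objective", 0.7, lambda e: "advance"),
     ReactiveAgent("minimize casualties", 0.5, lambda e: "reconnoiter first")])
print(ca.decide({"enemy": "battalion", "terrain": "ridge", "weather": "rain"}))
```

Even in this reduced form, the decision emerges from the interaction of competing goal-bearing RAs over a filtered inner environment rather than from an explicit comparison of enumerated COAs.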

3.2 CA and RPD Comparison

To answer the question of whether a CA can be used to implement RPD, we must look at the elements that define the RPD model and relate those elements to characteristics within a CA.

First, RPD starts with the decision maker experiencing a situation in a changing context and trying to determine if the situation closely matches past experience. In essence, SCAs perform the same function of situational recognition for the CA. They sense the external environment, produce an internal representation of the situation, and periodically sample the environment to look for a changing context.

When situational recognition occurs, RPD produces four byproducts: goals, cues, expectancies, and actions. Recall that decision makers rely on cues to focus on the important variables of a situation. SCAs accomplish the same task. They control and filter impressions of Eouter so that the CA is not overwhelmed by unimportant sensory inputs.

Goals play a central role in RPD. They help focus the decision maker on the task at hand and help guide the selection of actions to solve a problem. Likewise, a CA is goal driven. Based on the behaviors embedded in the RAs, various goals compete for satisfaction, which shapes the overall response of the CA in the same way that goals influence human response.

The RPD model indicates that a decision maker will focus on one COA at a time to determine if it fits or can be made to fit the current decision situation. He or she does not weigh multiple COAs looking for an optimal solution. Similarly, the CA does not generate multiple COAs either. Instead, it allows a dominant set of actions to emerge from the various competing goals. Based on its programmed experience, the CA selects these actions to produce a "satisficing" solution. It does not try to mathematically optimize it.

A CA handles expectancies by periodically monitoring the changes in its environment caused either by external effects or as a result of its actions. Monitoring allows the CA to determine the effects of its actions and to decide if these effects are expected. Monitoring also allows the CA to adjust its decisions based on dynamic external environmental influences. This is similar to the monitoring process of the RPD model depicted by the various loops in the model.

The final point to address concerns RPD's use of mental simulation to modify a selected COA to fit the specific context of a decision situation. A CA partially accomplishes mental simulation as it performs its goal management process to select the set of actions that it will carry out. However, there is no clear mechanism within the CA to modify its existing experiences to provide a better fitting decision solution. The mental simulation process will most likely need to be enhanced to better replicate the role of mental simulation within RPD.
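A driver loop for the expectancy-monitoring behavior described above might look like the following sketch, which re-invokes a decision cycle only when an expectancy is violated. The loop structure and the dictionary-based expectancy check are illustrative assumptions, not part of the CA framework in [22]:

```python
def monitor_and_act(decide, sample_environment, expectancies, steps):
    """Keep the current action while the situation matches expectations;
    reassess (call decide again) when an observed cue deviates."""
    action = None
    for _ in range(steps):
        e_outer = sample_environment()
        violated = any(e_outer.get(cue) != expected
                       for cue, expected in expectancies.items())
        if action is None or violated:
            action = decide(e_outer)  # shift to a new set of actions
        yield action

# Usage with stand-ins: the enemy cue changes at the third step, violating
# the expectancy and triggering a new decision.
frames = [{"enemy": "static"}, {"enemy": "static"}, {"enemy": "advancing"}]
decisions = monitor_and_act(
    decide=lambda e: "defend" if e["enemy"] == "static" else "counterattack",
    sample_environment=iter(frames).__next__,
    expectancies={"enemy": "static"},
    steps=3)
print(list(decisions))  # -> ['defend', 'defend', 'counterattack']
```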

4. Conclusion

There appears to be significant overlap between the characteristics of the RPD decision process and the intrinsic functioning of a CA. CAs perform situational recognition and use cues to filter environmental inputs. Clearly, CAs are goal driven with goal satisfaction playing a central role in determining the set of actions to be carried out in support of a decision. CAs monitor expectancies to ensure that they can adapt to any changes in the decision situation. CAs focus on determining one set of actions rather than analyzing multiple COAs. These points are the key characteristics of the RPD model. They are embodied by the CA, which makes CA technology a viable choice to implement and mimic the RPD decision process in a computational form. Doing so will improve upon the decision algorithms available for use in many domains.

5. References


[1] Pew, R.W. and Mavor, A.S. Modeling Human and Organizational Behavior: Application to Military Simulations. National Academy Press, Washington, D.C., 1998.
[2] Erwin, S.I. "Simulation of Human Behavior Helps Military Training Models," National Defense, Vol. 85, No. 564, pp. 32-35, 2000.
[3] Erwin, S.I. "Commanders Want Realistic Simulations," National Defense, Vol. 85, No. 567, pp. 50-51, 2001.
[4] Department of Defense. "Modeling and Simulation Master Plan," Government Printing Office, Washington, D.C., 1995.
[5] von Neumann, J. and Morgenstern, O. Theory of Games and Economic Behavior. Princeton University Press, Princeton, NJ, 1953.
[6] Kahneman, D. and Tversky, A. "Prospect Theory: An Analysis of Decision Under Risk," Econometrica, Vol. 47, No. 2, pp. 263-291, 1979.
[7] Tversky, A. and Kahneman, D. "Judgment Under Uncertainty: Heuristics and Biases," Science, Vol. 185, pp. 1124-1131, 1974.
[8] Klein, G. Sources of Power: How People Make Decisions. The MIT Press, Cambridge, MA, 1998.
[9] Zsambok, C.E. "Naturalistic Decision Making: Where Are We Now?," In Naturalistic Decision Making, Zsambok, C.E. and Klein, G., Eds., pp. 3-16, Lawrence Erlbaum Associates, Mahwah, NJ, 1997.

[10] Orasanu, J. and Connolly, T. "The Reinvention of Decision Making," In Decision Making in Action: Models and Methods, Klein, G., Orasanu, J., Calderwood, R., and Zsambok, C.E., Eds., pp. 3-20, Ablex Publishing Corporation, Norwood, NJ, 1993.
[11] Klein, G. "A Recognition-Primed Decision (RPD) Model of Rapid Decision Making," In Decision Making in Action: Models and Methods, Klein, G., Orasanu, J., Calderwood, R., and Zsambok, C.E., Eds., pp. 138-147, Ablex Publishing Corporation, Norwood, NJ, 1993.
[12] Drillings, M. and Serfaty, D. "Naturalistic Decision Making in Command and Control," In Naturalistic Decision Making, Zsambok, C.E. and Klein, G., Eds., pp. 71-80, Lawrence Erlbaum Associates, Mahwah, NJ, 1997.
[13] Kaempf, G.L. et al. "Decision Making in Complex Naval Command-and-Control Environments," Human Factors, Vol. 38, No. 2, pp. 220-231, 1996.
[14] Klein, G. "Strategies of Decision Making," Military Review, Vol. 69, No. 5, pp. 56-64, 1989.
[15] Pascual, R. and Henderson, S. "Evidence of Naturalistic Decision Making in Military Command and Control," In Naturalistic Decision Making, Zsambok, C.E. and Klein, G., Eds., pp. 217-226, Lawrence Erlbaum Associates, Mahwah, NJ, 1997.
[16] Serfaty, D. et al. "The Decision-Making Expertise of Battle Commanders," In Naturalistic Decision Making, Zsambok, C.E. and Klein, G., Eds., pp. 233-246, Lawrence Erlbaum Associates, Mahwah, NJ, 1997.
[17] Warwick, W. et al. "Developing Computational Models of Recognition-Primed Decision Making," In Proceedings of the Tenth Conference on Computer Generated Forces, Norfolk, VA, 15-17 May 2001, pp. 232-331, 2001.
[18] Hintzman, D.L. "MINERVA 2: A Simulation Model of Human Memory," Behavior Research Methods, Instruments, & Computers, Vol. 16, pp. 96-101, 1984.
[19] Liang, Y. et al. "Implementing a Naturalistic Command Agent Design," In Proceedings of the Tenth Conference on Computer Generated Forces, Norfolk, VA, 15-17 May 2001, pp. 379-386, 2001.
[20] Robichaud, F. "Implementing a Naturalistic Command Agent Design," Presented at the Workshop on Computerized Representation of RPD, Boulder, CO, October 23, 2001.
[21] Norling, E., Sonenberg, L., and Rönnquist, R. "Enhancing Multi-Agent Based Simulation with Human-Like Decision Making Strategies," In Multi-Agent Based Simulation: Proceedings of the Second International Workshop, MABS 2000, Moss, S. and Davidsson, P., Eds., pp. 214-228, Springer, Boston, MA, 2000.
[22] Hiles, J. et al. "Innovations in Computer Generated Autonomy," Tech. Report NPS-MV-02-002, Naval Postgraduate School, Monterey, CA, 2001.

Author Biography

JOHN SOKOLOWSKI is a project scientist at Old Dominion University's Virginia Modeling, Analysis & Simulation Center (VMASC). He holds a bachelor's degree in Computer Science from Purdue University, a Master of Engineering Management from Old Dominion University (ODU), and is a PhD candidate in the Modeling and Simulation program at ODU. Prior to coming to VMASC, he spent 27 years in the Navy as a submarine officer, with his final assignment as Head, Modeling and Simulation Division, Joint Warfighting Center, U.S. Joint Forces Command. His research interests include human behavior modeling and multi-agent system simulation.
