A Utility Based Approach to Cooperation among Agents

Guido Boella, Rossana Damiano and Leonardo Lesmo
Dipartimento di Informatica - Università di Torino
email: {guido, rossana, lesmo}@di.unito.it

1 Introduction

Most interactions among agents involve some form of cooperation. In particular, sometimes the goals of an agent cannot be achieved without the help of other agents. If these goals are common to a group of agents (or an agent can convince other agents to help him), then agreement on a shared plan can be reached. Once a group of cooperating agents has formed, some special behaviors arise, concerned with the need to work together toward the common goal. In this paper, we focus on the problem of modeling how being part of a group influences the behavior of a single agent, while we address only partially the process of group formation: in other words, we assume that a group has already formed and see how one agent in the group must operate in order to cooperate properly with the other members of the group. The basic ideas of our proposal are that the members of a group a) consider the advantage of each alternative for the group as a whole when they choose their own plans for acting, and b) adopt their partners' goals if the group increases its gain from this adoption.

1.1 Goals, plans, intentions

In the discussion above, we have used terms like goal, intention and plan. We assume that:

1. every agent has a set of goals;
2. every agent decides which goal(s) to pursue; this decision must be taken on the basis of: (a) the desire of achieving the goal; (b) the cost of achieving the goal;
3. the cost of reaching one or more goals can be assessed in terms of courses of action leading to the achievement of those goals;
4. since the decision on the goals to pursue depends on the solution which optimizes (2.b), the decision includes a choice of the best course of action;
5. the best course of action in (4) is devised on the basis of general knowledge about plans and is expressed in terms of an instantiated plan: this plan constitutes the intention of the agent;
6. intentions persist as long as the agent can gain any advantage from them, i.e. until the associated goal is achieved, reveals itself to be impossible to achieve, or becomes irrelevant [Cohen and Levesque, 1990].

The process of intention formation can be triggered by external events; in fact, they can lead to the formation of new goals, which must be balanced against the existing ones (i.e. step 2 above must be repeated). This general framework applies to both individual plans and shared plans. This paper addresses in particular point (2) above in the context of group activity. We will show that the planning architecture we propose satisfies the requirements put forward by theoretical approaches and, at the same time, adds something concerning the selection of the best course of action, i.e. the process of intention formation. It is generally agreed that formal frameworks act as a specification of a computational system. For instance, Grosz and Kraus say: "The formalization is a specification, not input to a theorem prover" [Grosz and Kraus, 1998], footnote 20, p.19. And Jennings and Wooldridge state that "they can be specifications for future computer systems" [Jennings and Wooldridge, 1999], p.4.
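As an illustration of points 2-5, here is a minimal sketch (not the authors' implementation) of an agent choosing which goal to pursue and which course of action becomes its intention, by balancing the desirability of each goal against the cost of the candidate plans that achieve it. The goals, plans and numbers are invented for the example.

    # Illustrative sketch: goal selection as desirability minus plan cost.
    goals = {
        "have_dinner": 10.0,   # desirability of achieving the goal
        "tidy_room":    4.0,
    }
    plans = {  # candidate courses of action and their costs
        "have_dinner": [("cook", 6.0), ("order_pizza", 3.0)],
        "tidy_room":   [("clean_up", 5.0)],
    }

    def best_option():
        """Return (goal, plan, net utility) maximizing desirability - cost."""
        options = [
            (goal, plan, goals[goal] - cost)
            for goal, candidates in plans.items()
            for plan, cost in candidates
        ]
        return max(options, key=lambda o: o[2])

    goal, plan, net = best_option()
    print(goal, plan, net)   # the chosen plan becomes the agent's intention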

1.2 A decision theoretic approach

Since decisions must be taken, we had to adopt some techniques to balance the different possibilities. [Rao and Georgeff, 1991] is a first attempt to integrate agent theories and decision theory. Another existing system, developed by [Haddawy and Hanks, 1998], provides an interesting solution for the comparison between alternative plans; so, in order to test our approach, we adopted the decision-theoretic approach implemented in DRIPS and we built our planner on top of it. DRIPS [Haddawy and Hanks, 1998], [Haddawy and Suwandi, 1994] is a hierarchical planner which merges some ideas of decision theory with standard planning techniques. It is based on a utility function, which is used to evaluate how promising a given plan is. The utility function is not just the payoff of the possible outcomes of the plan: it is computed starting from simpler utility functions associated with the various goals of the agent, but it also accounts for the partial satisfaction of goals (since actions can fail to achieve a goal completely, and achieve it only partially). Moreover, the consumption of resources is taken into account, by subtracting its value from the overall utility. Time, and in particular deadlines, is also modeled. Notice that the choice to adopt an approach based on utility functions is partially in contrast with Jennings and Wooldridge's position that models based on the notion of utility "... are not computational models and ignore the practicalities of computing an appropriate action to perform" [Jennings and Wooldridge, 1999], p.5. Although DRIPS is based, at least partially, on decision theory, it is in fact a computational model based on the notion of utility, which exploits various techniques to account for probability and uncertainty: these are important factors that tend to be underestimated in logic-based formalizations.
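To illustrate the kind of multi-attribute utility function described here, the sketch below combines partial goal satisfaction, remaining resources and a deadline penalty into a single value. The functional form, attribute names and weights are assumptions made for the example, not the actual DRIPS utility model.

    def utility(state, weights):
        """Combine partial goal satisfaction, resource use and time into one value.
        state: {'goal_satisfaction': 0..1, 'resources_left': float, 'time': float}
        weights: relative importance of each component (illustrative)."""
        u = weights["goal"] * state["goal_satisfaction"]
        u += weights["resources"] * state["resources_left"]
        if state["time"] > weights["deadline"]:      # missing the deadline
            u -= weights["late_penalty"]             # is penalized, not forbidden
        return u

    w = {"goal": 10.0, "resources": 0.5, "deadline": 60.0, "late_penalty": 4.0}
    print(utility({"goal_satisfaction": 0.7, "resources_left": 3.0, "time": 65.0}, w))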

1.3 Multi-agent decisions and communication

However, DRIPS does not account for multi-agent plans. So its basic planning mechanism was exploited to build a planner which is able to take into account the need to cooperate. This was achieved by following the suggestion of [Grosz and Kraus, 1996] that, in defining helpful behavior, one must take into account the advantage that the group attains from adopting the partners' goals. Grosz and Kraus propose that an agent should help one of his partners in the group only if he can attain the goal with a lower cost with respect to his partner. We try to generalize this idea by introducing the notion of a utility function that is shared by the group and that the group exploits to direct its behavior. In particular, the group utility function has an important role in case of helpful behavior: we claim that any member, besides doing his part in the shared plan, will consider adopting the goals of his partners, and will adopt them only if the global utility of the group increases as a result.

In our proposal, communication is just a particular kind of cooperative behavior: if, for instance, a member of the group comes to know that the group's goal has been achieved, he will consider whether to communicate his discovery to the partners; the little overhead in his activity given by communication is compensated by an increase in the utility of the whole group (in fact, the other partners would otherwise waste their resources without any benefit, trying to achieve a goal that is already true); therefore he decides to act. However, it could happen that the communication is estimated to be very expensive or to take a long time; in such a case, the agent could choose not to communicate, since this action would result in a decrease of the global utility. The main advantage of this solution is that we do not need to stipulate conditions for communication which are distinct from general cooperation, so that communication is not compulsory; instead, its opportunity is evaluated depending on the situation.

The paper is organized as follows: in the next section, we introduce the notion of cooperation; in the third section we show how cooperation affects the planning process of a group; in the fourth section, we analyze the impact of reasoning on the actions of other agents; then, we will see which phenomena are covered by the model and an example of application. Finally, we will discuss the relationships with logic-based approaches; a Conclusion section closes the paper.

2 The definition of cooperation

We exploit the idea of [Haddawy and Hanks, 1998] by introducing a utility function that returns the advantage of a given outcome for the group.

Moreover, in our definition of cooperative behavior we also consider the notion of goal adoption developed by [Castelfranchi and Falcone, 1997]: "an agent B performs an action for a given goal, since this action or the goal are included in the plan of another agent A" (see point 4 below).

The cooperation of a group of agents GR composed of agents G1 ... Gn in a shared plan for α¹, with an associated recipe R_x composed of steps β^1_{x,k1} ... β^m_{x,kl}, is defined as²:

1. each agent Gk of the group GR has the single-agent intention to perform his part β^r_{x,k} of the whole shared plan for α, formed on the basis of the recipe R_x;
2. the agents of GR have the mutual belief that each one (Gk) has the intention to perform his part β^r_{x,k} of the shared plan for α;
3. all agents share a utility function GF based on a weighted sum of the utility of the goal which the shared plan aims at and of the resource consumption of the single agents; when the agents plan their own part of the shared plan, they also have to consider this global utility as part of the individual utility function Fk that drives their behavior;
4. when an agent Gi becomes aware that a partner Gj has a goal φ that stems from his intention to do his part β^l_{x,j}, Gi will consider whether to adopt it; if Gi believes that the adoption of a partner's goal produces an increase of the utility GF of the whole group, then he adopts that goal;
5. the group is maintained as long as the value of the utility function GF can be increased by executing the shared plan for α or by adopting some of the goals of the partners.

Notice that it is the termination condition that allows us to predict communication after an agent has discovered that the group succeeded in achieving α: in fact, leaving the group would make his partners waste their resources.

As far as point 4 is concerned, the goals that can potentially be adopted by an agent Gi are those stemming from the intention of a partner Gj to perform an action β^l_{x,j}: therefore, he considers not only the steps he may execute to perform the action and the preconditions he may adopt as subgoals while acting, but also other goals that derive from the planning and reactive execution of β^l_{x,j}. In particular, Gi will consider Gj's goals of knowing how to perform β^l_{x,j}, of monitoring the effects of its execution and, possibly, of replanning. These goals derive from the single-agent intention of Gj to perform β^l_{x,j}.

But the members of the group do not necessarily know which actions the partners are currently performing. In fact, the plan shared by the group is normally a partial one, and the other agents do not know how one will subsequently carry on his own part. Therefore, agents cannot help their partners if they do not know what their current intentions are. However, plan recognition techniques ([Carberry, 1990], [Ardissono et al., 1998]) can play an important role in helping agents to infer what their partners are doing and therefore in improving the cooperation of the group. Moreover, plan recognition is independently motivated by communication needs ([Allen, 1979]).

¹ Notice that α is an action, not a goal; we assume here that the shared plan has a particular structure, i.e. it is a one-level plan composed of a top-level action (α) decomposed into a sequence of steps. In a general plan, each step could in turn have been expanded into substeps, and so on recursively.
² The notation β^j_{x,i} refers to the j-th step of the recipe R_x, a step which has been assigned to agent G_i, but can be contracted out to another agent or adopted by another agent.
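One possible reading of points 3 and 4 in code form, as a rough sketch rather than the actual implementation: GF is a weighted sum of the achievement of the shared goal and of the resources left to each member, and a partner's goal is adopted only if adopting it increases GF. The weights, the state representation and the numbers are illustrative assumptions.

    def GF(state, w_goal=10.0, w_res=1.0):
        """Group utility: weighted sum of shared-goal achievement and of the
        resources left to each member (illustrative form)."""
        return (w_goal * state["shared_goal_achieved"]
                + w_res * sum(state["resources"].values()))

    def adopt(state_if_adopted, state_if_not):
        """Point 4: Gi adopts a partner's goal iff the group utility increases."""
        return GF(state_if_adopted) > GF(state_if_not)

    before = {"shared_goal_achieved": 0.0, "resources": {"G1": 5.0, "G2": 5.0}}
    after  = {"shared_goal_achieved": 1.0, "resources": {"G1": 3.0, "G2": 5.0}}
    print(adopt(after, before))   # True: the goal is worth G1's extra effort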

3 The deliberative agent

In [Ardissono and Boella, 1998], we describe the agent architecture that we assume underlies this work. Here, we just say that the knowledge about how to act is stored in three libraries of actions. For each action, its preconditions, its constraints and its effects are specified. The libraries are organized as generalization/decomposition hierarchies. The generalization hierarchy enforces an inheritance schema; the decomposition of actions is given in terms of recipes: the execution of the subactions (steps) in a recipe constitutes the execution of the action.

An agent has a set of goals; each of them is associated with a "desirability" degree, i.e. a real number specifying how much the overall utility is increased if the goal is achieved. Among these goals are the ones concerning the consumption of resources: their desirability is inversely proportional to the consumption of the resource. Notice that, as the resource case shows, we do not assume any direct connection between goals and desires (at least not in the positively connotated sense of the term "desire"). We rather talk about "positive" and "negative" desirability of a goal, thus subsuming desires and fears under a single concept. Notice that we are claiming here that states have a desirability degree, independently of how they have been reached; in principle, however, the choice of a given course of action doesn't depend only on the reached situation, but also on the effort required to attain it. On the other hand, we simply inherited from DRIPS its way of accounting for the effort required to accomplish the plan: DRIPS updates a set of "resources" which are part of the situation. So, the resulting situation already includes a description of the effort required to accomplish the plan (in terms of remaining resources).

After the planning phase, the (possibly partial) plan is executed in a reactive way, and actions are replanned if they fail. Moreover, if an event that triggers new goals occurs, the agent, exploiting his utility function, considers whether it is possible to devise a new plan including the new objective, in which case it becomes an intention beside the previous ones, or whether it is better to continue with his previous plan, in which case the goal does not become an intention; i.e. he has to consider both the utility of adopting the possible goal as a new intention (and doing something to achieve it) and the consequences of not adopting it, as they result from the utility function.

Since the agent's world is populated by other agents, the consequences of an action may affect the actions performed afterwards by other agents. So, in case of interaction with other agents, an agent has to consider the consequences of his behavior on the behavior of his partners: in order to evaluate the real expected utility of the plan, he looks at the outcomes resulting from the possible reactions of other agents. A similar kind of reasoning can be exploited to explain conversational cooperation: when an agent is requested for cooperation by a partner, he can decide not to accept, but, usually, he still cooperates at the conversational level, by informing his partner about his refusal. The refusal of conversational cooperation is interpreted as an offensive behavior, since a notification is a low-cost action, and its omission could result in a repetition of the request (a rather expensive action). So, no answer means that the utility of the partner has a very low weight in the agent's overall utility, which is a very offensive situation.

Note that this kind of forward looking is useful not only in cooperative settings. Also in case of conflict with another agent, before choosing what to do, it is advisable to consider the adversary's reaction (possibly to prevent his move, as in games). This is covered by our model: the main difference is that, in this case, the utility of the opponent is subtracted (and not summed) when evaluating the overall utility. Certainly, no global utility nor adoption is considered. On the contrary, the agent will consider the fact that his partner will maximize his own utility, and, when he recognizes a goal of his partner, he will also work to make him unable to achieve it.
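As an illustration of this decision, the fragment below sketches how a newly triggered goal becomes an intention only if replanning around it yields a higher utility than the current plan; the data structures and numbers are invented, and expected_utility is a stub standing in for the projection of a plan's outcomes.

    def expected_utility(plan):
        # stub: in the real system this would project the plan's outcomes
        # and apply the agent's utility function Fk
        return plan["utility"]

    def reconsider(current_plan, replanned_with_new_goal):
        """Return the plan to commit to; the new goal becomes an intention
        only if replanning around it pays off."""
        if expected_utility(replanned_with_new_goal) > expected_utility(current_plan):
            return replanned_with_new_goal
        return current_plan

    current  = {"name": "finish_report", "utility": 6.0}
    with_new = {"name": "finish_report_then_answer_phone", "utility": 7.5}
    print(reconsider(current, with_new)["name"])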

4 The planning algorithm

The construction of a full plan is carried out by an agent Gk in a stepwise fashion: if Gk is in charge of step β^m_{x,k} of the shared recipe R_x for α, then he first has to find the best recipe for β^m_{x,k} (let's say R_y, with steps β^m1_{y,k}, β^m2_{y,k}, ..., β^mn_{y,k}), and then he can start refining β^m1_{y,k}; this process terminates when the first subaction is an elementary action, i.e. it is executable. The approach of DRIPS to this process is to expand β^m_{x,k} in all possible ways (i.e. applying to the current state S all existing recipes), and then, using the utility function (applied to the state resulting from the potential execution of the recipe), to choose the best recipe; then, it is possible to proceed onward to expand β^m1_{y,k} (actually, the search goes on in parallel, unless the search tree can be pruned; for the sake of simplicity, we assume that, at each level of decomposition, a single recipe is chosen). So, the utility function acts as a heuristic, able to exclude some possible ways (recipes) to execute an action.
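A compact sketch of this refinement step (not the DRIPS implementation): each abstract step is expanded with every applicable recipe, the projected states are scored with the utility function, and the best recipe is kept. The domain, the greedy selection and all numbers are illustrative assumptions; the actual planner keeps several candidates and prunes them, as discussed in Section 4.3.

    def refine(step, state, recipes, project, utility):
        """Expand `step` with all recipes, project the resulting state,
        and return the recipe with the highest utility (a greedy sketch)."""
        candidates = []
        for recipe in recipes.get(step, []):
            projected = project(state, recipe)       # simulate executing the recipe
            candidates.append((utility(projected), recipe, projected))
        return max(candidates, key=lambda c: c[0]) if candidates else None

    # tiny illustrative domain: the state is the amount of remaining resources
    recipes = {"get_food": [("cook", -6.0), ("order", -3.0)]}
    project = lambda s, r: s + r[1]
    utility = lambda s: s                            # utility = resources left
    print(refine("get_food", 10.0, recipes, project, utility))  # picks "order"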

4.1 Reasoning on group utility

In order to implement the ideas presented in the previous section, we decided to make the evaluation of the heuristics somewhat more complex. The method is as follows:³

- Using DRIPS (playing the role of Gi), we expand the current state S according to all possible recipes for β^m_{x,i}, thus producing the states S1, S2, ..., Snr (where nr is the number of different recipes for β^m_{x,i}).
- This set of states is transformed into the set of the same states as viewed by Gj: S'1, S'2, ..., S'nr.⁴
- On each state S'm, we restart the planning process from the perspective of his partner Gj (i.e. trying to solve his current task β^h_{x,j}). This produces a set of sets of states SS = {{S'11, ..., S'1n1}, {S'21, ..., S'2n2}, ..., {S'nr1, ..., S'nrnnr}}.
- The group utility function is applied to these states, and the best state of each subset is identified: SSbest = {S'1best1, S'2best2, ..., S'nrbestnr}. These states are the ones assumed to be reached by Gj's action, for each of the possible initial moves of Gi.
- The group utility function is applied to the corresponding states S'kbestk from Gi's point of view. This models the perspective of Gi on what could happen.
- The best of these states is selected (S'max,bestmax). This corresponds to the selection of the best recipe for Gi (i.e. Rmax).

³ For simplicity we have assumed a single partner Gj.
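The nested selection above can be summarized in a few lines of code; the sketch below ignores probabilities (handled in Section 4.2) and parallels the pseudocode of Figure 1. All functions passed in are stubs that a real planner would provide, and the toy usage at the end is purely illustrative.

    def choose_recipe(state, recipes_i, as_seen_by_j, plans_j, project, GF):
        """For each of Gi's recipes, simulate Gj's best reply (according to the
        group utility GF) and return the recipe whose final state maximizes GF."""
        best = None
        for recipe in recipes_i:
            s_i = project(state, recipe)             # S_m: outcome of Gi's recipe
            s_j_view = as_seen_by_j(s_i)             # S'_m: the same state as Gj sees it
            # Gj's best continuation from his own viewpoint, still scored with GF
            replies = [project(s_j_view, r) for r in plans_j(s_j_view)]
            s_best_j = max(replies, key=GF)          # S'_m,best_m
            if best is None or GF(s_best_j) > GF(best[1]):
                best = (recipe, s_best_j)            # candidate R_max
        return best[0] if best else None

    # toy usage: states are numbers, GF is the identity, Gj can add 0 or 2
    pick = choose_recipe(
        state=0.0,
        recipes_i=[-1.0, -2.0],                  # Gi's recipes, modeled as costs
        as_seen_by_j=lambda s: s,
        plans_j=lambda s: [0.0, 2.0],            # Gj's possible replies (gains)
        project=lambda s, r: s + r,
        GF=lambda s: s,
    )
    print(pick)   # -1.0: the cheaper recipe for Gi leads to the best group outcome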

4.2 Uncertain effects

Some more words must be devoted to the probability that an effect holds after the execution of a recipe R_{x_i}. Note that if a recipe R_{x_i} makes a proposition Prop true only with probability p(Prop), the simulation of Gj's planning phase must be carried out starting from both "possible" worlds resulting from the execution of R_{x_i} (i.e. one where Prop is true and one where Prop is false).⁵ Therefore, we simulate separately what Gj would plan if Prop were true and if Prop were false; since Gj's recipes may also involve uncertain effects, we adopted a simple scheme of multiplying the probability of the different outcomes of Gj's actions by the probability of Gj's initial states, in order to obtain the set of worlds representing the possible outcomes of Gj's reactions to the plan R_{x_i} (see Figure 3). Most of the computations mentioned above are carried out using the mechanisms that DRIPS provides for dealing with uncertainty and nondeterminism, by allowing one to model actions that have different effects with different probabilities, and utility functions that return the lower and upper bounds of the possible values of the function.⁶
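A small sketch of this world-splitting computation: the outcome probabilities of Gi's recipe are saved, Gj's planning is simulated separately in each resulting world, and the probabilities of Gj's outcomes are multiplied back in so that the final worlds again sum to one. The distributions below reproduce the numbers of Figure 3 but are otherwise illustrative.

    def split_and_combine(outcomes_i, simulate_j):
        """outcomes_i: [(probability, world)] for Gi's recipe.
        simulate_j(world) -> [(probability, world)] for Gj's best reply in that world.
        Returns the combined distribution over final worlds."""
        final = []
        for p_i, world in outcomes_i:
            for p_j, w in simulate_j(world):
                final.append((p_i * p_j, w))      # multiply the two probabilities
        return final

    outcomes_i = [(0.2, "Prop false"), (0.8, "Prop true")]

    def simulate_j(world):
        # Gj's best reply differs depending on whether Prop holds in his view
        if world == "Prop false":
            return [(0.3, world + ", not q"), (0.7, world + ", q")]
        return [(1.0, world + ", eff")]

    worlds = split_and_combine(outcomes_i, simulate_j)
    print(worlds)                        # probabilities 0.06, 0.14 and 0.8, as in Figure 3
    print(sum(p for p, _ in worlds))     # 1.0 (up to rounding)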

4.3 Efficiency

The mechanism described above for the choice of the best recipe is computationally expensive, so we were forced to simplify it somewhat in the actual implementation. A first step toward a more efficient solution is provided by DRIPS, which enforces a pruning mechanism in the selection of recipes at different levels of detail. As stated above, an agent Gi first looks for a recipe for performing an action β^l_{x,i}, and then for recipes for all of its substeps, and so on until elementary actions are reached. In principle, all possible recipes should be expanded at the different levels in order to find the best plan (a full plan, i.e. one composed of elementary actions). But DRIPS cuts off from the search tree all recipes whose maximum possible utility is less than the minimum of another recipe, so that only the best recipes (according to utility) are kept. Since our system admits partial plans (i.e. plans whose steps are not elementary), we decided to apply the method just to the first (leftmost) step of each recipe, so that it is not necessary to expand the plan completely before acting. This is a rather standard method in reactive planning because, as [Bratman et al., 1988] noticed, agents limit the search for solutions to partial ones, since working in a dynamic world often makes overdetailed plans useless. Finally, we implemented the prototype assuming that:

1. Gi will not consider Gj's reasoning about the future actions of Gi after the execution of Ry;
2. there are no concurrent actions, but just interleaving of plans;
3. agents are sincere in their communication with other agents.

⁴ The problem of simulating another agent's planning is very difficult. For instance, in some situations, Gj might not be aware of the effects of R_x. In our implementation, we adopted the simplification that Gj's knowledge of a state is updated by an action of Gk only with the effects which are explicitly mentioned inside a Bel operator whose first argument is instantiated with Gj (for instance, as the result of a communicative action having Gj as receiver).
⁵ Using as Gj's initial world one where Prop has probability p(Prop) of being true would correspond to the situation in which Gj is planning with uncertainty about Prop.
⁶ We do not discuss here the minor changes we made to DRIPS to manage uncertainty in case of cooperation.
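The pruning rule can be read as interval dominance, sketched below: a partially refined plan is dropped as soon as the upper bound of its possible utility falls below the lower bound of some other candidate. The plan names, intervals and dictionary representation are illustrative assumptions.

    def prune(candidates):
        """candidates: {name: (min_utility, max_utility)} for partially refined plans.
        Keep only plans whose maximum utility is not dominated by another plan's minimum."""
        best_lower = max(lo for lo, hi in candidates.values())
        return {name: (lo, hi) for name, (lo, hi) in candidates.items()
                if hi >= best_lower}

    plans = {"by_car": (5.0, 9.0), "by_bus": (2.0, 4.0), "walk": (4.0, 7.0)}
    print(prune(plans))   # "by_bus" is cut off: its best case (4.0) < by_car's worst (5.0)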


/* in input: the one-level plan of agent Gi (gi) for β^k_{x,i} (plan-x-i-k),
   the identifier of agent Gj (gj), the step β^m_{x,j} Gj is in charge of
   (action-j-m), and an initial world */

plan-shared-actions(Gi, plan-x-i-k, Gj, action-j-m, initial-world)
begin
  /* refinement of plan-x-i-k by selecting an alternative or adding
     the decomposition of an action belonging to the plan */
  refined-plans := refine-plan(plan-x-i-k, gi, initial-world);
  final-worlds := nil;
  /* for each possible outcome of each possible alternative */
  for-each plan in refined-plans begin
    /* outcomes of a plan of Gi from the initial world (their probability sums to one) */
    for-each world in resulting-worlds(plan, initial-world) begin
      /* save the probability of the outcome of plan */
      prob := world.prob;
      /* simulate Gj's planning from an outcome of plan as if it were the only possible one */
      world.prob := 1;
      primitive-plans-j := plan(action-j-m, gj, world);
      /* select the best plan from Gj's point of view:
         Gi considers only Gj's best alternative */
      chosen-plan-j := best-plan-EU(primitive-plans-j, gj, world);
      resulting-worlds := resulting-worlds(chosen-plan-j, gj, world);
      /* restore the probability of the outcomes w that come after world */
      for-each w in resulting-worlds begin
        w.prob := w.prob * prob;
      end
      /* the probability of worlds in final-worlds will sum to one */
      final-worlds := final-worlds + resulting-worlds;
    end
    /* assign to each of Gi's alternatives the expected utility from Gi's perspective */
    plan.EU := compute-EU(plan, final-worlds, gi);
  end
  /* eliminate plans that are not promising */
  return(eliminate-plans(refined-plans, gi));
end

Figure 1: The function of the planner that, given a plan, performs a step of refinement and discards unpromising alternatives.

In spite of these simplifications, we believe that the model is scalable and that, by exploiting more sophisticated planners, it is possible to drop some of these assumptions while preserving the basic ideas of cooperation and global utility.

5 Predictions from the model

In this section we will consider the predictions deriving from our definition of cooperation.

5.1 Helpful behavior and communication

In our model, helpful behavior (i.e. adoption) is at the basis of cooperation: the agent will consider the goals of other agents and, only if it is useful for the group, he will adopt them. Moreover, the adoption of a subgoal is useful only in the presence of communication, otherwise the helpful agent would risk conflicting with his partners if they are not aware of the help. As we have seen before, communication adds an overhead to cooperation, but the utility function takes it into account.

Figure 2: The generalization hierarchy: note that the effects of the more general action subsume those of the more specific ones and that the conditions on effects are also preserved.

Figure 3: Splitting the worlds.

As a special case of helpful behavior, it is possible to predict the behaviors which are described by the definition of joint intention of [Cohen and Levesque, 1991]. In particular, we can predict that an agent, whenever he comes to know that the shared goal has been achieved, is impossible to achieve or has become irrelevant, will notify this fact to his partners. In fact, as stated above, if an agent Gi knows that a partner Gj has a given goal Hj, Gi can infer that Gj will also have the subsidiary goals of knowing, for example, whether he succeeded. Assume that, suddenly, Gi comes to know (without any further cost) that Hj holds. In his next planning phase he has to reconsider whether to adopt Gj's goal of knowing if Hj holds; if Gi adopts this goal, he just has to communicate to Gj that Hj holds, an action that adds a little overhead to Gi and therefore to the group's utility. But, when Gi considers his alternatives, like going on with his own plan and not adopting Gj's goal, he discovers that the group's utility would be lower, even if Gi's own utility would be greater. In fact, if he does not adopt this goal and Gj is not aware of his success, Gj would waste his time going on in his activity or, at best, checking whether he succeeded or not.

The agent Gi also has to consider the cost of communicating with Gj. If communication is expensive, it is not convenient for the group to waste resources in communicating, since Gj could discover whether he succeeded in a less expensive manner. The same holds if communication is not reliable (the message may get lost) or slow: there is a probability that the communication does not have the desired effect (or achieves it too late). Therefore, even if an agent decides that it is better (for the group) not to communicate, his choice does not disrupt the group: in fact, communication is not prescribed by our definition of cooperation. Similar reasoning can be performed to predict the notifications that an action is impossible or irrelevant.

Our approach has the advantage that it does not require the assumption of perfect communication. For example, [Cohen and Levesque, 1991] make the simplifying assumption that communication never fails, since otherwise the joint intention would be disrupted when an agent fails to notify the other partners that he succeeded. A group based on a shared utility function is tolerant to communication failures: if an agent who wants to communicate to the partners that their joint goal is achieved has faulty communication capabilities, he can simply give up the plan to notify his partners and go on with his activity. The effect is that the partners may waste their efforts in making true an already achieved goal or in monitoring whether the goal holds, but the group is not disrupted by the agent opting out.
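The trade-off just discussed amounts to a small expected-utility comparison: notifying the partner costs something and may fail, while staying silent lets the partner waste effort on an already achieved goal. The sketch below uses invented probabilities and costs.

    def eu_notify(p_delivery, comm_cost, partner_wasted_effort):
        """Expected group utility of trying to notify: if the message arrives the
        partner stops, otherwise he keeps wasting effort; the sender always pays."""
        return -comm_cost - (1.0 - p_delivery) * partner_wasted_effort

    def eu_silent(partner_wasted_effort):
        return -partner_wasted_effort

    # reliable, cheap channel: notifying is worth it
    print(eu_notify(0.9, 2.0, 10.0) > eu_silent(10.0))    # True
    # unreliable and expensive channel: better to stay silent
    print(eu_notify(0.3, 9.0, 10.0) > eu_silent(10.0))    # False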

5.2 Conflict avoidance

Since agents share a group utility function, we can predict that they will (try to) avoid conflicts with other agents' intentions; in fact, performing an action that interferes with the plans of other team members would decrease the utility of the whole team. Considering the differential in utility among the various alternatives of an agent Gi does not rule out every possible conflict with others' intentions; when Gi considers the possible developments of his partial plan, he examines what effects his action will have on the partners' plans. Therefore, he will choose the alternative that maximizes the utility of the group even if it compels partners to add actions for restoring preconditions and so on.

Finally, if the agent discovers that there is no viable alternative, the shared plan is impossible and therefore it must be discarded. In the following, we will consider the example of [Grosz and Kraus, 1996] concerning cooperation in preparing a dinner: Kate and Dan have the shared intention of making dinner together. Kate is preparing an appetizer and Dan the lasagna. Both agents have at their disposal a plan for doing their part; that plan requires a given resource and has a low cost, but there are also more expensive alternatives that do not need the given resource. Various scenarios are possible:

1. the needed resource is reusable: it is possible to add a low-cost action that resolves the conflict (e.g. a pan can be washed), which is cheaper than switching to the more expensive alternative plans;
2. they need a non-reusable resource (e.g. eggs) and each partner is aware of the choice the other will make: therefore, if Kate decides to use the resource, Dan should select the alternative plan;
3. they need the same non-reusable resource but they are not aware of the plan the other will select: if the agents independently commit to incompatible plans, they cannot recover and must give up the shared plan. For example, once Kate and Dan are committed to two recipes that both need eggs, they cannot backtrack and select other recipes, since they could have already wasted some necessary ingredients. From Kate's point of view, there are three possible situations:
   (a) it is unlikely that Dan will choose the recipe that needs the resource: therefore, she will not consider the more expensive recipe that does not employ the eggs;
   (b) it is very likely that Dan will choose the recipe that needs the resource: even if the recipe that does not employ the resource is more expensive, she will adopt it;
   (c) if the overhead for communication is lower than the cost of Kate's alternative, she notifies Dan that the alternative which needs the eggs is not feasible.
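Scenario 3 is again a decision under uncertainty. A minimal sketch of Kate's reasoning, with invented costs: the egg recipe is cheap but ruins the shared plan if Dan also commits to it, the alternative recipe is safe but expensive, and notifying Dan adds a fixed overhead.

    def eu_use_eggs(p_dan_needs_eggs, cheap_cost, conflict_loss):
        # if Dan commits to the egg recipe too, the shared plan fails
        return -cheap_cost - p_dan_needs_eggs * conflict_loss

    def eu_alternative(expensive_cost):
        return -expensive_cost

    def kate_choice(p, cheap=2.0, expensive=6.0, conflict=20.0, notify=1.0):
        options = {
            "use_eggs": eu_use_eggs(p, cheap, conflict),
            "expensive_recipe": eu_alternative(expensive),
            # 3(c): tell Dan the eggs are taken, then use them safely
            "notify_and_use_eggs": -cheap - notify,
        }
        return max(options, key=options.get)

    print(kate_choice(p=0.02))              # 'use_eggs': a conflict is very unlikely (3a)
    print(kate_choice(p=0.9, notify=8.0))   # 'expensive_recipe': telling Dan costs too much (3b)
    print(kate_choice(p=0.9))               # 'notify_and_use_eggs': a cheap message resolves it (3c)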

5.3 Contracting out

An agent A can delegate a step R of his own part to another agent C that does not belong to the group GR; A and C will form a group sharing the goal to perform R and a corresponding utility function. Note that in this case C does not become a member of the group GR: in fact, he will not necessarily know what the group's goal and utility function are. Therefore it is possible that he interferes with GR while performing R.

5.4 Ending cooperation

We can also predict the termination condition for the group cooperation: when all members know that an action succeeded, has become impossible or has become irrelevant, any action other than stopping would result in a waste of resources; in fact, the stopping alternative gains utility from saving resources. Therefore, the shared plan is naturally ruled out, without the need of stipulating conditions for its termination.

Increasing the utility of the group sometimes produces an unacceptable decrease of the private utility. But the basis on which an agent decides his behavior is his own utility function, of which the global one is just a component. In this case he has to choose to retire from the shared activity: this is not a harmless choice, since the other agents can retaliate for being abandoned (for example by not helping him in future situations). When the cost of not abandoning the group is greater than the consequences of leaving it, the agent will choose to go on with his private goals. However, there are different strategies for leaving the group, as well as various ways of refusing cooperation when requested.
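The decision to stay in or leave the group can be sketched as follows: the agent's overall utility weighs his private utility together with the group's, and he leaves only when the private loss of staying outweighs the group component plus the expected cost of retaliation. All weights and numbers are illustrative assumptions.

    def overall(private, group, w_group=0.5):
        """The agent's own utility function, of which the group utility is a component."""
        return private + w_group * group

    def stay_or_leave(private_if_stay, group_if_stay,
                      private_if_leave, group_if_leave, retaliation_cost):
        stay = overall(private_if_stay, group_if_stay)
        leave = overall(private_if_leave, group_if_leave) - retaliation_cost
        return "stay" if stay >= leave else "leave"

    # staying is cheap: the group component and the retaliation risk keep the agent in
    print(stay_or_leave(-2.0, 8.0, 1.0, 0.0, 4.0))    # stay
    # staying has become very costly: the agent retires from the shared activity
    print(stay_or_leave(-9.0, 8.0, 1.0, 0.0, 4.0))    # leave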

6 An example

In Figure 4 we report some actions representing the situation where agents A and B are looking for an object: the shared plan is composed of the single actions of searching in separate places, and the agents are not aware of whether the partner succeeded or not. As long as the object has not been found, the expected utility of the action of looking for it is different from zero, while the utility of doing nothing is zero. When A finds the desired object (Afound = 1), he has three alternatives: a) going on looking for the object (which amounts to wasting resources without any further utility); b) communicating to B that he has found the object (Bfound = 1), as a consequence of the adoption of B's goal of knowing whether the group succeeded in its goal; c) leaving the group, since the goal has been obtained.

/* action of giving up the plan: A does nothing more */
(add-action donothingA
  (cond (t (1 ()))))

/* action of communicating to B that the object they are looking for has been
   found (the effect Bfound means that B knows this information). The Bfound
   effect is achieved only with a probability of 0.9 */
(add-action communicateA
  (cond (t (1 (time = time + 5) 1
              (f = f - 2) 1
              (Bfound = 1) 9/10
              (Bfound = 0) 1/10))))

/* looking for the object can make A find it, but B will not be aware of this fact */
(add-action searchA
  (cond ((Afound = 0) (1 (time = time + 30) 1
                         (Afound = 1) 4/10
                         (Afound = 0) 6/10
                         (f = f - 3) 1))
        ((Afound = 1) (1 (time = time + 30) 1
                         (f = f - 3) 1))))

/* the action that is given in input by A to the planning procedure: it represents
   the alternatives of searching, communicating or leaving the group */
(add-action planA
  (cond (t (1 (time = time + (0 5)) 1
              (f = f - (0 2)) 1
              (Bfound = (0 1)) 1)))
  /* list of more specific actions */
  (list communicateA donothingA))

Figure 4: The part of agent A in the shared plan of looking for an object.

If A chooses alternative (a) or (c), he knows that B will go on searching for the object, since Bfound = 0. Only if A communicates with B does the group's utility increase: in fact, if the action succeeds, Bfound = 1 and, therefore, B will choose to leave the group. In this situation, the group gathers the best utility, since the action of communicating is less costly than B's action of continuing to search. Note that in this case the consumption of resources is weighted in a uniform way for both members of the group; but we can induce a sort of hierarchy in the group by weighting actions differently depending on who executes them. For example, if B's communication is weighted more than A's and B is the first to succeed in finding the object, he will not communicate this fact to A.
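To make the example concrete, here is a small computation with the numbers appearing in Figure 4, under an illustrative group utility that penalizes time and resource consumption and charges, when B does not know that the object has been found, the effort B is expected to waste by searching on. The weights are assumptions, not the ones used in the prototype.

    def group_utility(b_knows, time_spent, f_spent, b_expected_search_cost,
                      w_time=0.1, w_f=1.0):
        # if B does not know, he is expected to keep searching and waste effort
        wasted = 0.0 if b_knows else b_expected_search_cost
        return -w_time * time_spent - w_f * f_spent - wasted

    def eu_communicate():
        # communicateA: 5 time units, 2 units of f, Bfound = 1 with probability 0.9
        return (0.9 * group_utility(True, 5, 2, b_expected_search_cost=10)
                + 0.1 * group_utility(False, 5, 2, b_expected_search_cost=10))

    def eu_do_nothing():
        return group_utility(False, 0, 0, b_expected_search_cost=10)

    def eu_keep_searching():
        # searchA with Afound = 1: 30 time units and 3 units of f, no further benefit
        return group_utility(False, 30, 3, b_expected_search_cost=10)

    print(eu_communicate(), eu_do_nothing(), eu_keep_searching())
    # roughly -3.5, -10.0, -16.0: communicating is the best option for the group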

7 Specification and architecture

As stated before, we aim at building a computational system that satisfies the specifications of formal models. So, it seems interesting to compare the different proposals in order to establish the degree of correspondence between them, and to single out the respective weaknesses. We base our discussion on [Jennings and Wooldridge, 1999] (W-J henceforth), since this paper provides an account of the first steps of the cooperative activity. In fact, they define the Cooperative Problem Solving Process in terms of four stages:

- Recognition: an agent recognizes the potential for cooperative action in order to reach a goal.
- Team formation: the agent solicits assistance from other agents. If he succeeds, then a group is formed.
- Plan formation: the agent attempts to negotiate a plan with the other agents in order to form a joint plan.
- Team action: the joint plan is executed by the group.

7.1 Recognition

The conditions for group cooperation reported in [Jennings and Wooldridge, 1999], pp.16-17, are the following:

- there is some group g such that i believes that g can jointly achieve φ;
- i can't achieve φ in isolation, or i believes that, for every action α that it could perform that achieves φ, it has a goal of not performing α.

Although we agree on the principles that underlie these conditions, we believe that this is a first place where utility must enter into play: in practical settings, the first condition is achieved by finding a suitable recipe R that enables the achievement of φ. But there could be other possibilities for achieving φ, whose main difference with respect to R is that they are associated with a lower utility value. So, the conditions for group cooperation become (where we replace the original reference to "actions" α and α' with a reference to recipes R_α and R_α'):

- there is some group g such that i believes that g can jointly achieve φ through recipe R_α;
- i believes that, for every recipe R_α' that it (or any group of agents, equal to g or different from g) could perform that achieves φ, R_α' is less preferable than R_α.

Of course, preference is defined in terms of the utility function, so that it is possible to take into account the costs of the actions and their assumed probability of achieving the goal. Finally, notice that the process of team formation has its own cost, so that if i believes that a potential (but necessary) member of g is hard to convince to cooperate, it could decide that individual action is, after all, the best solution.

7.2 Team formation

So an agent enters stage 2 (Team Formation) with some ideas in mind on how the group can attain the goal. In order to formalize this stage, W-J introduce the operator Attempt: (Attempt i α φ ψ) means that i tries to achieve φ via α, but with ψ being an acceptable result in case φ is not reached. This is used to specify that a team formation activity may fail, but that at least i must produce in the group the mutual belief that he has the goal φ and that he believes that the group can reach φ. We are almost in line with these requirements, with the proviso that we don't need to state them explicitly. In general, in a reactive planner any action may fail, so that the execution of any action is just an Attempt. A relevant difference with respect to W-J is that the minimal conditions for terminating the attempt (ψ) are not stated explicitly; the basic idea is that the action is terminated when its utility value is overcome by that of some other action (perhaps not concerning φ at all). For instance, if i realizes that the group cannot be formed because one of the potential members of g declares that he is not interested in cooperating, we fall within the W-J specification; on the other hand, if it happens that a member can only be reached and informed at a very high cost, the attempt could be abandoned even before the minimal requirement is obtained.

In general, however, the need for the various Mutual Beliefs that characterize group activity should be explicitly stated. This is obtained in the model by specifying them as preconditions of the general action Group-Action, which is a generic action subsuming all actions whose recipe involves more than one participant. Since
- any joint activity is a specification of Group-Action,
- preconditions are inherited, and
- the planning process requires that the preconditions be made true before any action is executed,
then the communication between i and the other members is automatically started up, and cannot be terminated until the standard conditions for the termination of an action mentioned above are met.

We finally note that the fact that a potential group action has already been found in the previous stage can make available some pieces of information that are useful in team formation. For instance, if a potential member of the group expresses his doubts on the possibility of achieving the goal, i could mention the found action in order to convince him. We partially addressed the problem of "convincing" in our work on politeness [Ardissono et al., 1999].

7.3 Plan formation

Although we have kept the phases of Team Formation and Plan Formation distinct, according to the W-J specification, we think that they should be merged into a single one. The reason is that we believe it is simpler, at least from a planning point of view, to consider a group formed around a plan than one formed around a goal. In other words, the success of the plan formation activity seems to be a prerequisite of having a group rather than a subsequent stage. Notice, however, that this is accounted for by W-J by keeping apart the predicates Pre-Team and Team, where the first involves commitment to a goal, and the second involves a joint intention to achieve the goal via a given action. Again, the key point is preconditions: the preconditions of a group activity specify that all agents involved must "Intend to" carry out their part of the plan. So, the standard attempt to satisfy the preconditions produces the communication between an agent and his potential partners. In particular, if an agent i is examining a plan p as a candidate for achieving g, then he must check that all of its steps can be executed; in a multi-agent plan, this requires that i:

- identify a (multi-agent) plan;
- check, for each step, which are the agents (possibly i himself) who, according to his knowledge, are able to perform the step;
- decide whether each step must be assigned to an agent at the outset, or whether the choice can be delayed (perhaps for easy steps, i.e. steps that all agents in the group can take care of);
- inform all involved agents about his desire to achieve the goal g;
- suggest to each of them to Intend to do the step assigned to it (possibly describing to it the overall structure of the plan p);
- wait for a reply, which can be an agreement, or a new hypothesis about who is responsible for the step, or a completely new plan. In this way, another agent can take the initiative and the assessment of the joint plan proceeds.

So, the main difference with respect to W-J lies in the fourth step, where the cooperating agents come to have g among their goals (with a desirability degree which may depend on various factors); but having g among the goals does not imply being committed to it, since this requires the further step of evaluating the utility function for the adoption of the goal. We must observe, however, that the discussion of this stage is rather speculative, since the current implementation assumes that the various goals are already installed in the agents, so that no negotiation is carried out.

7.4 Plan execution

In analyzing this last stage, W-J basically refer to [Levesque et al., 1990], so we also refer to this paper. Most of our comments on the proposal reported therein have already been made. In particular, we faced the problem of Joint Commitment in a radically different way. We recall that the problem [Levesque et al., 1990] wants to solve is that of "group cohesion": an agent in a group should not stop acting (abandon the goal) until all members of the group are informed about this (in particular in case an agent has found that the goal is impossible to achieve). The reason we adopted a different solution (i.e. without introducing Weak Goals) is that we believe that the requirement of group cohesion is too strong. We assume, instead, that an agent can always abandon the common goal and exit the group (though in some cases this can be impolite and unfriendly). Notice that [Levesque et al., 1990] say: "A further possibility (that we will not deal with) is for an agent to discover that it is impossible to make the status of p known to the group as a whole, for example, when communication is impossible" (p.97). We agree with this possibility, but we would rather change the last word from "impossible" to "too expensive for the group", in order to make the agent base his decision on the utility value, which guides all actions of the group members.

8 Conclusion

In this paper, we have presented a (preliminary) implementation of plan-based reasoning on group activity. The main contribution of this work is the integration of a plan-based approach with tools for dealing with uncertainty and probabilities. This has been achieved by basing the implementation on the DRIPS planner [Haddawy and Suwandi, 1994]. The plan-based approach has also been compared with a prominent logic-based formalization of cooperative team activity [Jennings and Wooldridge, 1999]. We showed that the implementation shares many features with the logical specification, and we have singled out the main points where they differ. Another logical approach that would have been interesting to examine is presented in the works by Grosz and Kraus, but space constraints prevented us from analyzing it. We must say, however, that their formalization is particularly interesting with respect to the possibility of assigning subactions to subgroups (instead of individual agents) and with respect to the recursive nature of the expansion of subactions. Finally, we must observe that the presented implementation is very preliminary, but it has been very useful to evaluate the impact of uncertainty in plan development. On the basis of the results, we are (jointly) planning to extend the larger prototype we have developed in the last years [Ardissono et al., 1998].

References

[Allen, 1979] Allen, J. (1979). A plan-based approach to speech act recognition. PhD thesis, University of Toronto.
[Ardissono and Boella, 1998] Ardissono, L. and Boella, G. (1998). An agent model for NL dialog interfaces. In Lecture Notes in Artificial Intelligence n. 1480: Artificial Intelligence: Methodology, Systems and Applications, pages 14-27. Springer Verlag, Berlin.
[Ardissono et al., 1998] Ardissono, L., Boella, G., and Lesmo, L. (1998). An agent architecture for NL dialog modeling. In Proc. Second Workshop on Human-Computer Conversation, Bellagio.
[Ardissono et al., 1999] Ardissono, L., Boella, G., and Lesmo, L. (1999). The role of social goals in planning polite speech acts. Submitted to conference review.
[Bratman et al., 1988] Bratman, M., Israel, D., and Pollack, M. (1988). Plans and resource-bounded practical reasoning. Computational Intelligence, 4:349-355.
[Carberry, 1990] Carberry, S. (1990). Plan Recognition in Natural Language Dialogue. MIT Press.
[Castelfranchi and Falcone, 1997] Castelfranchi, C. and Falcone, R. (1997). From task delegation to role delegation. In Lenzerini, M., editor, LNAI 1321. AI*IA 97: Advances in Artificial Intelligence, pages 278-289. Springer Verlag, Berlin.
[Cohen and Levesque, 1990] Cohen, P. and Levesque, H. (1990). Intention is choice with commitment. Artificial Intelligence, 42:213-261.
[Cohen and Levesque, 1991] Cohen, P. and Levesque, H. (1991). Confirmation and joint action. In Proc. 12th IJCAI, pages 951-957, Sydney.
[Grosz and Kraus, 1996] Grosz, B. and Kraus, S. (1996). Collaborative plans for complex group action. Artificial Intelligence, 86(2):269-357.
[Grosz and Kraus, 1998] Grosz, B. and Kraus, S. (1998). The evolution of Shared Plans. In Rao, A. and Wooldridge, M., editors, Foundations and Theories of Rational Agencies. To appear.
[Haddawy and Hanks, 1998] Haddawy, P. and Hanks, S. (1998). Utility models for goal-directed, decision-theoretic planners. Computational Intelligence, 14:392-429.
[Haddawy and Suwandi, 1994] Haddawy, P. and Suwandi, M. (1994). Decision-theoretic refinement planning using inheritance abstraction. In Proc. of 2nd Int. Conference on Artificial Intelligence Planning Systems, pages 266-271, Menlo Park, CA.
[Jennings and Wooldridge, 1999] Jennings, N. R. and Wooldridge, M. J. (1999). Cooperative problem solving. Journal of Logic and Computation.
[Levesque et al., 1990] Levesque, H., Cohen, P., and Nunes, J. (1990). On acting together. In Proc. of 8th National Conference on Artificial Intelligence (AAAI-90), pages 94-99, Boston.
[Rao and Georgeff, 1991] Rao, A. and Georgeff, M. (1991). Deliberation and intentions. In Proc. of 7th Conference on Uncertainty in Artificial Intelligence, Los Angeles.
