Reasoning Agents in Dynamic Domains Chitta Baral and Michael Gelfond
Arizona State University, University of Texas at El Paso
December 15, 1999

Abstract. The paper discusses an architecture for intelligent agents based on the use of A-Prolog - a language of logic programs under the answer set semantics. A-Prolog is used to represent the agent's knowledge about the domain and to formulate the agent's reasoning tasks. We outline how these tasks can be reduced to answering questions about properties of simple logic programs and demonstrate the methodology of constructing these programs.
Keywords: intelligent agents, logic programming and nonmonotonic reasoning
1. Introduction

This paper is a report on the attempt by the authors to better understand the design of software components of intelligent agents capable of reasoning, planning and acting in a changing environment. The class of such agents includes, but is not limited to, intelligent mobile robots, softbots, immobots, intelligent information systems, expert systems, and decision-making systems. The ability to design intelligent agents (IA) is crucial for such diverse tasks as space exploration, intelligent communication with the Internet, and development of various types of control systems. Despite the substantial progress in the work on IA achieved in the last decade we are still far from a clear understanding of the basic principles and techniques needed for their design. The problem is complicated by the fact that IA differ from more traditional software systems in several important aspects. An agent should have a large amount of knowledge about the domain in which it is intended to act, and about its own capabilities and goals. It should be able to frequently expand this knowledge with new information coming from observations, communication with other agents, and awareness of its own actions. Not all of this knowledge can be represented explicitly in the agent's memory, which implies that the agent should be able to reason, i.e. to extract knowledge stored there implicitly. Finally, the agent should be able to use its knowledge and its ability to reason to rationally plan and execute its actions.
© 1999 Kluwer Academic Publishers.
Printed in the Netherlands.
These observations imply that solid theoretical foundations of agent design should be based on theories of knowledge representation and reasoning. Work reported in this paper is based on two such theories: theory of logic programming and nonmonotonic reasoning (C. Baral and M. Gelfond, 1994), (V. Lifschitz, 1996), (W. Marek and M. Truszczynski, 1993) and theory of actions and change (ETAI, 1998).
2. Modeling the agent

In this paper we adopt the following simplifying assumptions about the agents and their environments:

1. The dynamics of the agent's environment is viewed as a (possibly infinite) transition diagram whose states are sets of fluents (i.e., statements whose truth depends on time) and whose arcs are labeled by actions.

2. The agent is capable of making correct observations, performing actions, and remembering the domain history.

These assumptions hold in many realistic domains and are suitable for a broad class of applications. In many domains, however, the effects of actions and the truth of observations can only be known with a substantial degree of uncertainty which cannot be ignored in the modeling process. In this case our basic approach can be expanded by introducing probabilistic information. We prefer to start our investigation with the simplest case and work our way up the ladder of complexity only after this simple case is reasonably well understood.

The assumptions above determine the structure of the agent's knowledge base. It consists of two parts. The first part, called an action description, specifies the transition diagram of the agent. It contains descriptions of the actions and fluents relevant to the domain together with the definition of the possible successor states into which the system can move after an action a is executed in a state σ. Due to the size of the diagram the problem of finding its concise specification is far from trivial. Its solution requires a good understanding of the nature of causal effects of actions in the presence of complex interrelations between fluents. An additional level of complexity is added by the fact that, unlike standard mathematical reasoning, causal reasoning is nonmonotonic, i.e., new information about the domain can cause a reasoning agent to withdraw its previously made conclusions. Nonmonotonicity is partly caused by the need for representing defaults, i.e., statements of the form "Normally, objects of type A have property P". A default that is often used in causal reasoning is the so called inertia axiom (J. McCarthy and P. Hayes, 1969).
It says that "Normally, actions do not change the values of fluents". The problem of finding a concise and accurate representation of this default, known as the Frame Problem, substantially influenced AI research during the last twenty years (M. Shanahan, 1997).

The second part of the agent's knowledge contains observations made by the agent together with a record of its own actions. It defines a collection of paths in the diagram which can be interpreted as the domain's possible trajectories. If the agent's knowledge is complete (i.e., it has complete information about the initial state and the action occurrences) and its actions are deterministic then there is only one such path.

We believe that a good theory of agents should contain a logical language (equipped with a consequence relation) which would be capable of representing both types of the agent's knowledge.

A-Prolog
In this paper the language of choice is A-Prolog - a language of logic programs under the answer set semantics (M. Gelfond and V. Lifschitz, 1988; M. Gelfond and V. Lifschitz, 1991). A program of A-Prolog consists of a signature Σ and a collection of rules of the form

head ← body    (1)

where the head is empty or consists of a literal l0, and the body is of the form l1, ..., lm, not lm+1, ..., not ln, where the li's are literals over Σ. A literal is an atom p or its negation ¬p. If the body is empty we replace ← by a period. While ¬p says that p is false, not p has an epistemic character and can be read as "there is no reason to believe that p is true". The symbol not denotes a non-standard logical connective often called default negation or negation as failure.

A program Π of A-Prolog can be viewed as a specification given to a rational agent for constructing beliefs about possible states of the world. Technically these beliefs are captured by the notion of an answer set of a program Π. By ground(Π) we denote the program obtained from Π by replacing variables by the ground terms of Σ. By answer sets of Π we mean answer sets of ground(Π). If Π consists of rules not containing default negation then its answer set S is the smallest set of ground literals of Σ which satisfies two conditions:

1. S is closed under the rules of ground(Π), i.e. for every rule (1) in Π, either there is a literal l in its body such that l ∉ S, or its non-empty head l0 ∈ S.

2. If S contains an atom p and its negation ¬p then S contains all ground literals of the language.
It is not difficult to show that there is at most one set, denoted by Cn(Π), satisfying these conditions. Now let Π be an arbitrary ground program of A-Prolog. For any set S of ground literals of its signature Σ, let Π^S be the program obtained from Π by deleting (i) each rule that has an occurrence of not l in its body with l ∈ S, and (ii) all occurrences of not l in the bodies of the remaining rules. Then S is an answer set of Π if

S = Cn(Π^S).    (2)

In this paper we limit our attention to consistent programs, i.e. programs with at least one consistent answer set. Let S be an answer set of Π. A ground literal l is true in S if l ∈ S; false in S if ¬l ∈ S. This is expanded to conjunctions and disjunctions of literals in the standard way. A query Q is entailed by a program Π (Π ⊨ Q) if Q is true in all answer sets of Π. Queries l1 ∧ ... ∧ ln and l̄1 ∨ ... ∨ l̄n, where l̄i denotes the literal complementary to li, are called complementary. If Q and Q̄ are complementary queries then Π's answer to Q is yes if Π ⊨ Q; no if Π ⊨ Q̄; and unknown otherwise.

Here are some examples. Assume that signature Σ contains two object constants a and b. The program

Π1 = {¬p(X) ← not q(X).   q(a).}

has the unique answer set S = {q(a), ¬p(b)}. The program

Π2 = {p(a) ← not p(b).   p(b) ← not p(a).}

has two answer sets, {p(a)} and {p(b)}. The programs

Π3 = {p(a) ← not p(a).}   and   Π4 = {← p(a).   p(a).}

have no answer sets. It is easy to see that programs of A-Prolog are nonmonotonic (Π1 ⊨ ¬p(b) but Π1 ∪ {q(b).} ⊭ ¬p(b)). A-Prolog is closely connected with more general nonmonotonic theories. In particular, as was shown in (W. Marek and M. Truszczynski, 1989; M. Gelfond and V. Lifschitz, 1991), there is a simple and natural mapping of programs of A-Prolog into a subclass of Reiter's default theories (R. Reiter, 1980). (Similar results are also available for Autoepistemic Logic (R. Moore, 1985).)
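The fixpoint condition (2) lends itself to a direct, if naive, implementation: guess a candidate set S, compute the reduct Π^S, close it under the remaining (negation-free) rules, and compare. The sketch below is ours, not the paper's; it encodes a ground rule as a (head, positive-body, negated-body) triple, writes the classical negation of p as "-p", and handles only consistent candidate sets.

```python
from itertools import combinations

# A ground rule is (head, pos, neg): head is a literal or None (constraint),
# pos is the positive body, neg lists the literals under default negation.
# "-p" stands for the classical negation of atom "p".

def cn(rules):
    """Smallest set closed under default-negation-free rules;
    None if a constraint (empty head) fires."""
    s = set()
    changed = True
    while changed:
        changed = False
        for head, pos, _ in rules:
            if set(pos) <= s:
                if head is None:
                    return None
                if head not in s:
                    s.add(head)
                    changed = True
    return s

def reduct(rules, s):
    """The Gelfond-Lifschitz reduct of the program w.r.t. candidate set s."""
    return [(h, pos, ()) for h, pos, neg in rules
            if not any(l in s for l in neg)]

def answer_sets(rules, literals):
    """All consistent answer sets: S is an answer set iff S = Cn(reduct)."""
    found = []
    for r in range(len(literals) + 1):
        for combo in combinations(sorted(literals), r):
            s = set(combo)
            if cn(reduct(rules, s)) == s:
                found.append(s)
    return found
```

On the running examples this yields {q(a), ¬p(b)} for Π1, the two sets {p(a)} and {p(b)} for Π2, and nothing for Π3. The exponential enumeration is of course only a reading aid; real answer set solvers prune this search aggressively.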
Specifying transition diagrams
To describe the agent's transition diagram, the domain's history and the type of queries available to the agent we use action languages (M. Gelfond and V. Lifschitz, 1992) which can be viewed as formal models of parts of natural language that are used for talking about the effects of actions. A particular action language reflects properties of a domain and the abilities of an agent. In this paper we define a class of action languages AL which combine ideas from (N. McCain and H. Turner, 1995; C. Baral, M. Gelfond and A. Provetti, 1997; E. Guinchiglia and V. Lifschitz, 1998). Languages AL(Σ) from this class will be parameterized with respect to a signature Σ which we normally assume fixed and omit from our notation. We follow a slightly modified view of (V. Lifschitz, 1997) and divide an action language into three parts: action description language, history description language, and query language.

We start with defining the action description part of AL, denoted by ALd. Its signature Σ will consist of two disjoint, non-empty sets of symbols: the set F of fluents and the set A of elementary actions. A non-empty set {a1, ..., an} of elementary actions is called a compound action. It is interpreted as a collection of elementary actions performed simultaneously. By actions we mean both elementary and compound actions. To simplify the presentation we identify a compound action {a} with the elementary action a. By fluent literals we mean fluents and their negations. By l̄ we denote the fluent literal complementary to l. A set S of fluent literals is called complete if for any f ∈ F, f ∈ S or ¬f ∈ S. An action description of ALd(Σ) is a collection of propositions of the form

1. causes(a, l0, [l1, ..., ln]).
2. caused(l0, [l1, ..., ln]).
3. impossible_if(a, [l1, ..., ln]).

where a is an action and l0, ..., ln are fluent literals from Σ.
The first proposition says that, if the action a were to be executed in a situation in which l1, ..., ln hold, the fluent literal l0 would be caused to hold in the resulting situation. Such propositions are called dynamic causal laws. (For simplicity we only consider such laws for elementary actions.) The second proposition, called a static causal law, says that, in an arbitrary situation, the truth of the fluent literals l1, ..., ln (often called the body of (2)) is sufficient to cause the truth of l0. The last proposition says that action a cannot be performed in any situation in which l1, ..., ln hold.
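For readers who prefer running code, the three kinds of propositions can be carried around as plain data. The toy encoding below is ours, not the paper's: an action description becomes a list of tagged tuples, and a small helper checks executability against a state represented as a set of fluent literals.

```python
# A fluent literal is a (fluent, truth-value) pair: ("has_ticket", False)
# stands for the literal -has_ticket. The three proposition kinds become:
#   ("causes", a, l0, body)        dynamic causal law
#   ("caused", l0, body)           static causal law
#   ("impossible_if", a, body)     executability condition

def executable(action, state, description):
    """Action a is executable in `state` unless some impossible_if(a, body)
    proposition has its whole body satisfied there."""
    for prop in description:
        if prop[0] == "impossible_if" and prop[1] == action and set(prop[2]) <= state:
            return False
    return True
```

For instance, with impossible_if(get(ticket), [¬has_jack(money)]) encoded this way, get(ticket) is rejected in every state containing the literal ¬has_jack(money).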
An action description A of ALd defines a transition diagram describing the effects of actions on the possible states of the domain. A state σ is a complete and consistent set of fluent literals such that σ is closed under the static causal laws of A, i.e. for any static causal law (2) of A, if {l1, ..., ln} ⊆ σ then l0 ∈ σ. States serve as the nodes of the transition diagram. Nodes σ1 and σ2 are connected by a directed arc labeled by an action a if σ2 may result from executing a in σ1. The set of all states that may result from executing a in a state σ will be denoted by res(a, σ). Defining this set for action descriptions of increasingly complex action languages seems to be one of the main issues in the development of action theories. Several fixpoint definitions of this set were recently given by different authors; see for instance (N. McCain and H. Turner, 1995; F. Lin, 1995; E. Guinchiglia and V. Lifschitz, 1998). The definition we give in this paper is based on the semantics of A-Prolog and is similar in spirit to the work in (C. Baral, 1995). First let us consider the following program Π(N):

1.  impossible_if(A, P) ← subset(A', A),
                          impossible_if(A', P).
2.  causes(A, L, P) ← member(A', A),
                      causes(A', L, P).
3.  holds_at(L, T') ← next(T, T'),
                      causes(A, L, P),
                      occurs_at(A, T),
                      hold_at(P, T).
4.  holds_at(L, T) ← caused(L, P),
                     hold_at(P, T).
5.  holds_at(L, T') ← next(T, T'),
                      holds_at(L, T),
                      not holds_at(L̄, T').
6.  hold_at([], T).
7.  hold_at([L|Rest], T) ← holds_at(L, T),
                           hold_at(Rest, T).
8.  ¬occurs_at(A, T) ← member(A', A),
                       not occurs_at(A', T).
9.  ¬occurs_at(A, T) ← ¬member(A', A),
                       occurs_at(A', T).
10. occurs_at(A, T) ← not ¬occurs_at(A, T).
11. ← impossible_if(A, P),
      occurs_at(A, T),
      hold_at(P, T).
Here A', A are variables for actions; L and P are variables for fluent literals and (finite) sets of fluent literals respectively; and T and T' are variables for integers from some interval [0, N]. Integers from this interval will later be interpreted as time points. The program uses the list notation [ ] and the predicate symbols member and subset denoting the standard membership and proper subset relations; ¬member(A', A) holds when A' is an elementary action not belonging to A. The first rule of Π(N) says that a (compound) action A is impossible in a state in which some combination of its components cannot be executed. The second rule reduces the effects of a compound action to the effects of its components. Rules three and four describe the effects of causal laws. The predicate symbol holds_at(L, T) denotes the relation "fluent literal L is true at moment T"; hold_at(P, T) denotes the expansion of this relation to sets of fluent literals; occurs_at(A, T) indicates that a compound action A occurs at moment T. The relation next(T, T') is satisfied by two consecutive moments of time from the interval [0, N]. Rule five encodes the law of inertia. The default nature of this law is nicely captured by the use of the default negation of A-Prolog. Rules six and seven define hold_at(P, T). The next three rules define the relation occurs_at(A, T) in terms of occurrences of elementary actions. The last rule, with the empty head, is used to ensure that impossible actions are indeed impossible.

Let a = {a1, ..., an} be a compound action. Now we can define the set res(a, σ1) of successor states of a on σ1.

DEFINITION 1. For any action a and state σ1, a state σ2 is a successor state of a on σ1 if there is an answer set S of

Π(1) ∪ {holds_at(l, 0) : l ∈ σ1} ∪ {occurs_at(ai, 0) : ai ∈ a}

such that σ2 = {l : holds_at(l, 1) ∈ S}.

It is essential to notice that Definition 1 allows us to reduce computing successor states of the transition diagram representing our dynamic domain to computing answer sets of comparatively simple logic programs. For domains without concurrent actions this definition is equivalent to the one from (N. McCain and H. Turner, 1995). The idea however is not limited to ALd and can be used to define a larger range of transition functions. We say that an action description D of ALd is deterministic if for any state σ and any action a there is at most one successor state, i.e., the cardinality of the set res(a, σ) is at most one. The following example shows that action descriptions of ALd can be nondeterministic.

EXAMPLE 1. Let A1 be an action description
causes(a; f; [ ]): caused(:g1 ; [f; g2 ]): caused(:g2 ; [f; g1 ]):
Using Definition 1 it is easy to show that the transition diagram of A1 can be given as follows:
res(a, {f, g1, ¬g2}) = {{f, g1, ¬g2}}
res(a, {f, ¬g1, g2}) = {{f, ¬g1, g2}}
res(a, {f, ¬g1, ¬g2}) = {{f, ¬g1, ¬g2}}
res(a, {¬f, g1, g2}) = {{f, g1, ¬g2}, {f, ¬g1, g2}}
res(a, {¬f, g1, ¬g2}) = {{f, g1, ¬g2}}
res(a, {¬f, ¬g1, g2}) = {{f, ¬g1, g2}}
res(a, {¬f, ¬g1, ¬g2}) = {{f, ¬g1, ¬g2}}

Action descriptions not containing static causal laws are deterministic, and hence the addition of such laws adds to the expressive power of the language. The following proposition gives a sufficient condition guaranteeing that determinism holds.

PROPOSITION 1. Any action description A of ALd whose static causal laws contain at most one literal in the body is deterministic.

Specifying the history
We now describe a language ALh for specifying the history of the agent's domain. To do that we expand the action signature by integers 0, 1, 2, ..., which are used to denote time points in the actual evolution of the system. If such an evolution is caused by a sequence of consecutive actions a1, ..., an then 0 corresponds to the initial situation and k (0 < k ≤ n) corresponds to the end of the execution of a1, ..., ak. The domain's past is described by a set, Γ, of axioms of the form

1. happened(a, k).
2. observed(l, k).

where the a's are elementary actions. The first axiom says that (elementary) action a has been executed at time k; the second axiom indicates that fluent literal l was observed to be true at moment k. The axioms of Γ will often be referred to as observations. A set of axioms defines a collection of paths in a transition diagram T which can be interpreted as possible histories of the domain represented by T. (As usual, by a path we mean a sequence ⟨σ0, a1, σ1, ..., an, σn⟩ such that for 1 ≤ i ≤ n, σi ∈ res(ai, σi−1), and σ0 is a state.) If our knowledge about the initial situation is complete and actions are deterministic then there is only
one such path. A pair ⟨A, Γ⟩, where A is an action description and Γ is a non-empty set of observations, is called a domain description. The following definition refines the intuition behind the meaning of observations.

DEFINITION 2. Let D = ⟨A, Γ⟩ be a domain description, T be the transition diagram defined by A, and H = ⟨σ0, a1, σ1, ..., an, σn⟩ be a path in T. We say that H is a possible history of D if

1. ai = {a : happened(a, i − 1) ∈ Γ},
2. if observed(l, k) ∈ Γ then l ∈ σk.

A domain description D is called consistent if it has a non-empty set of possible histories. It is easy to see that all possible histories have the same length. We call this length the current moment of time and denote it by tc.

Queries and the consequence relation
Let us now assume that at each moment of time an agent maintains a knowledge base in the form of a domain description D. Various reasoning tasks of an agent can be reduced to answering queries about properties of the agent's domain. The corresponding query language, Lq, includes the following queries:

1. holds_at(l, t).
2. currently(l).
3. holds_after(l, [an, ..., a1], k).

Query (1) asks if fluent literal l holds at time t. Query (2) asks if l holds at the current moment of time. The last query is hypothetical and is read as: "Is it true that a sequence a1, ..., an of actions could have been executed at the time k (0 ≤ k ≤ tc), and, if it were, would fluent literal l be true immediately afterwards?" If k < tc and the sequence a1, ..., an is different from the one that actually occurred at k then the corresponding query expresses a counterfactual. If k = tc then the query expresses a hypothesis about the system's future behavior. (In this case we often omit k from the query and simply write holds_after(l, [an, ..., a1]).) The definition below formalizes this intuition. Let D = ⟨A, Γ⟩ be a domain description and T be the transition diagram defined by A.

1. a query holds_at(l, k) is a consequence of D if for every possible history σ0, a1, σ1, ..., an, σn of D, 0 ≤ k ≤ n and l ∈ σk;
2. a query currently(l) is a consequence of D if for every possible history σ0, a1, σ1, ..., an, σn of D, l ∈ σn;

3. a query holds_after(l, [a'm, ..., a'1], k) is a consequence of D if for every possible history σ0, a1, σ1, ..., an, σn of D, the sequence a'1, ..., a'm is executable in σk, and for any path σ'0, a'1, σ'1, ..., a'm, σ'm of T such that σ'0 = σk, l ∈ σ'm.

The consequence relation between a query Q and a domain description D = ⟨A, Γ⟩ will be denoted by Γ ⊨A Q.

Architecture for intelligent agents
We now demonstrate how the notion of consistency of a domain description and the consequence relation Γ ⊨A Q can be used to describe an architecture of an intelligent agent. In this architecture the set of elementary actions is divided into two parts: the actions which can be performed by the agent, and exogenous actions performed by nature or by other agents in the domain. We assume that at each moment t of time the agent's memory contains a domain description D = ⟨A, Γ⟩ and a partially ordered set G of the agent's goals. By a goal we mean a finite set of fluent literals the agent wants to make true. The partial ordering corresponds to the comparative importance of goals. The agent operates under the assumption that it was able to observe all the exogenous actions. This assumption can however be contradicted by observations which may force the agent to conclude that some exogenous actions in the past remained unobserved. An agent will repeatedly execute the following steps:

1. observe the world and incorporate the observations in Γ;
2. select a most important goal g ∈ G to be achieved;
3. find a plan a1, ..., an to achieve g;
4. execute a1.

During the first step the agent does two things. First it observes the exogenous actions {b1, ..., bk} which happen at the current moment of time, t0. These observations are recorded by simply adding the statements happened(bi, t0) (1 ≤ i ≤ k) to the current set of observations, Γ0. Now the agent's history is encoded in the new set of axioms, Γ0, with the current moment of time, t0. Second, the agent observes the truth of some fluent literals, O. If the agent observed all the exogenous actions which happen at time t0 then the domain description ⟨A, Γ0 ∪ O⟩ where
O = {observed(li, t0) : li is observed to be true at t0} will be consistent. Otherwise it may be inconsistent. In the latter case the new observations should be explained by assuming some unobserved occurrences of exogenous actions. This suggests that instead of adding observations to the domain description the agent should first check their consistency and, if necessary, provide a suitable explanation. In precise terms, by an explanation we mean a collection of statements

H0 = {happened(ai, tk) : tk < tc, ai is an elementary exogenous action}

such that

Γ ∪ H0 ∪ O is consistent.    (3)
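Equation (3) suggests a simple generate-and-test procedure: enumerate candidate sets H0 of unobserved exogenous occurrences, smallest first, and keep those that restore consistency. In the sketch below (ours, not the paper's) the consistency check of (3) is passed in as a black box, for instance backed by the answer-set machinery:

```python
from itertools import combinations

def explanations(exo_actions, t_c, consistent, max_size=2):
    """Solve equation (3) by generate-and-test: return the smallest sets
    H0 = {happened(a, t) : t < t_c} whose addition restores consistency.
    `consistent(h0)` must decide whether Gamma + H0 + O has a model."""
    occurrences = [(a, t) for a in exo_actions for t in range(t_c)]
    for size in range(max_size + 1):
        found = [set(h0) for h0 in combinations(occurrences, size)
                 if consistent(set(h0))]
        if found:                 # prefer minimal explanations
            return found
    return []
```

Preferring smaller explanation sets is one plausible plausibility criterion; nothing in (3) itself forces minimality, and a real agent might rank candidates differently.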
If O can be explained in several possible ways the agent should be supplied with some mechanism for selecting a most plausible one. (To simplify the discussion we assume that this is possible.) Let us denote the set of axioms of the agent after the first step of the loop by Γ1 and its current time by t1.

During the second stage the agent selects a goal g ∈ G which will, at least for a while, determine its future actions. The goals in G may have priorities which depend on the current state of the domain, and hence the agent may change its immediate goal during the next execution of the loop. Meanwhile, however, it goes to step three and looks for a plan to achieve g. This planning problem can be reduced to finding a sequence α of actions such that

Γ ⊨A holds_after(g, α).    (4)

After (4) is solved and a plan α = a1, ..., an is found, the agent proceeds to execute the first action a1, records happened(a1, t1) in the domain history, and goes back to step one of the loop. Of course this architecture is only valid if the computation performed during the execution of the body of the loop is fast enough to preclude the occurrence of exogenous actions that may change the relevant characteristics of the domain before a1 is performed. In many situations the process can be made considerably faster by precomputing solutions of (3) and (4) for certain goals. A solution for (4), for instance, can be stored in the agent's memory as rules of the form, say,

if Γ ⊨A currently(l) then execute(a1) else execute(a2)

Agents whose ability to plan is limited to the use of such rules are called reactive. Otherwise, we characterize them as deliberative. Normally, agents should have both deliberative and reactive components. Notice, however, that even reactive actions depend on the current knowledge of an agent and the choice of such actions may require a certain amount of reasoning. (Of course, checking if l holds in the current
situation requires substantially less effort than planning or explaining observations.)

EXAMPLE 2. Consider a domain with two locations, home and airport, and two objects, money and ticket. An agent, Jack, is capable of driving from his current location to the other one and of getting money and a ticket. The last action however is possible only if Jack has money and is located at the airport. Jack views the world in terms of two fluents, jack_at(L) and has_jack(O). The only exogenous action relevant to Jack is that of losing an object. Let us construct an action description A of Jack's world and use it to model Jack's behavior.

Types:
location(home).          object(money).
location(airport).       object(ticket).

Fluents:
fluent(jack_at(L)) ← location(L).
fluent(has_jack(O)) ← object(O).

Actions:
agent_action(drive(L)) ← location(L).
agent_action(get(O)) ← object(O).
exogenous_action(lose(O)) ← object(O).

Causal Laws (A):
impossible_if(drive(L), [jack_at(L)]).
causes(drive(L), jack_at(L), []).
impossible_if(get(ticket), [¬has_jack(money)]).
impossible_if(get(ticket), [¬jack_at(airport)]).
causes(get(O), has_jack(O), []).
causes(lose(O), ¬has_jack(O), []).
caused(¬jack_at(L1), [jack_at(L2), L1 ≠ L2]).
impossible_if([get(O), drive(L)], []).
impossible_if([get(O1), lose(O2)], []).

Let us now assume that Jack's goal is to have a ticket. He starts with step one of the algorithm and records the following observations:

Axioms (Γ0):
observed(has_jack(money), 0).
observed(¬has_jack(ticket), 0).
observed(jack_at(home), 0).

Since Jack has only one goal the selection is trivial. He goes to the planning stage, solves the equation

Γ0 ⊨A holds_after(has_jack(ticket), X)
and finds a plan, α1 = [drive(airport), get(ticket)]. (We assume that he has enough time to find the shortest plan.) Jack proceeds by executing the first action of this plan and recording this execution in the history of the domain. The new collection of axioms looks as follows: Γ1 = Γ0 ∪ {happened(drive(airport), 0)}. Now the current time is one and Jack continues his observations. Suppose he observes that his money is missing. This observation contradicts ⟨A, Γ1⟩ and hence needs an explanation. The only explanation, obtained by solving the equation requiring that Γ1 ∪ Y ∪ {observed(¬has_jack(money), 1)} be consistent, is Y = {happened(lose(money), 0)}, and hence Γ2 = Γ1 ∪ {happened(lose(money), 0), observed(¬has_jack(money), 1)}. Assuming that Jack's goal has not changed, he proceeds to solve the equation Γ2 ⊨A holds_after(has_jack(ticket), X), finds a new plan, α2 = [get(money), get(ticket)], gets money, goes back to step one, and hopefully proceeds toward his goal without further interruptions.
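The two planning steps of Example 2 can be replayed with a direct breadth-first search over the same transition diagram. The sketch below is not the paper's A-Prolog encoding, just a hand-coded Python model of Jack's domain (a state is a location plus a set of possessions; all identifiers are ours) that reproduces the two plans found in Example 2:

```python
from collections import deque

LOCATIONS = ["home", "airport"]
OBJECTS = ["money", "ticket"]

def actions(state):
    """Agent actions executable in `state` = (location, possessions)."""
    loc, has = state
    for l in LOCATIONS:
        if l != loc:                      # impossible_if(drive(L), [jack_at(L)])
            yield ("drive", l)
    for o in OBJECTS:
        # get(ticket) requires money and being at the airport
        if o == "ticket" and ("money" not in has or loc != "airport"):
            continue
        yield ("get", o)

def apply_action(state, action):
    loc, has = state
    kind, arg = action
    if kind == "drive":                   # causes(drive(L), jack_at(L), [])
        return (arg, has)
    return (loc, has | {arg})             # causes(get(O), has_jack(O), [])

def plan(state, goal_obj):
    """Breadth-first search for a shortest plan making has_jack(goal_obj) true."""
    frontier = deque([(state, [])])
    seen = {state}
    while frontier:
        s, p = frontier.popleft()
        if goal_obj in s[1]:
            return p
        for a in actions(s):
            s2 = apply_action(s, a)
            if s2 not in seen:
                seen.add(s2)
                frontier.append((s2, p + [a]))
    return None
```

From (home, {money}) the search returns [drive(airport), get(ticket)], and after the loss of the money, from (airport, {}), it returns [get(money), get(ticket)], mirroring the replanning step above. (Exogenous actions such as lose(O) are not part of the search space, since the agent never plans to lose things.)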
3. Reasoning algorithms

The logic programming community has developed a large number of reasoning algorithms which range from the SLDNF resolution implemented in traditional Prolog systems, to the XSB resolution (W. Chen, T. Swift, and D. Warren, 1995) implementing the well-founded semantics (A. Van Gelder, K. Ross, and J. Schlipf, 1991), and the comparatively recent techniques for computing answer sets of A-Prolog (P. Cholewinski, W. Marek and M. Truszczynski, 1996; I. Niemela and P. Simons, 1996; T. Eiter et al., 2000). The latter form the basis for answer set programming advocated in (I. Niemela, 1999), (W. Marek and M. Truszczynski, 1999). In this section we illustrate how these algorithms can be used to implement the above architecture and to perform the agent's reasoning tasks.

Planning
In this subsection we discuss model-theoretic planning using A-Prolog. In this approach, the answer sets of a program encode possible plans. We differ slightly from previous work on answer set planning by emphasizing the use of "planning modules" to restrict the kind of
plans (such as allowing concurrency, forcing sequentiality, specifying preferences, etc.) that we are looking for. We start by assuming that the corresponding domain description D = ⟨A, Γ⟩ is consistent and deterministic, that the set Γ of axioms is complete, i.e. uniquely determines the initial situation, and that the goal g is a set of fluent literals. We also assume that time is represented by integers from [0, N], that m is the maximum length of a plan the agent is willing to search for, and that tc + m ≤ N. (Recall that tc is the current moment of time, which can easily be obtained from Γ.) To find a solution of equation (4) we consider a program

Πd = Π(N) ∪ A ∪ Γ ∪ R

where R consists of the rules

occurs_at(A, T) ← happened(A, T).
holds_at(L, T) ← observed(L, T).

Next we construct a planning module, PM, describing the agent's search space. We start with a simple planning module, PM0(m):

found ← tc ≤ T < tc + m,
        hold_at(g, T).
← not found.
occurs_at(A, T) ← tc ≤ T < tc + m,
                  not hold_at(g, T),
                  agent_action(A),
                  not ¬occurs_at(A, T).
¬occurs_at(A, T) ← tc ≤ T < tc + m,
                   not hold_at(g, T),
                   agent_action(A),
                   not occurs_at(A, T).

Let P0(m) = Πd ∪ PM0(m). The use of PM0(m) for planning is based on the following observation: Let 0 ≤ k < m. A sequence α = a0, ..., ak of actions is a plan for g iff there is an answer set S of P0(m) such that for any ai from α,

ai = {a : a is elementary and occurs_at(a, tc + i) ∈ S}.

(A plan of minimal length can be found by checking the existence of answer sets of P0(1), P0(2), ..., P0(m), or by varying the depth of a plan in some other way.)
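This incremental check of P0(1), P0(2), ... has the same shape as iterative deepening in ordinary state-space search. The generic sketch below (plain Python, names ours; the `successors` callback plays the role of the transition diagram) returns a shortest plan by raising the horizon one step at a time:

```python
def find_plan(initial, goal_holds, successors, max_len):
    """Search horizons 0, 1, ..., max_len in turn, returning the first
    (hence shortest) plan found, in the spirit of checking P0(1), P0(2), ..."""
    def search(state, depth, prefix):
        if goal_holds(state):
            return prefix
        if depth == 0:
            return None
        for action, nxt in successors(state):
            found = search(nxt, depth - 1, prefix + [action])
            if found is not None:
                return found
        return None
    for horizon in range(max_len + 1):
        result = search(initial, horizon, [])
        if result is not None:
            return result
    return None
```

An answer set solver performs the analogous search symbolically: each call with horizon m corresponds to asking whether P0(m) has an answer set.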
A similar technique can be used to look for plans in incomplete domain descriptions. If, for instance, the values of fluents f1, ..., fk in the initial situation are unknown, the planning can be done as follows:

1. Expand P0 by the rules:

observed(fi, 0) ← not observed(¬fi, 0).
observed(¬fi, 0) ← not observed(fi, 0).

2. Find an answer set S of the resulting program P'0.

3. Let X be the set of propositions of the form occurs_at(a, t) from S and check if P'0 ∪ X ⊨A g.

If the test succeeds then X defines a plan. Otherwise the process can be repeated, i.e. we can look for another answer set. If all answer sets are analyzed and a plan is still not found then it does not exist. Alternatively, we can look for conditional plans, incorporate testing of the values of fluents into planning, or use a variety of other techniques.

A-Prolog can be used to implement planning modules which are different from, and often substantially more sophisticated than, PM0(m). For instance, the module PM1(m), consisting of the first two rules of PM0(m) and the rules

occurs_at(A, T) ← tc ≤ T < tc + m,
                  agent_action(A),
                  not other_occurs(A, T),
                  not hold_at(g, T).
other_occurs(A, T) ← tc ≤ T < tc + m,
                     agent_action(A1), A ≠ A1,
                     occurs_at(A1, T),
                     not hold_at(g, T).

restricts the agent's attention to sequential plans. (These rules can be viewed as an A-Prolog implementation of the choice operator from (D. Sacca and C. Zaniolo, 1997).) As before, finding a sequential plan for g of length bounded by m can be reduced to computing answer sets of P1(m) = Πd ∪ PM1(m). To illustrate a slightly more sophisticated planning module let us go back to Example 2 and assume that Jack always follows a prudent policy of reporting the loss of his money to the police.
This knowledge should be incorporated in Jack's planning module, which can be done by adding to it the following rules:

occurs_at(report, tc) ← T0 < tc,
        happened(lose(money), T0),
        not reported_after(T0).
reported_after(T0) ← T0 < T < tc,
        happened(report, T).
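Read procedurally, the rule above triggers report whenever some past loss has not yet been reported. A small Python sketch of the triggering condition (the encoding of the history as (action, time) pairs is our own hypothetical choice, not the authors'):

```python
def must_report_now(history, tc):
    # True iff some lose(money) at T0 < tc has no report at T
    # with T0 < T < tc -- the body of the occurs_at(report, tc) rule.
    losses = [t for (a, t) in history if a == 'lose(money)' and t < tc]
    reports = [t for (a, t) in history if a == 'report' and t < tc]
    return any(all(not (t0 < t) for t in reports) for t0 in losses)
```

For instance, a loss at time 2 with no later report makes the condition true, while an intervening report cancels it.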
Now suppose that Jack can buy his ticket either by using a credit card or by paying cash, and that he prefers to pay cash if he has enough. To represent this information we introduce two new objects, cash and card, and two static causal laws

caused(has_jack(money), [has_jack(card)])
caused(has_jack(money), [has_jack(cash)]).

Jack's preference will be represented by adding the following rules to his planning module:

occurs_at(pay(cash), T) ← occurs_at(get(O), T),
        holds_at(has_jack(cash), T).
occurs_at(pay(card), T) ← occurs_at(get(O), T),
        holds_at(¬has_jack(cash), T).

Explaining Observations
Now we demonstrate how answer set programming can be used for finding explanations, i.e. for solving equation (3). This can be achieved by finding answer sets of a program E(m) = Πd ∪ EM0(m), where m determines the time interval in the past the agent is willing to consider in its search for explanations and EM0(m) is an explanation module consisting of the rules

unexplained ← member(l, O),
        not holds_at(l, tc).
← unexplained.
occurs_at(A,T) ← tc − m ≤ T < tc,
        exogenous_action(A),
        not ¬occurs_at(A,T).
¬occurs_at(A,T) ← tc − m ≤ T < tc,
        exogenous_action(A),
        not occurs_at(A,T).

Let S be an answer set of E(m) and let H0(m) be the set of statements of the form happened(ai, t) such that tc − m ≤ t < tc, ai is an elementary exogenous action, occurs_at(ai, t) ∈ S, and happened(ai, t) ∉ S (i.e. this occurrence of ai has not been explicitly observed). Then H0(m) is an explanation of O. Moreover, any explanation of O of depth m can be obtained in this way.

As in the case of the planning modules, EM0(m) can be elaborated and tailored to a particular domain. For instance, possible explanations for certain observations may be represented in the agent's knowledge base by a list of statements of the form

poss_exp(l, a)
where l is a fluent literal and a is an exogenous action. (Often such a list can be extracted automatically from the corresponding action description.) The following explanation module, EM1(m), uses this information and looks for explanations of a literal l consisting of an occurrence of a single (possibly compound) action in the interval [t0, tc), where t0 = tc − m:

occurs_at(A,T) ← t0 ≤ T < tc,
        poss_exp(l, A),
        not other_occurs(A,T).
other_occurs(A,T) ← t0 ≤ T < tc,
        t0 ≤ T' < tc, T' ≠ T,
        occurs_at(A,T'),
        not happened(A,T').
other_occurs(A,T) ← t0 ≤ T < tc,
        t0 ≤ T' < tc,
        A ≠ A', T ≠ T',
        occurs_at(A',T'),
        not happened(A',T').
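The generate-and-test reading of these explanation modules can be illustrated in Python (a hypothetical toy encoding; in the paper the enumeration is performed by the answer set solver): hypothesize sets of past exogenous occurrences and keep those whose effects account for an otherwise unexplained observation.

```python
from itertools import combinations

def explanations(exogenous, effect, observation, state, window, max_size=2):
    # Enumerate candidate sets of hypothesized occurrences (action, time)
    # within the window [tc - m, tc) and keep those that account for
    # the observation -- the role played by the explanation module.
    candidates = [(a, t) for a in exogenous for t in window]
    found = []
    for k in range(1, max_size + 1):
        for hyp in combinations(candidates, k):
            s = set(state)
            for a, _t in sorted(hyp, key=lambda x: x[1]):
                s |= effect[a]  # apply the hypothesized action's effects
            if observation in s:
                found.append(set(hyp))
    return found

# Toy domain: wet grass can be explained by rain or by the sprinkler.
exps = explanations(['rain', 'sprinkler'],
                    {'rain': {'wet'}, 'sprinkler': {'wet'}},
                    'wet', set(), window=[3, 4])
```

Each returned set corresponds to a candidate H0(m); minimality or preference among explanations would be imposed on top of this enumeration, as in EM1(m).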
Checking the Entailment
Now let us consider an agent's reactive component. Implementation of this component requires checking the condition

Γ |=A currently(l)    (5)

It can be shown that this can be reduced to checking the condition

Πd |= currently(l)    (6)

For a very broad class of complete deterministic domain descriptions this condition can be checked by using the Prolog or XSB systems with the query currently(l) and a simple modification of the program Πd as input. The modification requires the addition of information about the types of variables in the bodies of some rules. It is needed to make sure that the reasoning done by Prolog (XSB) on Πd is sound and complete with respect to queries of the type currently(l). (The corresponding proof follows the lines of a similar proof in (C. Baral, M. Gelfond and A. Provetti, 1997) and demonstrates that for this input the Prolog interpreter always terminates, does not flounder, and does not require the occur check. This allows us to reduce the proof of soundness and completeness of the interpreter to the soundness and completeness of SLDNF resolution.)
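For complete deterministic histories, checking (6) amounts to projecting the initially observed state through the recorded actions. A minimal Python sketch of this computation follows (the state representation and effect table are our own assumptions, not the paper's Prolog encoding):

```python
def currently(literal, initial, happened, effects):
    # Project the initial observations through the recorded history,
    # the computation behind checking Pi_d |= currently(l)
    # for complete deterministic domain descriptions.
    state = set(initial)
    for action, _time in sorted(happened, key=lambda x: x[1]):
        add, delete = effects[action]
        state = (state - delete) | add
    return literal in state

# Jack bought his ticket at time 1:
ok = currently('has(ticket)', {'has(money)'},
               [('buy(ticket)', 1)],
               {'buy(ticket)': ({'has(ticket)'}, {'has(money)'})})
```

Prolog/XSB performs the analogous computation top-down from the query currently(l), which is why termination and floundering need the argument sketched above.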
Finally, let us look at the situation in which the agent found a plan a1, …, an, executed action a1, modified Γ to record new observations, and discovered that its goal, g, is not changed. In this case it seems natural to start the planning with simply checking if

Γ |=A holds_after(g, [an, …, a2])    (7)

As before, for complete, deterministic, and acyclic domain descriptions (R. Watson, 1999) this can be done by running the Prolog interpreter on the query

holds_after(g, [an, …, a2], T)    (8)

and the program Πc obtained by expanding Πd by the rules

impossible(A,S,T) ← impossible_if(A,P),
        hold_after(P,S,T).
possible(A,S,T) ← not impossible(A,S,T).
holds_after(L,[],T) ← holds_at(L,T).
holds_after(L,[A|S],T) ← possible(A,S,T),
        causes(A,L,P),
        hold_after(P,S,T).
holds_after(L,S,T) ← caused(L,P),
        hold_after(P,S,T).
holds_after(L,[A|S],T) ← possible(A,S,T),
        holds_after(L,S,T),
        not ¬holds_after(L,[A|S],T).

As before, typing information should be added to the program to guarantee the soundness and completeness of this method.
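The rules of Πc have a direct recursive reading: to evaluate holds_after(L, [A|S], T), first execute the tail S, then apply A if it is possible. A Python sketch under simplifying assumptions (deterministic direct effects only, no static causal laws; all names are our own):

```python
def execute(plan, state, effects, possible):
    # The plan is given head-last, as in the query
    # holds_after(g, [an, ..., a2], T): run the tail first,
    # then apply the head action if it is possible.
    if not plan:
        return set(state)
    s = execute(plan[1:], state, effects, possible)
    if s is None or not possible(plan[0], s):
        return None  # the plan is not executable
    add, delete = effects[plan[0]]
    return (s - delete) | add

def holds_after(literal, plan, state, effects, possible):
    s = execute(plan, state, effects, possible)
    return s is not None and literal in s

# Toy domain: a2 is only possible after a1 has made p true.
effects = {'a1': ({'p'}, set()), 'a2': ({'q'}, set())}
possible = lambda a, s: a != 'a2' or 'p' in s
```

The inertia rule of Πc (the last rule above) corresponds here to carrying over every literal that is not deleted by the executed action.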
4. Conclusion

We have proposed a model of a simple intelligent agent acting in a changing environment. A characteristic feature of this model is its extensive use of the declarative language A-Prolog. The language is used for describing the agent's knowledge about the world and its own abilities, as well as for a precise mathematical characterization of the agent's tasks. We demonstrated how the agent's reasoning tasks can be formulated as questions about programs of A-Prolog and how these questions can be answered using various logic programming algorithms. The high expressiveness of the language allows us to represent rather complex domains not easily expressible in more traditional languages. (Even higher efficiency can be obtained by using the disjunctive version of A-Prolog together with the DLV inference procedure (T. Eiter et al., 2000).) The corresponding
knowledge bases have a reasonably high degree of elaboration tolerance. Recent advances in logic programming theory allow us to prove the correctness of these algorithms for a broad range of domains. New logic programming systems allow their implementation. Four such systems, CCALC, DeReS, DLV, and SMODELS, were demonstrated at the LBAI workshop. These systems were used to find plans from Example 2 as well as to solve much more complex tasks. Some experimental results aimed at testing the efficiency of the systems are rather encouraging. The current rate of improvement of the systems' performance and rapid advances in our understanding of the methodology of programming in A-Prolog allow us to believe in the practicality of this approach. Some applications using XSB and CCALC (ccalc, 1999) can be found in (R. Watson, 1999).
References

C. Baral. Reasoning about Actions: Non-deterministic effects, Constraints and Qualification. In Proc. of IJCAI 95, pp. 2017-2023, 1995.
C. Baral and M. Gelfond. Logic programming and knowledge representation. Journal of Logic Programming, 12:1-80, 1994.
C. Baral, M. Gelfond, and A. Provetti. Reasoning about actions: laws, observations, and hypotheses. Journal of Logic Programming, 31:201-244, 1997.
W. Chen, T. Swift, and D. Warren. Efficient top-down computation of queries under the well-founded semantics. Journal of Logic Programming, 24(3):161-201, 1995.
P. Cholewinski, W. Marek, and M. Truszczynski. Default reasoning system DeReS. In Proc. of KR-96, pp. 518-528, 1996.
Y. Dimopoulos, B. Nebel, and J. Koehler. Encoding planning problems in nonmonotonic logic programs. In Proc. of European Conf. on Planning 1997, pp. 169-181, 1997.
T. Eiter, W. Faber, G. Gottlob, C. Koch, C. Mateis, N. Leone, G. Pfeifer, and F. Scarcello. The DLV system. This volume.
K. Eshghi. Diagnoses as stable models. ETAI Electronic Transactions on Artificial Intelligence, Vol. 2, issue 3-4, 1998. URL http://www.ep.liu.se/ej/etai/
ccalc. http://www.cs.utexas.edu/users/tag/cc, 1999.
M. Gelfond and V. Lifschitz. The stable model semantics for logic programming. In Robert Kowalski and Kenneth Bowen, editors, Logic Programming: Proc. of the Fifth Int'l Conf. and Symp., pp. 1070-1080, 1988.
M. Gelfond and V. Lifschitz. Classical negation in logic programs and disjunctive databases. New Generation Computing, pp. 365-387, 1991.
M. Gelfond and V. Lifschitz. Representing Actions in Extended Logic Programs. In Proc. of the Joint International Conference and Symposium on Logic Programming, pp. 559-573, 1992.
E. Giunchiglia and V. Lifschitz. An action language based on causal explanation: Preliminary report. In Proc. of AAAI-98, pp. 623-630, 1998.
V. Lifschitz. Foundations of declarative logic programming. In Gerhard Brewka, editor, Principles of Knowledge Representation, pp. 69-128. CSLI Publications, 1996.
V. Lifschitz. Two components of an action language. Annals of Mathematics and Artificial Intelligence, 21:305-320, 1997.
V. Lifschitz. Action languages, answer sets and planning. In The Logic Programming Paradigm: a 25-Year Perspective, pp. 357-373, Springer Verlag, 1999.
V. Lifschitz. A causal language for describing actions. In Workshop on Logic-Based AI.
F. Lin. Embracing causality in specifying the indirect effects of actions. In Proc. of IJCAI-95, pp. 1985-1991, 1995.
W. Marek and M. Truszczynski. Stable semantics for logic programs and default theories. In Proc. of NACLP-89, pp. 243-256, 1989.
W. Marek and M. Truszczynski. Nonmonotonic Logics: Context-Dependent Reasoning. Springer, 1993.
W. Marek and M. Truszczynski. Stable models and an alternative logic programming paradigm. In The Logic Programming Paradigm: a 25-Year Perspective, pp. 375-398, Springer Verlag, 1999.
N. McCain and H. Turner. A causal theory of ramifications and qualifications. In Proc. of IJCAI 95, pp. 1978-1984, 1995.
J. McCarthy and P. Hayes. Some philosophical problems from the standpoint of artificial intelligence. In B. Meltzer and D. Michie, editors, Machine Intelligence, volume 4, pp. 463-502. Edinburgh University Press, Edinburgh, 1969.
R. Moore. Semantical Considerations on Nonmonotonic Logic. Artificial Intelligence, 25(1):75-94, 1985.
I. Niemela and P. Simons. Efficient implementation of the well-founded and stable model semantics. In Proc. of JICSLP-96, 1996.
I. Niemela. Logic programs with stable model semantics as a constraint programming paradigm. Annals of Mathematics and Artificial Intelligence, 1999.
R. Reiter. A logic for default reasoning. Artificial Intelligence, 13(1,2):81-132, 1980.
M. Shanahan. Solving the Frame Problem: A Mathematical Investigation of the Common Sense Law of Inertia. MIT Press, 1997.
V. S. Subrahmanian and C. Zaniolo. Relating stable models and AI planning domains. In Proc. of ICLP-95, 1995.
D. Sacca and C. Zaniolo. Deterministic and Non-Deterministic Stable Models. Journal of Logic and Computation, 7(5):555-579, 1997.
H. Turner. Representing actions in logic programs and default theories: A situation calculus approach. Journal of Logic Programming, 31(1-3):245-298, 1997.
A. Van Gelder, K. Ross, and J. Schlipf. The well-founded semantics for general logic programs. Journal of the ACM, 38(3):620-650, 1991.
R. Watson. Action Languages and Domain Modeling. Doctoral Dissertation, Department of Computer Science, University of Texas at El Paso, 1999.
R. Watson. An application of action theory to the space shuttle. In G. Gupta, editor, Proc. of Practical Aspects of Declarative Languages '99, Lecture Notes in Computer Science, volume 1551, pp. 290-304, 1999.