Rule-Based Agents for the Semantic Web
J. Dietrich a, A. Kozlenkov b, M. Schroeder b, G. Wagner c,∗
a Institute of Information Sciences and Technology, Massey University, Palmerston North, New Zealand
b Department of Computing, City University, London, UK
c Eindhoven University of Technology, Department Information and Technology, Den Dolech 2, 5600 MB Eindhoven, The Netherlands
Abstract
Artificial agents, subsuming both robots and software agents, represent a new paradigm in software engineering and Artificial Intelligence. Depending on the technologies used in their implementation, they may exhibit various skills; in particular, they may act more or less autonomously, they may be able to learn and to adapt to a changing environment, and they may be able to pursue their goals pro-actively. An artificial agent is called rule-based if its behaviour and/or its knowledge is expressed by means of rules. In this paper, we discuss a general architecture for rule-based agents and how it can be realized with the help of Semantic Web languages. We also show how such agents can go live on the Web by presenting an implementation in Mandarax, a Java rule platform. The concept and implementation are complemented by a running example, the portfolio agent.
Key words: Agent architecture, agent meta-interpreter, reaction rules, Mandarax.
1 Introduction
An artificial agent is called rule-based if its behaviour and/or its knowledge is expressed by means of rules. In this paper, we discuss a general architecture for rule-based agents originally proposed in [28,25] and show how it can be realized with the help of Semantic Web languages and the Java rule platform Mandarax [10]. We also discuss how such agents can go live on the Web and show their usefulness in an example application.

∗ Corresponding author. Preprint submitted to Elsevier Science, 8 May 2003.

Example 1 A personal portfolio software agent monitors the price development of the shares in its owner's portfolio and reacts to significant drops in value, e.g. by sending an alert to its owner. The behaviour of the agent is specified by the following five reaction rules:
RR1 If a share price dropped by more than 5% and the corresponding investment is exempt from profit taxes, then sell the investment.
RR2 If a share price dropped by more than 5% and the corresponding investment is not exempt from profit taxes, then send an alert with high priority.
RR3 If a share price dropped by more than 3% and the corresponding investment is significant and is exempt from profit taxes, then sell the investment.
RR4 If a share price dropped by more than 3% and the corresponding investment is significant and is not exempt from profit taxes, then send an alert with high priority.
RR5 If a share price dropped by more than 3% and the corresponding investment is not significant, then send an alert.
The predicates “is exempt from profit taxes” and “is significant” are defined by means of the following two derivation rules:
DR1 An investment is exempt from profit taxes if it has been in the portfolio for more than 1 year.
DR2 An investment is significant if the value of the investment in the portfolio is more than 10% of the total value of all investments.

To date there exist different approaches, frameworks and platforms to develop agents on different levels of abstraction. For example, web services, the common object request broker architecture [21,22] and the parallel virtual machine [11] cater for the infrastructure of distributed systems, in general, and multiagent systems, in particular.
Such systems tackle low-level issues of distributed systems and form the technical basis of high-level conceptual frameworks for designing and developing agent applications. Programming languages such as April [19], or PVM-Prolog [7,8], support distributed computing and declarative programming. Unfortunately, on the highest level of abstraction one finds either agent theories which are not operational, such as the BDI logics of [24], or implemented systems with certain data structures corresponding to beliefs and goals [16,18] but without a formal semantics, such as PRS, dMARS and ARCHON. We set out to bridge this gap by defining an agent framework and its web-enabled implementation.
Agents are situated in an environment and exhibit reactive and possibly proactive behaviour [30]. Rules are natural means to specify these forms of agent behaviour.
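
To make the rules of Example 1 concrete, the following is a minimal Python sketch of DR1, DR2 and RR1 to RR5. It is an illustration only, not the Mandarax implementation discussed later. The class `Investment`, the function names, the action labels (`"sell"`, `"alert_high"`, `"alert"`) and the reading of "more than 1 year" as more than 365 days are all assumptions of this sketch. The rules are applied literally, so a drop of more than 5% also satisfies the 3% rules; identical actions are reported once.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Investment:
    symbol: str
    bought_on: date  # date the shares entered the portfolio (assumed field)
    value: float     # current value of the position (assumed field)


def is_tax_exempt(inv, today):
    """DR1: in the portfolio for more than one year (assumed: > 365 days)."""
    return (today - inv.bought_on).days > 365


def is_significant(inv, portfolio):
    """DR2: worth more than 10% of the total value of all investments."""
    total = sum(i.value for i in portfolio)
    return total > 0 and inv.value > 0.10 * total


def react(inv, drop_pct, portfolio, today):
    """Return the actions triggered by RR1-RR5 for a drop of drop_pct percent."""
    exempt = is_tax_exempt(inv, today)
    significant = is_significant(inv, portfolio)
    actions = []

    def add(action):
        # Report each distinct action once, in firing order.
        if action not in actions:
            actions.append(action)

    if drop_pct > 5:
        add("sell" if exempt else "alert_high")       # RR1 / RR2
    if drop_pct > 3:
        if significant:
            add("sell" if exempt else "alert_high")   # RR3 / RR4
        else:
            add("alert")                              # RR5
    return actions
```

For a long-held, significant position, a 4% drop yields only the sell action; for a small, recently bought position, a 6% drop yields both a high-priority alert (RR2) and a plain alert (RR5).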
2 An Abstract Architecture for Rule-Based Agents
In philosophy and AI, there is a strong tradition to describe rational agents in terms of the three mental components beliefs, desires, and intentions (BDI) [5,6]. While many researchers (see e.g. [24]) follow the philosophical logic tradition in modeling mental state components with the help of highly abstract multi-modal logics, another strand of research takes a more grounded approach by investigating the semantics of mental state components on the basis of their operational semantics in terms of suitable state transition systems. In his seminal paper [26], Shoham coins the term agent-oriented programming (AOP), which is based on the three mental components beliefs, capabilities, and commitments. Notice that in both lists of basic mental state components, two important components are missing: perceptions, e.g. in the form of incoming messages and signals representing communication and environment events, and memory of past events and actions. In fact, although perceptions are temporally not as stable as beliefs, they form the basis of reactive behavior, and are therefore more fundamental than many other mental components such as desires and intentions. In the sequel, we will present an abstract agent architecture based on knowledge and perceptions.
2.1 Knowledge- and Perception-Based (KP) Agents
While we can associate implicit notions of goals and intentions with any “intentional system”, be it natural or artificial (according to Dennett [9]), it is only the explicit notion (of a goal or an intention) which counts for an artificial agent from the programming point of view. Having an explicit goal requires that there is some identifiable data item in the agent program which represents exactly this goal, or the corresponding sentence. Having explicit goals makes sense for an agent only if it is capable of generating and executing plans in order to achieve its goals. Simple agents, however, which are purely reactive, do not generate and execute plans for achieving explicit goals assigned to them at run time (i.e. do not behave pro-actively), but only react to events according to their reactive behaviour specification. Of course, a reaction pattern can be viewed as encoding a certain task or goal which is implicit in it. But unlike explicit goals, such implicitly encoded tasks have to be assigned to the agent at design time by hard-coding them into the agent system.
So what are the basic components shared by all important, and even very simple, types of agents? At any moment, the state of any such agent comprises beliefs (about the current state of affairs) and perceptions (of communication and environment events), and possibly other components such as memory (of past events and actions), commitments, tasks/goals, intentions, emotions, etc. While the agent’s beliefs are represented in its knowledge base (KB), which may also contain derivation rules defining derived beliefs, its perceptions are represented in its perception event queue (EQ). We obtain the following picture:

agent state = beliefs + perceptions + ...

or, formally,

A = ⟨KB, EQ, ...⟩

The state of a purely reactive agent may very well consist of just these two components and nothing else. The core of any agent is its knowledge base. Technically, the beliefs in a knowledge base are expressed in some representation language.

Example 2 For instance, beliefs may have the form of simple attribute-value equations like

myName = 007
shareprice[BMW] = 60.45

such as in a conventional program, or atomic sentences like

i am(007)
shareprice(BMW, 60.45)

such as represented in the table rows of a relational database, or the facts of a Prolog program. In certain cases, beliefs may have to be qualified, e.g. by a degree of uncertainty, a valid-time span, or a security classification, like in

price(sunshine Ltd, rise) : very likely
strategy(cautious) @ [2001/05/01–∞]
price(dodgyCorp, fall) / top secret

Perceptions may refer to (non-communicative) environment events in the form of percept expressions, possibly labeled with the name of the responsible sensor subsystem, such as

⟨observed(dog(approaching, 300m):0.7), camera 1⟩,

or to communication events in the form of typed message expressions labeled with the name of the message sender:

⟨tell(price(BMW, rise)), aFriend⟩

Perceptions are FIFO-buffered in the event queue EQ.

Thus, all interesting types of artificial agents are knowledge- and perception-based (KP), but only the more sophisticated (pro-active) agents will have (explicit) goals and intentions. A KP agent is a software-controlled system whose state comprises beliefs and perceptions. The basic functionality of a KP agent comprises a knowledge subsystem, a perception (event handling) subsystem, and an action subsystem (for performing actions in response to events according to the agent’s reaction patterns). The basic behaviour of a KP agent is determined by its reaction patterns, whose semantics can be formally defined in terms of a high-level state transition system, where perceptions and reactions are nondeterministically interleaved.

A general model of KP agents will have to account for the syntactic and semantic variety of simple and qualified beliefs. Thus, standard first order logic is certainly not adequate for the knowledge system of a KP agent. Below, we will present an abstract architecture for KP agents, which has been called vivid agents in [28]. This architecture is generic in the sense that it treats the knowledge base KB and the event queue EQ as black boxes, but requires (1) that the KB of an agent is a conservative extension of a relational database, and (2) that reaction patterns are specified by means of reaction rules.

In the agent-oriented programming language AGENT0, defined in [26], an agent is specified by its initial beliefs, its ‘capability rules’, and its ‘commitment rules’. AGENT0 agents can be viewed as a particular form of KP agents: their reactions to incoming messages are specified by their ‘commitment rules’ (which are a particular form of reaction rules).
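
As a rough illustration of the state A = ⟨KB, EQ, ...⟩ of a purely reactive KP agent, the following Python sketch pairs a set of atomic beliefs with a FIFO queue of labeled perceptions. The names `Perception`, `AgentState`, `perceive` and `next_event`, and the encoding of expressions as nested tuples, are assumptions made for this sketch only.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Perception:
    content: tuple  # percept or message expression, e.g. ("tell", ("price", "BMW", "rise"))
    source: str     # responsible sensor subsystem or message sender


@dataclass
class AgentState:
    kb: set = field(default_factory=set)      # beliefs as atomic sentences (tuples)
    eq: deque = field(default_factory=deque)  # perception event queue

    def perceive(self, perception):
        """Buffer a new perception at the tail of the event queue."""
        self.eq.append(perception)

    def next_event(self):
        """Return the oldest buffered perception, or None if the queue is empty."""
        return self.eq.popleft() if self.eq else None
```

A qualified belief, such as one carrying a degree of uncertainty, would need a richer knowledge base than the plain set used here.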
2.2 Vivid Knowledge Systems
One can distinguish knowledge systems of various expressivity and complexity (see Figure 1). The most fundamental knowledge system is the system of relational databases, which allows only atomic sentences to be represented (in the form of table rows) and is based on the assumption of complete information (also called ‘Closed-World Assumption’).

[Fig. 1. Knowledge systems with different complexity: Relational Database; Relational Factbase; Factbase with derivation rules; Temporal, Disjunctive, Fuzzy Factbases.]

Definition 1 Abstract Knowledge System Let $L$ be a set of formulas, and $L^0$ its restriction to closed formulas (sentences). An abstract knowledge system [29] consists of three languages and two basic operations: a knowledge representation language $L_{KB}$, a query language $L_{Query}$, and an input language $L_{Input}$; an inference relation $\vdash$, such that $X \vdash F$ holds if $F \in L^0_{Query}$ can be inferred from $X \in L_{KB}$; and an update operation Upd, such that the result of updating $X \in L_{KB}$ with $F \in L^0_{Input}$ is the knowledge base Upd(X, F).

An abstract knowledge system is called vivid in [27] if it is a conservative extension of the knowledge system of relational databases. Positive vivid knowledge systems use a general ‘closed-world’ (or completeness) assumption, whereas general vivid knowledge systems employ predicate-specific completeness assumptions and possibly two kinds of negation. Relational databases can be extended to a general vivid knowledge system, called relational factbases, by allowing for literals instead of atoms as information units. Further important examples of positive vivid knowledge systems are temporal, fuzzy, and disjunctive factbases. All these kinds of knowledge bases can be extended to rule knowledge bases by adding derivation rules of the form F ← G [29].

Example 3 For the knowledge system of the portfolio agent, the input language $L_{Input}$ consists of appropriate instantiations of the atomic formulas investment(CompanySymbol, CompanyName, DateTime, Quantity) and shareprice(CompanySymbol, LastSharePrice), as well as their negations (for expressing deletions/retractions), where the predicate investment is used to store the investments to be managed by the portfolio agent at a given time and shareprice is used to record the current share price. The input language defines what the agent can be told (i.e.
what it is able to assimilate into its KB); the query language defines what the agent can be asked.

A knowledge base consisting of a consistent set of ground literals, considered as positive and negative facts, is called a (relational) factbase. In a factbase, the completeness assumption does not in general apply to all predicates, and therefore in the case of an incomplete predicate, negative information is stored
along with positive. This contrasts with relational databases, which allow only positive information and infer negated queries on the basis of a general completeness (or ‘Closed-World’) assumption. The schema of a factbase stipulates which predicates are complete by means of a special set CPred. Explicit negative information is represented by means of a strong negation ¬. Factbases with derivation rules are also called extended logic programs. Their syntax is defined as follows.

Definition 2 Syntax of Extended Logic Programs An extended logic program is a set of rules of the form

\[ L_0 \leftarrow L_1, \ldots, L_m, \mathit{not}\ L_{m+1}, \ldots, \mathit{not}\ L_n \qquad (0 \le m \le n) \]
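where each $L_i$ is an atom $A$ or its strong negation $\neg A$.

Definition 3 Inference in Factbases As a kind of natural deduction from positive and negative facts, an inference relation $\vdash$ between a factbase $X$ and an if-query is defined in the following way:

\[
\begin{array}{lll}
(\mathit{not}) & X \vdash \mathit{not}\ p(c) & \text{if } p(c) \notin X \\
(\neg) & X \vdash \neg p(c) & \text{if } \neg p(c) \in X \\
(\mathrm{CWA}) & X \vdash \neg p(c) & \text{if } p \in \mathit{CPred} \ \&\ X \vdash \mathit{not}\ p(c)
\end{array}
\]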
where $p(c)$ stands for an atomic sentence with predicate $p$ and constant (tuple) $c$. The negations ¬ and not are called strong and weak since the coherence principle holds and thus $X \vdash \neg F$ implies $X \vdash \mathit{not}\ F$. A factbase $X$ answers an if-query $F$ by yes if $X \vdash F$, by no if $X \vdash \neg F$, and by unknown otherwise.

Definition 4 Updates in Factbases Updates are recency-preferring revisions:

\[
\mathit{Upd}(X, p(c)) :=
\begin{cases}
X \cup \{p(c)\} & \text{if } p \in \mathit{CPred} \\
X - \{\neg p(c)\} \cup \{p(c)\} & \text{else}
\end{cases}
\]

\[
\mathit{Upd}(X, \mathit{not}\ p(c)) :=
\begin{cases}
X - \{p(c)\} & \text{if } p \in \mathit{CPred} \\
X - \{p(c)\} \cup \{\neg p(c)\} & \text{else}
\end{cases}
\]
The extension of factbases by adding derivation rules leads to extended logic programs with two kinds of negation. Inference in extended logic programs can be defined either by a fixpoint semantics [13,23,1], or model-theoretically by preferential entailment based on stable generated partial models [15].
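
The inference and update rules of Definitions 3 and 4 can be illustrated for ground facts by the following Python sketch. It covers plain factbases only, without derivation rules. The class name `Factbase`, its method names, and the representation of p(c) as a pair of a predicate name and a constant tuple are assumptions of this sketch.

```python
class Factbase:
    """A relational factbase over ground literals, with a set CPred of complete predicates."""

    def __init__(self, complete_predicates=()):
        self.cpred = set(complete_predicates)  # CPred
        self.pos = set()  # positive facts p(c), stored as (p, c)
        self.neg = set()  # explicit negative facts ¬p(c), stored as (p, c)

    def weakly_false(self, p, c):
        """(not): X |- not p(c) if p(c) is not in X."""
        return (p, c) not in self.pos

    def strongly_false(self, p, c):
        """(¬) and (CWA): ¬p(c) is stored, or p is complete and not p(c) holds."""
        return (p, c) in self.neg or (p in self.cpred and self.weakly_false(p, c))

    def ask(self, p, c):
        """Answer an if-query p(c) by 'yes', 'no' or 'unknown'."""
        if (p, c) in self.pos:
            return "yes"
        if self.strongly_false(p, c):
            return "no"
        return "unknown"

    def insert(self, p, c):
        """Upd(X, p(c)): add p(c); for incomplete predicates also drop ¬p(c)."""
        if p not in self.cpred:
            self.neg.discard((p, c))
        self.pos.add((p, c))

    def retract(self, p, c):
        """Upd(X, not p(c)): remove p(c); for incomplete predicates also store ¬p(c)."""
        self.pos.discard((p, c))
        if p not in self.cpred:
            self.neg.add((p, c))
```

With `investment` declared complete, an absent investment is answered by no, whereas a query on an incomplete predicate with no stored information is answered by unknown.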
2.3 Reaction Rules
Reaction rules encode the reaction patterns of KP agents. They specify which actions to take in response to perception events created by the agent’s perception subsystems, and to communication events created by communication acts of other agents. They are an extension of the concept of event-condition-action (ECA) rules known from ‘active’ databases [20,17]. We may distinguish between mental, physical and communicative reaction rules.

Definition 5 Reaction Rule Let $S$ (the ‘sender’) be an agent term, evaluating to an agent ID, and $A$ (the ‘addressees’) be an agent set term, evaluating to a non-empty set of agent IDs. Let $L_{Evt}$, $L_{Com}$ and $L_{Act}$ be an environment event language, a communication event language, and a physical action language. Then rules of the form

\[
\begin{array}{rcll}
\mathit{Eff} & \leftarrow & \mathit{recvMsg}(\varepsilon(U), S),\ \mathit{Cond} & \text{(Mental)} \\
\mathit{do}(\alpha(V)),\ \mathit{Eff} & \leftarrow & \mathit{recvMsg}(\varepsilon(U), S),\ \mathit{Cond} & \text{(Physical)} \\
\mathit{sendMsg}(\eta(V), A),\ \mathit{Eff} & \leftarrow & \mathit{recvMsg}(\varepsilon(U), S),\ \mathit{Cond} & \text{(Communicative)}
\end{array}
\]

where $\mathit{Cond} \in L_{Query}$, $\mathit{Eff} \in L_{Input}$, $\varepsilon(U) \in L_{Evt} \cup L_{Com}$, $\alpha(V) \in L_{Act}$, and $\eta(V) \in L_{Com}$, are called mental, physical and communicative reaction rules.

The event condition recvMsg(ε(U), S) is a test whether the event queue of the agent contains a perception expression of the form ε(U) representing either a signal/percept created by some sensor of the agent or an incoming message sent by another agent identified by S, where ε(U) represents an environment or a communication event, and U is a suitable list of parameters. The (mental) condition Cond refers to the current (mental) state of the agent, and the (mental) effect Eff specifies a postcondition corresponding to an update of the current state. For physical reactions, $L_{Act}$ is the language of all elementary physical actions available to an agent; for an action α(V), do(α(V)) calls a procedure realizing the action α with parameters V. For communicative reactions, sendMsg(η(V), A) sends the message η(V) with parameters V to the addressee(s) A.

Example 4 The portfolio agent described in Example 1 is only involved in communication; it has no sensors, and is not capable of performing any physical action, so $L_{Evt} = \emptyset$ and $L_{Act} = \emptyset$.
In general, reactions are based both on perceptions and on knowledge. Immediate reactions do not allow for deliberation. They are represented by rules with an ‘empty’ (mental) state condition, i.e. Cond = true. Timely reactions can be achieved by guaranteeing fast response times for checking the precondition of a reaction rule. This will be the case, for instance, if the precondition can be checked by simple table look-up such as in relational databases or factbases. Reaction rules are triggered by events. The agent interpreter continuously checks the event queue of the agent. If there is a new event message, it is matched with the event condition of all reaction rules, and the epistemic conditions of those rules matching the event are evaluated. If they are satisfiable in the current knowledge base, all free variables in the rules are instantiated accordingly, resulting in a set of triggered actions with associated mental effects. All these actions are then executed, leading to physical actions and to sending messages to other agents, and their mental effects are assimilated into the current knowledge base.
2.4 Specification of KP Agents
A KP agent A = ⟨X, EQ, RR⟩ consists of a knowledge base X associated with a (vivid) knowledge system, an event queue EQ, and a set RR of reaction rules. We assume that RR is a consistent encoding of reactive behaviour in the sense that whenever a set of reaction rules is triggered by an event, the resulting actions are compatible with each other, i.e. the mental effects (or post-conditions) associated with them do not cancel out (are consistent with) each other. We now briefly review the execution semantics of KP agents, which is based on a perception-reaction-cycle.
2.5 The Execution Semantics of KP Agents
The execution semantics of KP agents is based on a perception-reaction-cycle, which continuously carries out the following steps:
(1) It fetches a message/event from the event queue EQ,
(2) selects the reaction rules triggered by that message/event,
(3) refines this selection to those rules whose conditions are satisfied with respect to the current state of the knowledge base X,
(4) and finally carries out the actions specified in the selected reaction rules (sending messages and/or executing other actions the agent is capable of) and updates the knowledge base X by assimilating all the effects/postconditions specified in the rules along with the actions.
This perception-reaction-cycle can be formalized by a high-level transition system with two basic transition types: perception and reaction (see [28,25]).
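
One pass through the four steps above can be sketched in Python as follows. This is a simplified illustration under several assumptions: events are triples of event type, payload and sender; the knowledge base is a plain set of ground facts; and rules carry Python callables instead of logical conditions with free variables. The names `ReactionRule` and `step` are hypothetical and do not come from the formal model in [28,25].

```python
from collections import deque
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class ReactionRule:
    event_type: str                            # event condition: type of event to match
    condition: Callable[[set, tuple], bool]    # Cond, evaluated against the current KB
    effect: Callable[[tuple], set]             # Eff, facts to assimilate into the KB
    action: Optional[Callable[[tuple], tuple]] = None  # None for a purely mental rule


def step(kb, eq, rules):
    """Run one perception-reaction cycle; return the actions/messages produced."""
    if not eq:
        return []
    event_type, payload, _sender = eq.popleft()                      # (1) fetch
    triggered = [r for r in rules if r.event_type == event_type]     # (2) select
    fired = [r for r in triggered if r.condition(kb, payload)]       # (3) refine
    outbox = [r.action(payload) for r in fired if r.action is not None]  # (4) act
    for rule in fired:
        kb.update(rule.effect(payload))                              # (4) assimilate effects
    return outbox
```

All conditions are evaluated before any effect is assimilated, so the rules fired by one event see the same knowledge base state.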
3 Implementing a KP Agent Example with Mandarax
We now describe Mandarax [10], a Java rule platform, which we use to implement the simplest case of KP agents where beliefs are atomic sentences.
3.1 The Mandarax Rule Mark-Up Language XKB
In principle, a KP agent may be specified using
• a knowledge base schema definition language (such as RDFS) for specifying the schema of the agent’s mental state;
• a suitable knowledge representation language (such as RDF) for specifying its factual (extensional) knowledge;
• a suitable language for integrity constraints (e.g., RuleML) to exclude non-admissible mental states;
• a suitable language for derivation rules (e.g., RuleML) for specifying terminological and heuristic (intensional) knowledge;
• a suitable language for expressing reaction rules (e.g., RuleML) for specifying the agent’s behaviour in response to communication and environment events.
Mandarax provides a rule mark-up language, called XKB, which allows rules and facts that may refer to Java objects to be expressed in a way that is compatible with RuleML.

There are some fundamental issues to be considered when using XML to represent knowledge in an object-oriented format. Since arbitrary objects are considered as (constant) terms, and these objects typically have complex/object-valued attributes, any XML representation must be able to serialize/de-serialize arbitrary objects. The easiest approach is to use reflection, which is supported by many OO languages, such as Java and Smalltalk. The Mandarax XKB drivers use this approach, and the latest Java version (JDK 1.4) contains utility classes for this kind of object persistency. However, this approach is not appropriate for representing or storing binary large objects such as images. On the other hand, these objects are of particular interest for many applications.
For instance, it is possible to express rules like the following ones in Mandarax: “If an image is black and white, then send it to the LaserPrinter 1. If an image has more than 256 colors and is larger than A4, then send it to the ColorPrinter 2.” In these rules, the image would be an image object (an instance of a Java class representing image information such as ImageIcon).

The XKB format has not been designed as a general-purpose knowledge specification language, but rather as a format for serializing Mandarax knowledge bases in order to support persistency and communication services. Therefore, XKB has a number of special features:
(1) In XKB, individual constants, variables, functions and predicates are typed. Types are Java classes including interfaces and wrapped primitives. These types are only referenced by name. It is the responsibility of the application to resolve these references. In practical terms, this means that the respective classes must be in the classpath before a knowledge base is loaded from an XKB source. This task is facilitated by the fact that Java classes use a URI-like (uniform resource identifier) naming scheme.
(2) XKB supports clause sets, a language construct that allows external facts and rules, such as query results from a remote database, to be included.
(3) XKB can (de-)serialize arbitrary Java objects based on Java reflection.
(4) XKB also (de-)serializes Java methods.
(5) XKB is not a fixed format but a framework based on atomic mappings (adapters) associating objects in knowledge bases (terms, rules, etc.) and XML elements (using their JDOM object representation). This is to make XKB as open and flexible as possible. Some common configurations of adapters are frozen and versioned as XKB drivers.

Figure 2 shows a fragment from an XKB document specifying the rule DR1 from Example 1. The XKB syntax uses XML tags such as imp, body, head and others from RuleML [3] for marking up a rule and its constituents.
In contrast to RuleML 0.8, individual constant and variable elements (tagged with ind, resp. var) have a type subelement, which has an attribute class name referring to a Java class, like in lines 8–10 and 22–24 of Figure 2. Also, in Mandarax, objects are considered as individual constants. An object tag is used for specifying an object identifier by means of an id attribute, and a properties subelement is used for including the properties of the object. When objects are imported from an XKB file, Mandarax employs a cache to ensure that exactly one object is associated with all obj elements having the same ID. The cache is built upon a hash table that associates IDs with objects. This approach is popular in middleware systems linking object-oriented applications and relational databases.
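
The idea of an ID-keyed import cache can be sketched generically as follows. The sketch is written in Python for brevity and does not reproduce Mandarax's actual (Java) classes; the name `ObjectCache`, its `resolve` method and the `factory` callback are hypothetical.

```python
class ObjectCache:
    """Hash table from object IDs to objects, so that each ID maps to exactly one instance."""

    def __init__(self, factory):
        self._by_id = {}         # hash table: ID -> object
        self._factory = factory  # builds an object from its serialized properties

    def resolve(self, obj_id, properties):
        """Return the object for obj_id, creating it only on first encounter."""
        if obj_id not in self._by_id:
            self._by_id[obj_id] = self._factory(properties)
        return self._by_id[obj_id]
```

Two elements carrying the same ID then resolve to the identical object, so a later occurrence never creates a second copy; in this sketch the properties of later occurrences are simply ignored.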