IAWTIC 2001 Proceedings/ISBN 0858898489: M. Mohammadian (Ed.)

Bayesian Agencies on the Internet

Anet Potgieter and Judith Bishop
Department of Computer Science, University of Pretoria, Pretoria, 0002, South Africa
E-mail: [email protected], [email protected]

Abstract

Agencies consisting of simple agents can be deployed on the Internet to solve problems that are too complex for single autonomous Internet agents. We deploy Bayesian agencies on the Internet to achieve emergent belief propagation through the collective behavior of simple agents organized into heterarchies of agencies. Collectively the agents use probabilistic reasoning to deal with the uncertainty and non-determinism that they will encounter on the Internet. The simplicity of the agents as well as the minimal interaction between them allows us to use a commercially available application framework, namely Enterprise JavaBeans, to develop and deploy our Bayesian agencies on the Internet.

1. Introduction

The Internet is characterized by its uncertainty and non-determinism. Issues that must be addressed include uncertainty, asynchronicity and locality. Agencies consisting of simple agents can be deployed on the Internet to solve problems that are too complex for single autonomous Internet agents. Our research focuses on the development of a bottom-up coordination model for agents in open environments.

Section 2 defines the underlying technologies that we use to address uncertainty, asynchronicity and locality. We define agents, agencies, heterarchies, intelligence and artificial life. We further describe Bayesian belief propagation and Bayesian learning algorithms. These algorithms are very complex and difficult to distribute on the Internet, but are commonly used to reason in the presence of uncertainty. In Section 3, we discuss the software engineering approach followed in this research. We use the coordination model approach that evolved from object-oriented and component-based software engineering for societies of agents in open environments. In Section 4, we describe the coordination model that we developed for the agencies that we deploy on the Internet. These agents are organized into heterarchies of agencies, and they handle uncertainty by collectively propagating belief in the presence of evidence from their environment. As our agents are simple and the communication between them minimal, we deploy them on the Internet using Enterprise JavaBeans. In Section 5 we discuss applications, and we give our conclusion and describe future research in Section 6.

2. Distributing intelligence on the Internet

2.1 Agents, agencies and heterarchies

Jennings, Sycara and Wooldridge [1] define an agent as follows: “An agent is a computer system situated in some environment that is capable of flexible, autonomous action in order to meet its design objectives.” Brooks [2] and Maes [3] refer to the independent components of intelligence within a single creature as “agents”. Maes [3] implements “competence modules” as simple mindless entities organized into single autonomous agents using Behavior Nets.

In “Society of Mind” [4], Marvin Minsky describes the concepts of agents, agencies and heterarchies. An agency consists of a society of simple agents. The intelligence lies in the behavior of agencies and not in the individual agents. If structured into a heterarchy, the agencies can collectively solve complicated problems. We exploit this implicit power of heterarchies in our research.

Our work is based on Minsky’s heterarchies of agencies [4]. We do not mix the concepts of agents and agencies. Our agents are simple, and together form agencies, organized into heterarchies. Collectively the agents can solve problems that are too complex for single autonomous agents. We define the relationship between agents, agencies and heterarchies as follows: an agency consists of a society of agents that inhabit some complex dynamic environment, where the agents collectively sense and act in this environment so that the agency accomplishes what its composite agents set out to accomplish by interacting with each other. If agents in a society belong to more than one agency, the set of “overlapping” agencies forms a heterarchy. Collective intelligence of a heterarchy emerges through the interaction of the agents within the agencies in the heterarchy.

2.2 Intelligence

Brooks [2] characterizes intelligence in terms of situatedness, embodiment, intelligence and emergence. Our agents are situated in the Internet.
Their actions are part of a dynamic interaction with the Internet environment (embodiment). Our agencies are observed to be intelligent (intelligence). The intelligence of our agencies emerges from the interaction between the “mindless” agents and the Internet environment, and between themselves and other agencies (emergence).

Traditional artificial intelligence attempts to model the world, and then reason about it using complex control strategies. The control strategies can either be centralized or distributed. The traditional distributed control strategies employ computation units that are very complex and have a high communication overhead. Brooks [2] argues that the world is its own best model, and that it cannot be modeled successfully. An agent must be situated in the world and must react to inputs from the world, rather than attempt to model the world internally. Just as agents can be situated in physical environments, agents can be situated in the Internet, where they must be able to handle the dynamics and uncertainty of this environment.

2.3 Artificial life

On the Artificial Life website (http://alife.org/index.php?page=alife&context=alife), artificial life is described as follows: “Artificial Life is often described as attempting to understand high-level behavior from low-level rules; for example, how the simple rules of Darwinian evolution lead to high-level structure, or the way in which the simple interactions between ants and their environment lead to complex trail-following behavior. Understanding this relationship in particular systems promises to provide novel solutions to complex real-world problems, such as disease prevention, stock-market prediction, and data-mining on the Internet.“

Examples of artificial life research include Aintz (http://www.daimi.au.dk/~kroll/alife/html/), in which the emergent properties of a model of ant foraging are studied. Dorigo, Di Caro and Gambardella [5] developed algorithms based on an ant colony for collective optimization. Following this new trend, we developed a model for collective belief propagation in Bayesian networks.

Ants are capable of finding a shortest path from a food source to the nest given a set of external constraints. The variables of interest in their problem domain are the different paths between the different food sources and the nest, as well as the amount of pheromone on each path. The constraints are obstacles that they might encounter on the way. Ants “learn” and maintain the shortest path between the nest and a food source by using pheromone trails. Ants deposit a certain amount of pheromone while walking, and each ant probabilistically prefers a path rich in pheromone to a poorer one [5]. The probability that an ant chooses a particular path can be expressed as the conditional probability of that choice, given the amount of pheromone on the path.
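This pheromone-conditioned choice can be sketched as a simple proportional rule: an ant picks a path with probability proportional to that path's pheromone level. This is a deliberately simplified version of the ant-algorithm transition rule of Dorigo et al. [5] (which also weights pheromone and heuristic information by exponents); the path names and levels below are illustrative.

```python
import random

def choose_path(pheromone):
    """Pick a path with probability proportional to its pheromone level."""
    total = sum(pheromone.values())
    r = random.uniform(0, total)
    cumulative = 0.0
    for path, level in pheromone.items():
        cumulative += level
        if r <= cumulative:
            return path
    return path  # fallback for floating-point rounding

# Two paths between nest and food; the shorter one has accumulated more
# pheromone, so P(short) = 3/4 and P(long) = 1/4 under this rule.
pheromone = {"short": 3.0, "long": 1.0}
counts = {"short": 0, "long": 0}
random.seed(1)
for _ in range(10000):
    counts[choose_path(pheromone)] += 1
```

Over many trials the colony-level preference for the pheromone-rich (shorter) path emerges from purely local, probabilistic decisions, which is exactly the kind of emergence the Bayesian agencies exploit.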
Only through collective behavior do ants achieve their global goal, namely to gather food using the shortest paths between the food sources and the nest. We deploy simple agents on the Internet that use simple interactions to collectively distribute and maintain a Bayesian network. Only through collective behavior do our agents evaluate the joint probability distribution of the Bayesian network distributed between them, in response to evidence received from the environment.

2.4 Bayesian networks

Bayesian networks provide a powerful technology for probabilistic reasoning and statistical inference that can be used to reason in uncertain environments such as the Internet. A Bayesian network (BN) represents a set of random variables in an application domain. The nodes represent variables of interest, and the links represent informational or causal dependencies among the variables. The dependencies are given in terms of the conditional probabilities of the states that a node can have, given the values of its parent nodes. Each probability reflects a degree of belief rather than a frequency of occurrence.

Figure 1 illustrates a Bayesian network that models the relationships between customer behavior, vendor behavior, perceived security and electronic exchanges on the Internet. In this simple model, an electronic exchange (N1) is related to the level of trust in a vendor (N2), the perceived security of the transaction (N3) and a customer profile (N4). Customers are profiled only by gender (N11) and income (N12). Security is described in terms of the degree of concern caused by credit card usage (N10) or third party account usage (N9). Trust in a vendor depends on its reliability (N8), convenience (N7), competitiveness (N6) and ease of access to descriptive product information (N5). The degree of belief in the absence of evidence is indicated on each of the nodes below.

Figure 1. Electronic Exchange Bayesian Network

Dechter [6] uses the following notation: Let X = {X1, …, Xn} be a set of random variables over multivalued domains D1, …, Dn. A BN is a pair (G, P), where G is a directed acyclic graph and P = {Pi}, with Pi the conditional probability matrix associated with Xi, Pi = {P(Xi | pa(Xi))}. An assignment (X1 = x1, …, Xn = xn) can be abbreviated to x = (x1, …, xn). The BN represents a global joint probability distribution over X having the product form

P(x1, …, xn) = ∏_{i=1}^{n} P(xi | x_pa(Xi))    (1)

The example Bayesian network represents the joint probability

P(N1, N2, …, N12) = P(N1) P(N2 | N1) P(N3 | N1) P(N4 | N1) P(N5 | N2) P(N6 | N2) P(N7 | N2) P(N8 | N2) P(N9 | N3) P(N10 | N3) P(N11 | N4) P(N12 | N4)    (2)
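The factorization in (1) and (2) can be made concrete with a tiny fragment of such a network. The sketch below uses a three-node structure N1 → N2, N1 → N3 with hypothetical conditional probability tables; the numbers are illustrative, not those of Figure 1.

```python
# Hypothetical CPTs for a fragment N1 -> N2, N1 -> N3 of the example network.
p_n1 = {True: 0.3, False: 0.7}                 # P(N1)
p_n2 = {True: {True: 0.8, False: 0.2},         # P(N2 | N1)
        False: {True: 0.4, False: 0.6}}
p_n3 = {True: {True: 0.9, False: 0.1},         # P(N3 | N1)
        False: {True: 0.5, False: 0.5}}

def joint(n1, n2, n3):
    """Joint probability as the product of local conditionals, as in (1)."""
    return p_n1[n1] * p_n2[n1][n2] * p_n3[n1][n3]

# Sanity check: the factorization must sum to 1 over all assignments.
total = sum(joint(a, b, c)
            for a in (True, False) for b in (True, False) for c in (True, False))
```

Each factor uses only a node's own CPT and its parents' values, which is the locality property that the paper argues makes Bayesian networks well suited to distribution across agents.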

From (1) and (2) it can be seen that the global distribution is described in terms of local distributions. Each node’s local information consists of the arcs entering and leaving it, together with its local conditional distribution. The localized nature of Bayesian networks makes this technology ideal for distributed implementation.

2.4.1 Belief propagation

Belief propagation is the process of finding the most probable explanation (MPE) in the presence of evidence e. Dechter [6] defines the MPE as the maximum assignment x in

max_x P(x) = max_x ∏_{i=1}^{n} P(xi | x_pa(Xi), e)    (3)

Belief propagation is NP-hard [6]. A number of algorithms have been proposed. Judea Pearl developed a message-passing algorithm for singly connected networks (http://www.cs.cmu.edu/~mkant/Public/AI/util/areas/reasonng/probabl/bayes/bayes.aug). This algorithm was extended to general multi-connected networks by different researchers. Three main approaches exist, namely Judea Pearl’s cycle-cutset approach (also referred to as conditioning), the tree-clustering approach and the elimination approach. The cycle-cutset and tree-clustering approaches work well only for sparse networks with small cycle-cutsets or clusters [6]. Elimination is based on non-serial dynamic programming algorithms [6], which suffer from exponential space and time requirements. Dechter [6] combined elimination and conditioning in order to address the problems associated with dynamic programming.

The belief propagation algorithms for general multi-connected networks generally have two phases of execution. In the first phase, a secondary tree is constructed. This can, for example, be a “good” cycle-cutset used during conditioning [7] or an optimal junction tree used by tree-clustering algorithms [8]. In the second phase, the secondary tree structure is used for inference. Becker et al. argue that finding a “good” cycle-cutset is NP-complete [7]; finding an optimal junction tree is also NP-complete [9].
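To see what the maximization in (3) computes, the MPE of a small network can be found by brute-force enumeration. This exhaustive search is exponential in the number of variables, which is precisely why the message-passing and secondary-tree algorithms above exist; the network A → B, A → C and its CPT values below are hypothetical.

```python
from itertools import product

# Hypothetical CPTs: each entry maps a node's value (given its parents in the
# full assignment x) to a probability.
cpts = {
    "A": lambda a, x: 0.3 if a else 0.7,
    "B": lambda b, x: (0.8 if b else 0.2) if x["A"] else (0.4 if b else 0.6),
    "C": lambda c, x: (0.9 if c else 0.1) if x["A"] else (0.5 if c else 0.5),
}

def mpe(evidence):
    """Maximize the product of local conditionals over all assignments
    consistent with the evidence, as in (3) -- by exhaustive enumeration."""
    best, best_p = None, -1.0
    for values in product((True, False), repeat=3):
        x = dict(zip("ABC", values))
        if any(x[v] != val for v, val in evidence.items()):
            continue
        p = 1.0
        for v, cpt in cpts.items():
            p *= cpt(x[v], x)
        if p > best_p:
            best, best_p = x, p
    return best, best_p
```

With evidence C = true, the enumeration returns the single assignment that maximizes the product of the local factors; belief propagation reaches the same answer by exchanging local messages instead of enumerating the joint space.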
A number of approximation algorithms have been developed to find the secondary trees, as in [7,9].

Figure 2 illustrates the results of belief propagation in the presence of evidence. The evidence nodes are circled. In the presence of a male customer with a high income, who uses a third party account rather than a credit card and who views the vendor as convenient, the probability that an electronic transfer will be done rises from 0.3 to 0.69.

Figure 2. Belief Propagation in the presence of evidence

Our agents distribute a single- or multi-connected Bayesian network amongst themselves and perform collective belief propagation based on an extension of Judea Pearl’s message-passing algorithm.

2.4.2 Bayesian learning

Bayesian learning can be viewed as finding the local maxima on the likelihood surface defined by the Bayesian network variables [10]. Assume that the network must be trained from D, a set of data cases D1, …, Dm generated independently from some underlying distribution. In each data case, values are given for some subset of the variables; this subset may differ from case to case, so the data can be incomplete. Russell et al. [10] describe the learning task as the calculation of the parameters ω of the conditional probability matrices that best model the data. If the assumption is made that each setting of ω is equally likely a priori, the maximum likelihood model is appropriate. This means that Pω(D) must be maximized: the probability assigned by the network to the observed data, when the parameters of the conditional probability matrices are set to ω, must be maximized.
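For complete data, the maximum-likelihood setting of a node's conditional probabilities reduces to relative-frequency counting, which the sketch below illustrates for a single hypothetical Parent → Child link; with incomplete data, the gradient-descent or EM procedures described by Russell et al. [10] are needed instead.

```python
from collections import Counter

# Hypothetical complete data cases for one Parent -> Child dependency.
cases = [
    {"Parent": True, "Child": True},
    {"Parent": True, "Child": True},
    {"Parent": True, "Child": False},
    {"Parent": False, "Child": True},
    {"Parent": False, "Child": False},
    {"Parent": False, "Child": False},
]

pair_counts = Counter((c["Parent"], c["Child"]) for c in cases)
parent_counts = Counter(c["Parent"] for c in cases)

def p_child_given_parent(child, parent):
    """ML estimate for complete data: the relative frequency of (parent, child)
    among cases with that parent value."""
    return pair_counts[(parent, child)] / parent_counts[parent]
```

For the six cases above, P(Child = true | Parent = true) comes out as 2/3, the frequency that maximizes the likelihood of the observed data for that CPT entry.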

Russell et al. [10] describe a local gradient-descent learning algorithm. Another widely used learning algorithm is the EM algorithm. It is also possible to learn the structure of a Bayesian network from data [11]. Agents can collectively self-organize into a heterarchy of agencies representing a Bayesian network. The structure of this network can either be known, or learnt from data. Our current implementation does not include learning. Learning will, however, form a very important part of our future research.

3. Coordination models

Agent-oriented software engineering provides a new software engineering approach to the analysis, design and implementation of complex software systems. Using this approach, a complex system is decomposed into multiple interacting autonomous agents. Agent-oriented software engineering extends existing paradigms such as the object-oriented and component-based software engineering approaches [12].

Zambonelli [13] claims that most current agent-oriented methodologies are ill-suited to open environments, as they do not adequately address agencies. Zambonelli developed a coordination model for software engineering suited to open environments, which makes provision for global laws that agents in an agency must obey when interacting with other agents. Zambonelli’s model consists of three elements, namely:

· the coordinables: entities whose mutual interaction is ruled by the model (the agents);
· the coordination media: the abstractions enabling agent interactions, as well as the core around which components are organized; examples include semaphores, monitors and blackboards;
· the coordination laws: the behavior of the coordination media in response to interaction events.

We developed a coordination model for collective belief propagation. As our agents are simple, and the interaction patterns between them are pre-determined, we use an existing application framework, namely Enterprise JavaBeans, to develop and deploy our agents. We refer to our agents as Bayesian agents, and to our agencies as Bayesian agencies. In the next section the coordination model elements are applied to the Bayesian agencies.

4. A coordination model for Bayesian agencies

4.1 The coordinables

We use a heterarchy of Bayesian agencies to implement a Bayesian network. Our Bayesian agents implement the nodes in this Bayesian network. We implement our agents as message beans. Each agent has local information that defines the incoming and outgoing arcs of its node, and it maintains the local probability distribution for that node using an entity bean. Belief propagation is transformed into the collective behaviors of agents organized into agencies according to the interdependencies between the nodes in the underlying Bayesian network. Each agency implements a subtree of the underlying Bayesian network. The rules for identifying the subtrees for the agencies in a heterarchy are as follows:

· any node in the Bayesian network must be implemented by at least one agent;
· all subtrees must be singly-connected;
· a subtree cannot contain only some of the children of a node: children of a node can be included in a subtree only if all the children of that node are included.

The identification of subtrees is done manually in our current implementation. Figure 3 illustrates a possible configuration of subtrees, agents and agencies for the Bayesian network in figures 1 and 2. Each subtree below consists of a parent node and its children nodes.

Figure 3. Bayesian Agencies implementing the Electronic Exchange Bayesian Network

An optimal grouping of agents into agencies results in a heterarchy consisting of the minimum number of agencies that satisfies all of the rules for identifying the subtrees for the agencies. A single agency would optimally group the agents in figure 3.
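Two of the subtree rules (full-coverage and all-children-or-none) lend themselves to a simple mechanical check. The sketch below validates a proposed grouping against them, with the network given as a child-adjacency dictionary and each agency as a set of node names; the names follow Figure 1, but the grouping itself is an illustration, not the paper's implementation, and the single-connectedness rule would need an additional cycle check that is omitted here.

```python
# Child-adjacency structure of the example network (Figure 1).
children = {
    "N1": {"N2", "N3", "N4"},
    "N2": {"N5", "N6", "N7", "N8"},
    "N3": {"N9", "N10"},
    "N4": {"N11", "N12"},
}

def valid_grouping(children, subtrees):
    """Check the coverage and all-children-or-none subtree rules.
    (Single-connectedness of each subtree is not checked here.)"""
    nodes = set(children) | {c for kids in children.values() for c in kids}
    # Rule 1: every node must be implemented by at least one agent.
    if nodes - set().union(*subtrees):
        return False
    # Rule 3: a subtree may include children of a node only if it includes
    # all of that node's children.
    for tree in subtrees:
        for node, kids in children.items():
            if node in tree and tree & kids and not kids <= tree:
                return False
    return True

# One agency per parent-plus-children subtree, as in Figure 3.
good = [{"N1", "N2", "N3", "N4"},
        {"N2", "N5", "N6", "N7", "N8"},
        {"N3", "N9", "N10"},
        {"N4", "N11", "N12"}]
bad = [{"N1", "N2"}]  # omits some children of N1 and leaves nodes uncovered
```

A self-organizing implementation, as proposed in the future-research section, would search over such groupings for the smallest valid heterarchy rather than validating a hand-built one.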

4.2 The coordination media

Bayesian agents can only propagate beliefs between themselves and their direct neighbors, along links defined by the structure of the underlying Bayesian network. Communication between agents is handled within the EJB application framework.

4.3 The coordination laws

The coordination laws are defined by collective belief propagation in response to evidence gathered from the environment. Belief propagation is localized within the agencies. The agents in each agency of a heterarchy cooperate to find the local MPE of the subtree of that agency in the presence of evidence received from the environment by its agents. The agents within an agency communicate λ and π messages amongst themselves, as in Judea Pearl’s message-passing algorithm. An agent that belongs to more than one agency must contribute to the local MPEs of all the agencies it belongs to in such a way that the maximum local MPE of each agency is found. Collectively, the agents in a heterarchy of agencies maximize the global MPE of the underlying Bayesian network. If the configuration of subtrees, agents and agencies is optimal, the global MPE will emerge faster than in a configuration that is not optimal.

5. Applications

Heterarchies of Bayesian agencies can be distributed on the Internet to perform complex intelligent tasks. Agents in a heterarchy of Bayesian agencies can, for example, collaborate to analyze profiles of market participants and alert them to opportunities, such as swapping the inventory of one supplier that is low on a particular product with that of another supplier holding a surplus. Agents in a heterarchy can collectively analyze usage patterns to predict future shortages from the shelf lives of products, so that timely purchases can be advised. Agents can also collectively analyze supplier and vendor behavior patterns in supply chains extended onto the Internet.
Other applications include the collective data mining of distributed databases on the Internet, or the collective mining of knowledge or data “trapped” within virtual enterprises on the Internet.

6. Conclusion and future research

We have presented a coordination model based on artificial life principles for emergent belief propagation between simple agents in a heterarchy of agencies that collectively implement a Bayesian network on the Internet. We want to extend the coordination model to handle incremental emergent learning from evidence received from the environment. Further research will include the self-organization of agents into agencies. The agents must be able to organize themselves into an optimal heterarchy consisting of the minimum number of agencies that satisfy all of the rules for constructing a heterarchy of agencies for a Bayesian network.

References

[1] N. R. Jennings, S. Sycara & M. Wooldridge, A Roadmap of Agent Research and Development, International Journal on Autonomous Agents and Multi-Agent Systems, Vol. 1, Nr. 1, 7-38, 1998. Retrieved July 11, 2000, from the World Wide Web: http://www.ecs.soton.ac.uk/~nrj/pubs.html#1998

[2] R. A. Brooks, Intelligence without Reason, MIT AI Memo 1293, April 1991. Retrieved July 18, 2000, from the World Wide Web: http://www.ai.mit.edu/people/brooks/papers.html

[3] P. Maes, How to Do the Right Thing, MIT AI Memo 1180, 1989. Retrieved July 18, 2000, from the World Wide Web: http://agents.www.media.mit.edu/groups/agents/publications/

[4] M. Minsky, The Society of Mind, 1st Touchstone ed., Simon and Schuster Inc., New York, 1988.

[5] M. Dorigo, G. Di Caro & L. M. Gambardella, Ant Algorithms for Discrete Optimization, Artificial Life, Vol. 5, Nr. 2, 137-172, 1999. Retrieved May 5, 2001, from the World Wide Web: http://iridia.ulb.ac.be/~mdorigo/ACO/TSP

[6] R. Dechter, Bucket Elimination: A Unifying Framework for Probabilistic Inference, Proceedings of the Twelfth Conference on Uncertainty in Artificial Intelligence (UAI’96), 211-219, 1996. Retrieved October 8, 2000, from the World Wide Web: http://www.ics.uci.edu/~dechter/publications/

[7] A. Becker, R. Bar-Yehuda & D. Geiger, Randomized Algorithms for the Loop Cutset Problem, Journal of Artificial Intelligence Research, Vol. 12, 219-234, 2000. Retrieved March 7, 2001, from the World Wide Web: http://www.cs.washington.edu/research/jair/abstracts/becker00a.html

[8] F. Jensen, F. V. Jensen & S. L. Dittmer, From Influence Diagrams to Junction Trees, Proceedings of the Tenth Conference on Uncertainty in Artificial Intelligence, R. L. Mantaras & D. Poole (Ed.), Morgan Kaufmann Publishers, San Francisco, 367-373, 1994. Retrieved February 13, 2001, from the World Wide Web: http://www.cs.auc.dk/research/DSS/abstracts/jensen:jensen:dittmer:94.html

[9] A. Becker & D. Geiger, A Sufficiently Fast Algorithm for Finding Close to Optimal Junction Trees, Proceedings of the Twelfth Conference on Uncertainty in Artificial Intelligence, Morgan Kaufmann, 81-89, 1996. Retrieved March 8, 2001, from the World Wide Web: http://www.cs.technion.ac.il/~dang/

[10] S. J. Russell, J. Binder, D. Koller & K. Kanazawa, Local Learning in Probabilistic Networks with Hidden Variables, Proceedings of the Fourteenth International Joint Conference on Artificial Intelligence, 1146-1152, August 1995. Retrieved September 19, 2000, from the World Wide Web: http://robotics.stanford.edu/~koller/papers/apnijcai.html

[11] J. Pearl & S. Russell, Bayesian Networks, Technical Report R-277, UCLA Cognitive Systems Laboratory, November 2000. Retrieved May 5, 2001, from the World Wide Web: http://bayes.cs.ucla.edu/csl_papers.html

[12] N. R. Jennings & M. Wooldridge, Agent-Oriented Software Engineering, Handbook of Agent Technology, J. Bradshaw (Ed.), AAAI/MIT Press, 2000. Retrieved January 23, 2001, from the World Wide Web: http://www.elec.qmw.ac.uk/dai/pubs/#2000

[13] F. Zambonelli, N. R. Jennings, A. Omicini & M. Wooldridge, Agent-Oriented Software Engineering for Internet Applications, Coordination of Internet Agents, A. Omicini, F. Zambonelli, M. Klusch & R. Tolksdorf (Ed.), Springer Verlag, 2000. Retrieved January 23, 2001, from the World Wide Web: http://www.csc.liv.ac.uk/~mjw/pubs/
