Int. J. Agent-Oriented Software Engineering, Vol. x, No. x, xxxx
An Architecture for Exception Management in Multi-Agent Systems

Eric Platon1,2,∗, Nicolas Sabouret2, and Shinichi Honiden1

1 National Institute of Informatics, Sokendai, 2-1-2 Hitotsubashi, Chiyoda-ku, 101-8430 Tokyo, Japan
2 Laboratoire d’informatique de Paris 6, 104, Avenue du Président Kennedy, 75016 Paris, France
E-mail: {platon, honiden}@nii.ac.jp, [email protected]
∗ Corresponding author.

Abstract: Multi-agent systems (MAS) are open, heterogeneous, and distributed software systems of autonomous agents. The management of exceptions in MAS differs from what is known in usual engineering approaches, owing to specific situations to handle, such as agent death, knowledge inconsistencies, or collaborative handling. Existing work does not fully address the properties of MAS, notably agent autonomy, and the mechanisms related to exceptions are often ad hoc. In this article, we define the concept of agent exception so as to satisfy the characteristics of the agent paradigm, and we propose a MAS architecture to support the design and development of agent systems with exception management facilities. This architecture provides designers with an exception mechanism integrated into usual agent models, so that the work left to the designer is the definition of application-dependent handlers, which the architecture invokes automatically when required.

Keywords: Exception management; Multi-agent systems; Autonomy; engineering.
Reference to this paper should be made as follows: Platon, E., Sabouret, N., and Honiden, S. (xxxx) ‘An Architecture for Exception Management in Multi-Agent Systems’, Int. J. Agent-Oriented Software Engineering, Vol. x, No. x, pp.xxx–xxx.
1 Preliminary
Exception handling techniques were developed in the 1970s to increase the reliability of software without hampering the ease of programming. In the era of procedural languages and the advent of object-oriented programming, the term ‘exception’ has acquired a specialised meaning, tightly attached to high-level programming paradigms, as illustrated by the definition of Goodenough.
Copyright © 2006 xxx
Of the conditions detected while attempting to perform some operation, exception conditions are those brought to the attention of the operation’s invoker. The invoker is then permitted (or required) to respond to the condition [12, 13, 11].
This definition and the subsequent lineage of exception handling mechanisms are operation-centric approaches [30]. When an operation is invoked, e.g. by a method of an object, conditions are checked before the actual execution to validate the invocation context. Typical conditions are the correctness of the types and values of the operation input parameters. If a condition is not met, the operation is not executed and an exception is signalled to the invoker to initiate appropriate handling mechanisms. This description covers modern models such as those of Java/C++ and Eiffel [27, 45, 14].

In the case of agent systems, the usual definition of exception applies, as agents are software, but it misses characteristics of the agent concept that stand at a higher level. Traditional definitions state that an exception is entirely determined when conditions are violated in the invocation context of an operation. Agents are however free to evaluate whether the result of invoking an operation is ‘normal’ or ‘exceptional’, owing to the autonomy assumption and to the loose coupling among agents and resources. Agents can also evaluate differently the type of exception they encounter, depending on their individual contexts and inner mechanisms. Lastly, the usual definition of exceptions is mostly implemented as specific constructs in a programming language, whereas it is not clear whether a language construct is appropriate to address the agent case, owing to autonomy, openness, heterogeneity, and systemic effects [22]. These characteristics lead us to consider an additional ‘agent-centric’ approach that is orthogonal to the notion of exception in programming languages and akin to architectural considerations, as can be observed in related work on exceptions [18, 3].

1.1 Agent exception
We define an agent exception with regard to the characteristics of openness, heterogeneity, and agent autonomy of MAS [33].

An agent exception is the evaluation by the agent of a perceived event as unexpected.

The source of the exception is the essential difference from traditional definitions. The source is the agent that decides an event is exceptional, instead of the agent merely receiving an event that has been deemed exceptional by an external entity. Owing to autonomy, agents can evaluate any percept and thus choose to engage either exceptional or normal execution code.

An event should be understood in the broad sense of any observable action or state in the system. For example, events are the sending and reception of ACL messages, or the perception of artificial pheromones in stigmergic systems.

Exceptions are qualified as unexpected events. By unexpected, we mean that the agent does not anticipate the arrival of the event in the current execution context (time, resources, value of parameters, etc.). In other words, the agent is not ‘ready’ or is unable to process the event when it occurs. The unexpected character of an event then depends on the kind of agent that evaluates it, and two different agents may react differently to the same event.

This definition is compatible with openness and agent autonomy in MAS, as it is built on a loosely coupled model of MAS in which agents can autonomously interpret the same event in different ways. This definition provides the basis of what an agent exception is; it is not concerned with the social interdependency overlays that modulate individual interpretations. Typically, a power relationship or a reputation model can lead autonomous agents to consider an event as exceptional because they were told to do so by a superior or a trusted party. These modulations are optional capabilities that can influence the choice of agents, but they remain distinct matters. The remainder of this text focuses on the essential characteristics of the definition and leaves the study of the modulations for future work.

1.2 Case study
The case study is a simple agent-based simulation where agents sell and buy items following the contract net protocol (CNet) [43, 10]. Agents play a single role, either retailer or consumer. We assume that the target system shall be FIPA-compliant, i.e. the system features the infrastructure recommended by FIPA (directory facilitation, agent management, etc.).

Fundamental architecture, algorithm, and protocol of the case. Fig. 1 presents on the left the architecture of the agents and a usual execution algorithm, based on the standard ‘sense-process-act’ models in the agent community [4, 39]. The right part of the figure shows a version of the CNet from FIPA. The protocol is slightly simplified compared to the standard version to save space, but the main characteristics are preserved. The notation for the protocol follows the FIPA recommendation, except for the ‘*’ (star) symbol, which is introduced to represent zero or more elements. Therefore, the CNet accepts an arbitrary number of retailers but only one consumer, and messages can be multiple (multicasting).

The agents receive percepts from others through the environment (input parameter) with the sense functionality, which corresponds to the sensor component of the architecture. They process percepts to produce actions, where the process is represented by the agent internal mechanisms of the architecture, and the knowledge for processing is in the internal representation. Agents eventually act to apply an action in the environment with the actuator component.

The two types of agents in the case study differ in their process functionalities in order to fulfil their respective roles in the CNet protocol. Retailers process messages to decide prices and sales, and to produce ordered items. These processes output reactions to messages received from consumers (right-hand life-line). On the other hand, consumers produce calls for proposals (CFP) and decide the winner of the call.
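The sense-process-act cycle and the two roles described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation used in the case study: the `Environment` mailbox model, the message dictionaries, the default price, and the hard-coded recipient name `"retailer"` are all hypothetical, and a single `step` stands in for one iteration of the endless execution loop.

```python
from collections import deque


class Environment:
    """Hypothetical application environment: one mailbox per registered agent."""

    def __init__(self):
        self.mailboxes = {}

    def register(self, name):
        self.mailboxes[name] = deque()

    def deliver(self, recipient, message):
        self.mailboxes[recipient].append(message)

    def next_percept(self, name):
        box = self.mailboxes[name]
        return box.popleft() if box else None


class Agent:
    def __init__(self, name, environment):
        self.name = name
        self.beliefs = {}  # internal representation
        environment.register(name)

    def sense(self, environment):  # sensor component
        return environment.next_percept(self.name)

    def process(self, percept):  # agent internal mechanisms, role-specific
        raise NotImplementedError

    def act(self, action, environment):  # actuator component
        if action is not None:
            recipient, message = action
            environment.deliver(recipient, message)

    def step(self, environment):
        """One iteration of the execution cycle."""
        percept = self.sense(environment)
        action = self.process(percept)
        self.act(action, environment)


class Retailer(Agent):
    def process(self, percept):
        # Answer a call for proposals with a priced proposal.
        if percept and percept["performative"] == "cfp":
            price = self.beliefs.get("price", 10)  # assumed default price
            reply = {"performative": "propose", "sender": self.name, "price": price}
            return percept["sender"], reply
        return None


class Consumer(Agent):
    def process(self, percept):
        # Initiate the protocol once, then record incoming proposals.
        if percept is None and not self.beliefs.get("cfp_sent"):
            self.beliefs["cfp_sent"] = True
            return "retailer", {"performative": "cfp", "sender": self.name}
        if percept and percept["performative"] == "propose":
            self.beliefs["best_offer"] = percept["price"]
        return None


env = Environment()
consumer = Consumer("consumer", env)
retailer = Retailer("retailer", env)
consumer.step(env)  # consumer sends the cfp
retailer.step(env)  # retailer answers with a proposal
consumer.step(env)  # consumer records the proposal
print(consumer.beliefs["best_offer"])  # 10
```

The two roles share `sense` and `act` and differ only in `process`, which mirrors the statement that the agents differ in their process functionalities.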
Consumers initiate CNet protocols with CFP messages and react to messages from retailers (left-hand life-line).

[Figure 1: Basic agent architecture, execution algorithm, and a version of CNet. Left: an agent (agent internal mechanisms, internal representation with R/W access, sensor, actuator) connected to the application environment. Right: the Contract Net Protocol between a Consumer and Retailer*, with the messages cfp, refuse*, [timeout], propose*, reject*, accept, failure, and result.]

Execution cycle (from Figure 1):
Input: environment
while true do
    percept ← sense(environment)
    action ← process(percept)
    act(action, environment)
end

Example of agent exceptions. The agents of the case study are first designed according to the CNet protocol. Flaws in the design of agents or non-determinism in their decision process can produce messages that do not follow the sequence of the protocol or do not arrive on time. Such a situation is exacerbated in open systems, where agents are developed independently. Despite a standard and public specification of the protocol, implementations can be over- or under-specified, with potential for design flaws. We consider two situations here: over-specification and the cancellation meta-protocol [10].

A retailer can implement extra functionalities (over-specification) that comply with the characteristics of the CNet but produce unexpected events. In a FIPA-compliant system, retailers can exploit the directory facilitator to learn about consumers in the system and initiate CNet protocols. Such CNet do not start with the CFP, however, but directly with a ‘propose’ message from the retailer to the target consumer. This initiative from retailers is an unexpected event, since consumer agents are not ‘ready’ to process it if they follow the CNet closely. It is however a desired property of autonomous agents to adapt to this kind of exceptional situation and take advantage of the opportunity of unexpected offers. It is also sound in this example, as a retailer would just propose to execute a legal CNet, although initiated in an exceptional fashion.

FIPA recommends implementing a cancellation meta-protocol in addition to the CNet, in case the consumer decides to abandon the protocol. In open systems, some consumer agents may not implement this extra protocol (‘under-specification’), and retailer agents may also decide to cancel their proposals, for example (thus returning to the previous example). In other words, agents are likely to encounter situations where the protocol is cancelled and they are not informed about it. They then have to deal with the absence of events, which can also be thought of as an unexpected situation. The usual way to cope with such absence is to set timeouts in the agent for a given activity, as can be observed at the beginning of the CNet. The standard FIPA model of the CNet does not specify timeouts for the particular case of cancellation, so designers can hardly be expected to implement them.
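The two situations above (an unsolicited ‘propose’ and a timeout standing for the absence of events) can be illustrated with a small sketch in which each agent decides by itself whether a percept is unexpected. It is only an illustration of the definition under stated assumptions: the `Percept` and `ConsumerAgent` classes, the `opportunistic` policy flag, the `awaiting` set, and the returned action names are hypothetical and are not part of FIPA or of the architecture presented later.

```python
from dataclasses import dataclass, field


@dataclass
class Percept:
    performative: str  # e.g. "propose" or "timeout" (illustrative values)
    sender: str


@dataclass
class ConsumerAgent:
    name: str
    opportunistic: bool  # hypothetical agent-specific policy
    awaiting: set = field(default_factory=set)  # performatives currently anticipated
    log: list = field(default_factory=list)

    def is_unexpected(self, percept):
        # The agent itself evaluates the percept against its own context;
        # no external entity flags the event as exceptional.
        return percept.performative not in self.awaiting

    def process(self, percept):
        if not self.is_unexpected(percept):
            self.log.append(("normal", percept.performative))
            return "evaluate_proposal"
        # Exceptional path: the reaction depends on the agent's own policy.
        if percept.performative == "propose" and self.opportunistic:
            self.log.append(("exception-handled", "propose"))
            return "evaluate_proposal"
        if percept.performative == "timeout":
            self.log.append(("exception-handled", "timeout"))
            return "abandon_protocol"
        self.log.append(("exception-ignored", percept.performative))
        return "ignore"


strict = ConsumerAgent("c1", opportunistic=False)
eager = ConsumerAgent("c2", opportunistic=True)
offer = Percept("propose", "retailer")

print(strict.process(offer))  # ignore
print(eager.process(offer))   # evaluate_proposal
print(strict.process(Percept("timeout", "clock")))  # abandon_protocol
```

The same unsolicited proposal is ignored by one consumer and exploited by the other, which is the point of locating the decision inside the agent rather than in the message.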
A last example in the case study is the ‘agent death’, in which an agent faces the situation where a peer terminates prematurely [23]. The problem is for the agents to react to the death and remain in a consistent state to pursue their activities. A fundamental issue is for the agent to detect the death of the peer [35]. The basic detection methods are to set up a ‘heart-beat’ mechanism [28, 16], or a time limit for answers, so that a peer is considered ‘dead’ whenever the limit is reached. In this latter case, the time event produced by the system clock when the limit is reached can be considered as an exceptional event by an agent. For another agent, the same time event can simply be ignored as irrelevant or normal, depending on an autonomous choice.

1.3 Research issues and purpose
Agent exceptions differ from programming exceptions, which means that existing work may not be fully appropriate to handle them. As for agent-oriented software engineering issues, it is highly important to avoid ad hoc exception management and to provide designers with appropriate models and tools. These research endeavours are essential to cope with the issue of exceptions in MAS, and further with the issue of fault tolerance.

A number of paths can be considered to study the question of exceptions in MAS. Most notably, we considered the introduction of a new performative in the FIPA-ACL to have a declarative means to deal with exceptions. We discarded this option, however, owing to our conclusions on the nature of agent exceptions [33]. A new performative can help in a range of situations to inform agents about an exception, but the autonomy of agents should let each agent decide whether the content (or performative) of a message is exceptional depending on its knowledge and context. An inform performative with appropriate contents is therefore sufficient for the pragmatics. More fundamentally, an ‘exception performative’ would confound the semantics of programming exceptions with that of agent exceptions (yet to be defined). We think it risks inheriting characteristics of usual models that are not adapted to agents. This claim is supported by the proliferation of innovative models in distributed systems [55, 17, 28], where usual exception models cannot cope with the concurrency of exception signals.

Another path that we considered and followed is the architecture of agent systems, e.g. [49]. Our definition of exception stresses that agent exceptions are based on perception, a functional part of the interface between an agent and its environment. This relation to perception has consequences on the type of agent architecture that is required to deal with exception management.
Appropriate architectures and guidelines can support designers in producing agents that feature exception-safe capabilities, to some extent. In addition, an approach based on the architecture is appropriate in the case of MAS, where the properties of interest are distribution, openness, heterogeneity, and agent autonomy. All these properties impact the architectural styles that are acceptable for agents, in addition to the requirements for exception management.

The purpose of this article is then to present a software architecture for MAS that encompasses agent exception mechanisms. As open and heterogeneous systems that host autonomous agents, MAS have properties such as dynamic binding of their elements and automatic reorganisation capabilities (as in Autonomic Computing).
Although mechanisms have been proposed to handle agent exceptions, we will show that they are usually not integrated in appropriate architectures and usually do not fulfil the requirements exposed by the aforementioned definition.

Although this article focuses explicitly on the issue of MAS architecture, research on agent exceptions is only at its beginning. We identified a number of other research directions in related publications. For example, the automatic generation of exception handlers, asynchronous exception management, and concerted exception management are specific mechanisms that were shown to be necessary to manage agent exceptions [44, 34, 33].

1.4 Organisation
The structure of this article is as follows. Section 2 reviews the literature on exceptions in MAS and other types of systems. The aim of this part is to show how the current state of research relates to the definition of agent exception, and why it is not completely appropriate. Section 3 presents in detail the MAS architecture we propose to take agent exceptions into account. Section 4 discusses the case study presented in the introduction, the limitations of the approach, and a comparison to related work. Finally, Section 5 concludes the article.
2 Related Work in Exception Management
The literature related to this article pertains to MAS and distributed systems at large. In the following, theoretical and operational research is presented and related to the definition of agent exception, with the aim of demonstrating shortcomings relative to the requirements for agents and their engineering. This survey cannot be exhaustive, and the most representative work has been selected for presentation.

2.1 Exceptions in Distributed Systems

2.1.1 Approaches for distributed and active objects
Distributed and active objects (D/AO) have received particular attention regarding exception handling, since usual mechanisms are not adapted to properties of such systems, such as concurrency and the global scope of some exceptional situations. The exception handling models from Xu et al., Issarny, or Miller and Tripathi rely on close concepts to cope with the concurrency of exception signals in D/AO systems [55, 17, 28, 9]. For example, Miller and Tripathi proposed ‘the guardian’ as a set of software constructs to handle exceptions in a distributed-object system. The guardian is a dedicated object that encapsulates rules to handle ‘global exceptions’ involving several threads, thus dealing with concurrent exceptions.

A detailed example of exception handling is presented by Miller and Tripathi, where the direct relationship with Java facilities can be observed. The guardian assists a client-server system that implements the ‘primary-backup’ approach to deal with server-side failures [47]. If the primary server fails, a ‘global exception’ is raised, so that the guardian handles the error by asking the backup to take the role of primary and by starting a new backup. The specification of this example is related to the reorganisation of teams in MAS, and the server failure can be thought of as an agent-level exception.

The guardian and related work do not however capture the characteristics of agent systems. They initially target D/AO with (remote) procedure calls, and the coupling is tighter than in an agent system architecture. Concretely, the interaction model of D/AO has semantics very similar to usual object-oriented handling facilities, which ‘bind’ invoker and operation. Agent interactions rely on other models with ‘weaker bindings’, typically message passing. Malicious agents can be part of open MAS, along with benevolent and ill-designed agents. The approaches for D/AO assume that agents are benevolent, and they do not currently cope with arbitrary agent profiles, even though security concerns are considered. Some agent exceptions such as the agent death are also not taken into account [23]. Finally, the encapsulation of agents cannot be fully satisfied. Encapsulation guarantees some independence to the agent, which can explicitly ignore messages or answer with false results, but D/AO approaches grant access to the agent state (members and methods). Typically, the guardian is allowed to ‘command’ an agent, e.g. to wait or to restart a task, which are undesired possibilities in MAS.

2.1.2 Exceptions and Software Architecture
Research in software architecture proposes exception handling related to architecture description languages (ADL), which target software engineering directly at the architecture level. The motivation for this approach is that the architecture of complex systems is not guaranteed to remain optimal over the lifetime of the system. Some events do not strictly require architectural adaptation to new execution conditions, but would benefit from it. This situation can be observed in banking applications, where banks cannot afford to completely revise their IT system architecture at each evolution. The architecture is often extended, instead of adapted, at the cost of increasing complexity and performance drops.

One notable instance of architecture-related exception handling is the work of Issarny and Banâtre, which introduces exception handling constructs and runtime support to an ADL [18]. This extended ADL allows designers to specify how the architecture reacts to some exceptions. Examples of such architecture-level exceptions are related to the client-server architecture. The language allows one to specify that a base architecture (e.g. an RPC link) can evolve automatically for dynamic binding of component instances, enhanced availability (replication), or enhanced response time (pre-fetching), whenever such evolution is necessary to maintain the system performance.

Such work at the architecture level is relevant to MAS, which are open and heterogeneous architectures. However, the extended ADL proposed in current work mostly aims at cooperative components, so that further extensions are required to deal with autonomous entities.

2.1.3 Exceptions in Component-based Software Development
In relation to software architecture, the development of software based on COTS (Components-On-The-Shelf) aims at building systems by assembling generic ‘ready-to-use’ components [46]. The issue with COTS in practice is the actual integration of arbitrary components into a robust application. The implementation details of components are usually not known, and only some details about the provided functionalities are delivered with a given component. Integration of components is therefore difficult, as ‘systemic exceptions’ can occur due to their assembly [7]. In addition to traditional exceptions handled inside components as individual subparts of the system, system-level exceptions need specific mechanisms, in the same way that agent exceptions call for novel approaches.

Dellarocas proposes a model developed in relation to the work of Hägg and Klein et al. in MAS [15, 7, 23]. The approach is to introduce pluggable ‘sentinel components’ in the assembly of COTS and to require the components of the application to implement a set of interfaces that lets sentinels detect and deal with exceptional behaviours. Sentinels actively monitor the execution system-wide for symptoms, and they exploit a knowledge base of handling recipes to recover from a variety of situations.

A later approach relies on the coordinated exception handling approach mentioned above in the case of D/AO [55, 37]. The execution of components is organised into atomic actions that define a scope wherein exceptional situations must be managed. Action scopes can be nested so that the usual recursive handling schemes are reproduced: an exception that cannot be handled inside an action scope is propagated to the enclosing action scope. The main advantage of this approach is to provide a dynamic means to organise the execution of components into actions, and to manage the occurrence of concurrent exceptions inside these actions. In addition, this work proposes guidelines to software integrators for introducing this exception handling mechanism in the development process of COTS assembly.
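The propagation rule for nested action scopes can be sketched as follows. This is a simplified, sequential illustration only: the `ActionScope` class, the two exception types, and the handler results are hypothetical, and the sketch deliberately leaves out the resolution of concurrent exceptions, which is a central part of the coordinated approach in [55, 37].

```python
class ComponentTimeout(Exception):
    """Hypothetical exception: a component did not answer in time."""


class DataCorruption(Exception):
    """Hypothetical exception: a component returned corrupted data."""


class ActionScope:
    """A named scope with its own handlers and an optional enclosing scope."""

    def __init__(self, name, handlers=None, parent=None):
        self.name = name
        self.handlers = handlers or {}  # maps exception type -> handler callable
        self.parent = parent

    def handle(self, exc):
        scope = self
        while scope is not None:
            for exc_type, handler in scope.handlers.items():
                if isinstance(exc, exc_type):
                    return scope.name, handler(exc)
            scope = scope.parent  # not handled here: propagate to the enclosing scope
        raise exc  # no scope in the chain could handle the exception

    def run(self, operation):
        # Returns the name of the scope that produced the result, and the result.
        try:
            return self.name, operation()
        except Exception as exc:
            return self.handle(exc)


outer = ActionScope("outer", {DataCorruption: lambda e: "rollback"})
inner = ActionScope("inner", {ComponentTimeout: lambda e: "retry"}, parent=outer)


def slow_component():
    raise ComponentTimeout("no answer")


def broken_component():
    raise DataCorruption("bad checksum")


print(inner.run(slow_component))    # ('inner', 'retry')
print(inner.run(broken_component))  # ('outer', 'rollback')
```

The timeout is handled in the scope where it occurs, whereas the corruption has no handler in the inner scope and is therefore handled by the enclosing one.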
The component-based approach assumes that application components are observable and commandable (through the required set of interfaces), and this hypothesis is not acceptable with agents. In the case of sentinel components, the approach based on system-wide observation does not hold in MAS, where agents only have a local scope and scalability issues arise as the number of agents or the complexity of their interactions increases. The structuring offered by the action model is also not fully applicable in the case of agent exceptions. One of the assumptions of this work is in fact that ‘components have deterministic behaviour and do not change their state spontaneously’ [37]. In other words, components need an invocation to ever react, similarly to an object in object-oriented programming. Although agents can be predictable, they usually evolve spontaneously as they execute autonomously.
2.2 Exceptions in MAS research

2.2.1 The sentinel-based architecture
Sentinels are special agents introduced by Hägg in MAS applications to provide a fault-tolerance service layer for BDI agents [15]. The approach has been extended in the work of Klein et al. with an exception handler repository [22, 23]. Another extension has been developed by Shah et al. to focus on an exception diagnosis mechanism for detecting when sentinels must react [42, 40, 41]. Each sentinel assists an application agent in its interactions with other agents. Sentinels are specialised in error detection and recovery, with the capability to inspect the state of agents (including here their ‘beliefs’ [36]). When an exception is detected in interactions or agent states, the sentinels execute specific code to recover a desired state.

A detailed application from Hägg is a system and its sentinels for a power distribution company. Application agents negotiate energy consumption credits for load balancing on an electric grid. Sentinels can detect and remedy erroneous behaviours in negotiation processes by inspecting ‘checkpoints’ in the agent code.

The problem with the sentinel approach is that it does not satisfy the properties of agent encapsulation and autonomy. Encapsulation is not respected, since sentinels can access and execute code in the so-called ‘agent-head’ [15], which should be a black box to respect agent autonomy. In addition, the latter extension is declared to be part of the hosting system where agents can freely join and leave [42]. As sentinels are allowed to fully inspect agents, this architectural style further fails to satisfy the assumptions of openness and agent autonomy. Finally, agents are supposed to be benevolent, and this hypothesis cannot hold in open systems.

2.2.2 Exceptions and mobile agents
Mobile agents have specific needs regarding fault tolerance, as they execute in heterogeneous deployment environments and the potential for communication issues is higher than in fixed systems (problems of location, tracking, etc.). The agent community has proposed several approaches, notably middleware layers to support mobile agents in case of exceptions [48, 1, 16, 6].

The CAMA middleware is an example of such endeavours; it elaborates a notion of scope in the coordination space of mobile agents. Different levels of scope target a location, an agent, or a role, and different types of exceptions are attached to each scope. This organisation of the exception types provides designers with appropriate guidelines to reason about exceptions, and tools to define adequate agent reactions to unexpected events. The coordination space serves to propagate exception signals among the different scopes when required.

The number of approaches for exceptions in mobile agent systems has grown recently, which demonstrates the importance of the underlying issues. The advantage of these approaches is that they build upon sound engineering foundations, often related to earlier work presented in the section on distributed systems. The mobile agent model however raises specific challenges that are often not included in other agent approaches, such as all the others presented in this section. Consequently, we think the present work on mobile agents is essential for its topic and provides concrete grounds for the technical aspects of exceptions in MAS. It does not however match our definition of agent exception, since agents do not decide by themselves what is exceptional and what is normal. We believe nevertheless that the present models are useful for agent exceptions and may evolve adequately.

2.2.3 Agent exceptions in commitment protocols
Representative work related to commitment protocols includes mechanisms to deal with exceptions in the fulfilment of commitments, which is particularly interesting for expressing the responsibilities of the agent software [54, 53, 52, 26].

The work of Xing et al. focuses on patterns of interactions among agents that lead to commitments, and some of the patterns can serve to model the reaction of agents to exceptional situations. A typical example is a pattern that describes the need for revising a commitment after the context of the agent has changed and evolved to exceptional conditions. The advantage of this pattern-based approach is the possibility to describe agent behaviours with state-charts that are suitable for engineering them, notably at the requirement analysis and design stages. The research on commitment patterns however needs further endeavours towards the later development stages and a framework where commitment patterns could fit in down to implementation, similarly to the work on exceptions in workflow systems [2].

Another approach to commitments, from Mallya and Singh, deals with exception handling for proactive agents that execute commitment protocols, a model of agent interactions that guarantees autonomy. When such a protocol is not respected, an exception is signalled to handle expected and unexpected situations. Expected exceptions are foreseen by the designer, who wrote a specific handler beforehand (here, another protocol), which is the most common case in software programs. Unexpected exceptions are not coded beforehand, and some constructs allow a handler to be built dynamically by merging basic protocols together.

This method has been illustrated for a hotel reservation protocol. An expected exception can be the case where there is no vacancy in the hotel. The designer usually foresees this issue and a specific handler is available in the system to deal with it. An unexpected exception can be the occurrence of a dramatic event (a fire) that destroys the hotel. Customer agents usually know how to behave when facing a cancellation, and it is reasonable to consider such a situation as ‘expected’, in the sense that a handler is available. The hotel agent is however unlikely to have a handler that serves to cancel all reservations. At design time, the handling of such an exception might not have been fully prepared.
Mallya and Singh propose to rely on an external repository to fetch specific handlers and merge them automatically for adequate handling. Although this approach is very attractive and respects the agent paradigm (encapsulation, heterogeneity, and openness), it remains theoretical and lacks validated results concerning scalability, even in later work [25]. The current issues are indeed the computational complexity of the selection of handlers and the dynamic assembly of new handlers.

2.2.4 Stigmergic systems
Stigmergy is an interaction model where agents put marks in the environment (messages with no intended recipient) that other agents exploit to determine their next actions [5]. Stigmergy models the behaviour of social insects such as termites. One termite starts to build a nest by putting a piece of material on the ground (a mark in the environment). Other termites use this information to determine where to pile the piece they carry. Stigmergy is thus an indirect interaction model, as there is no direct message passing.

Stigmergic systems are shown to be particularly robust to exceptions such as the death or the failure of agents [31]. The robustness of these systems is mostly due to the high redundancy of agents, which is reminiscent of the choice of modularity in software architectures to limit the impact of exceptions in sequential systems. The study of these systems is particularly relevant as an advanced coordination model for agents, with a number of applications, notably for self-organising systems.
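The termite example can be made concrete with a toy simulation. It is a sketch under simplifying assumptions and not a model taken from the cited work: the one-dimensional `Ground`, the one-cell perception range, the tie-breaking rule, and the `alive` flag used to simulate an agent death are all hypothetical.

```python
class Ground:
    """Hypothetical environment: a row of cells holding piled material (the marks)."""

    def __init__(self, size):
        self.piles = [0] * size

    def deposit(self, cell):
        self.piles[cell] += 1


class Termite:
    def __init__(self, position):
        self.position = position
        self.alive = True

    def step(self, ground):
        if not self.alive:
            return  # a dead agent simply stops contributing
        # Perceive only the neighbouring cells and move to the tallest pile.
        # Ties are broken in favour of the current cell, then the lowest index.
        lo = max(0, self.position - 1)
        hi = min(len(ground.piles) - 1, self.position + 1)
        target = max(
            range(lo, hi + 1),
            key=lambda c: (ground.piles[c], c == self.position, -c),
        )
        self.position = target
        ground.deposit(target)  # the deposit is a mark left for the other agents


ground = Ground(5)
termites = [Termite(1), Termite(2), Termite(3)]

termites[1].step(ground)  # first mark, left at cell 2
termites[0].step(ground)  # attracted by the mark
termites[2].step(ground)  # attracted by the mark
print(ground.piles)  # [0, 0, 3, 0, 0]

termites[0].alive = False  # simulated agent death
for _ in range(2):
    for termite in termites:
        termite.step(ground)
print(ground.piles)  # [0, 0, 7, 0, 0]
```

No termite addresses a message to another, and the pile keeps growing after one agent dies because the remaining agents rely only on the marks in the environment.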
Little work on stigmergic systems discusses robustness issues, and none, to our knowledge, discusses exception handling. The essential feature of this approach is the particular role of the environment in the mediation of interactions. We see the environment as an adequate element for supporting agent exception mechanisms. It is a modular element of the MAS architecture that can serve agents without infringing their encapsulation (it is the part of the architecture ‘outside’ agents), and the case of stigmergy shows how information about events can flow in the system, so that ant-like agents can cope with events like the agent death.

2.3 Survey conclusion
The work related to exceptions contributes to our model of exception. None of it, however, completely addresses the concerns of openness, heterogeneity, and agent autonomy. The survey shows important directions that are compatible with the definition and that can serve a comprehensive approach to agent exceptions. It also supports the research directions that we develop in this article, i.e. the elaboration of architectural foundations for exception management in MAS.
3
An Architecture for Exception Management in MAS
The definition of agent exception and the related work have consequences for the architecture of MAS. Usual architectures, as introduced in the running example on page 3 and illustrated by the Jade or Jason frameworks [19, 20], do not integrate exception management facilities for agent exceptions. In this section, a base architecture is presented for this purpose, after enumerating the constraints that follow from the definition of agent exception.
3.1
Architectural constraints
Agents are autonomous pieces of software, i.e. they can take their own decisions, notably concerning normal and exceptional events, which is the central idea of our definition of agent exception. The first architectural constraint is therefore that exceptions make sense inside agents. Exception management is confined to the inner agent architecture, independently of other agents, and the application environment is not involved in the decision mechanisms related to exceptions.
The place of the environment in the system architecture leads to a second constraint: an explicit environment should be exploited to structure event propagation, including events that may cause exceptions. The environment is responsible for general functions of the architecture that implicitly support exception management. In other words, the environment can provide event notification facilities (cf. stigmergy), but it is up to the agents to interpret the events as exceptional. The environment can however give access to additional support, such as the handler repository of Klein et al. [23]. Such a repository then qualifies as a ‘resource’ in the environment [50].
The third and last constraint that steered our work is that agent exceptions require an internal representation (e.g. knowledge). An internal representation is necessary as a reference for agents to evaluate incoming events and to distinguish what is ‘normal’ from what is ‘exceptional’, from the individual and subjective point of view of an agent. Consequently, agent architectures without explicit representation, such as the pure subsumption architecture of Brooks [4], are in principle not able to support agent exceptions. Practical reactive models nevertheless encompass some built-in implicit representations, so that they do in fact deal with agent exceptions, but in a less flexible, pre-wired manner.
3.2
Model of exception-ready agents
The architecture. The architectural constraints lead us to the following revision of the architecture of an agent. Fig. 2 depicts such an architecture, with necessary and optional components [33]. The necessary components are defined by the base architecture in the agent community, introduced for the case study in Fig. 1. The particularity of the architecture hereafter pertains to the elaboration of the agent perception and actuation components. The novelty is the management of relevance and expectation criteria to classify input events (the ‘percepts’) and let the agent initiate potential exception management when required internally. This model can be related to the proposal from Shah et al. for exception diagnosis [42], and the work of Weyns et al. on active perception and the notion of focus [51].
[Figure 2: Agent architecture with exception management mechanisms. Labels in the figure: Agent; Agent Internal Mechanisms (Exception Mechanisms, Base Mechanisms); Internal Representation; Perception (Sensor, Evaluation with Relevance Filter and Expectation Filter); Actuation (Generation of Relevance and Expectation criteria, Actuator); Application Environment. Legend: Execution Cycle, Read Access, R/W Access.]
The architecture reproduces in white the necessary components of the base architecture, and the optional components are introduced in grey, with the aim of managing agent exceptions. This distinction separates the application logic, in white, from the exception handling logic, in grey, so that designers can choose whether the exception management part is necessary depending on their target application.
The perception component of the present architecture encompasses the sensor and evaluation sub-components. Sensors receive events from the environment and pass them to the evaluation. The latter is first responsible for distinguishing relevant from irrelevant events. Relevance appears to be an essential feature to filter out unnecessary information (potential exceptions that do not concern the agent in any way) and to avoid the high-bandwidth issue [24]. The evaluation then identifies unexpected events depending on the criteria of the agent. One example of a criterion in planning agents is that unexpected events are those that are not ‘scheduled’ in the plan. Such a criterion is independent of the architecture, and it is up to the designer to choose one in the development stages, depending on the target application. The evaluation uses the internal representation as the reference by which the agent can distinguish events. The internal representation refers to any representation inside the agent. For example, the BDI and KGP architectures have a set of knowledge bases [36, 21], whereas ant-like agents have simpler, fixed internal data structures. It should be noted that the evaluation component need not separate the mechanisms for relevance and expectation, as they can conveniently be merged. The aim of the distinction is to show the difference of purpose, and to link our work to existing endeavours [51].
Events classified by the evaluation are forwarded to the agent internal mechanisms component, where they are processed. The two functional layers presented in this component separate the exception mechanisms from the ‘base’ that implements the application logic of the agent. The exception layer introduces appropriate mechanisms to deal with exceptions, and its output should be directed to the base layer, so that the agent can continue its activity despite the occurrence of an unexpected event. The component as a whole manipulates the internal representation, and its output is an action passed to the actuation for producing an effect in the environment.
The actuation component then prepares the relevance and expectation criteria that the agent will apply in its future interactions. The criteria can be dynamically adapted by the agent to fit its context in the system, and it is up to the designer to decide how the criteria evolve. Finally, the actuator component serves to commit the agent action in the environment.
Fundamental algorithms associated with the architecture. The algorithm related to the base agent architecture needs to be adapted (see Fig. 1). Alg. 1 presents an updated version.
    Input: environment
    while true do
        percept ← sense(environment);
        internal percept ← evaluate(percept);
        if internal percept is flagged as exception then
            internal percept ← handling(internal percept);
        end
        action ← process(internal percept);
        generate(action);
        act(action, environment);
    end
Algorithm 1: Algorithm of an exception-ready agent
In accordance with the new architecture, agents sense the environment and evaluate the percepts. If the evaluation flags the percept as an exception, the exception management architectural component is exploited in the handling instruction of the algorithm. Handling can modify the agent state and knowledge, but the execution eventually returns to processing the internal percept (which may have been modified by the handling) to produce an action in the environment, if required. Before committing the action in the environment, the agent generates the next relevance and expectation criteria that matter for it. The generation does not modify the action decided by the agent, but it exploits it to determine the appropriate criteria. The algorithm shows that no exception ‘enters’ the agent. The agent decides by itself when it needs to initiate exception handling capabilities after the evaluation. This mechanism is the architectural means to ensure agent encapsulation and autonomous decisions.
The evaluation and generation functionalities are the central mechanisms of our exception management approach. The architecture must provide adequate components for implementing their algorithms, and we detail hereafter their essential stages. Alg. 2 describes how the evaluation component evaluates an incoming percept.
    Input: percept, relevanceKB, expectationKB
    Output: internal percept
    if relevanceKB contains relevance(percept) then
        signature ← expectation(percept);
        if expectationKB contains signature then
            internal percept ← (percept, nil);
        else
            internal percept ← (percept, signature);
        end
    end
Algorithm 2: Algorithm for the evaluation component
The purpose of the algorithm is to discard irrelevant percepts, to distinguish expected from unexpected percepts, and to provide inner components of the agent with useful information in case of unexpected percepts (i.e. an ‘internal percept’). The relevance function is an application-dependent function that extracts salient information from a percept for comparison with the knowledge base ‘relevanceKB’. If the extracted information is relevant (e.g. appropriate receiver field), the percept is kept for further processing. Otherwise, the percept is considered irrelevant and the algorithm exits. The application-dependent expectation function further extracts information from relevant percepts for comparison with ‘expectationKB’. The resulting signature is matched with the knowledge base. If the agent was waiting for the percept (e.g. an inform message), the base contains information about it (e.g. a filter on inform messages). The output of the algorithm is then a tuple (percept, nil), where nil indicates that no unexpected situation has occurred. When an exception is detected, the output of the algorithm is a similar tuple, where nil is replaced by the signature. Other components of the agent use this signature to determine how to handle the situation (i.e. the handler selection scheme).
Alg. 3 presents an algorithm to generate the relevance and expectation criteria necessary for the aforementioned relevance and expectation functions. This algorithm focuses on the case of agents executing interaction protocols, and it does not apply as-is to planning agents or ant-like agents, for example.
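As an illustration of Alg. 1 and Alg. 2, the following Python sketch shows one possible reading of the evaluation step and of one iteration of the agent cycle. It is not the authors' implementation. The class name Agent, the method names, the message fields and the choice to drop irrelevant percepts without processing them are assumptions made for the example; sensing and acting are left to the caller.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Agent:
    # Application-dependent functions (hypothetical signatures).
    relevance: Callable      # percept -> key looked up in relevance_kb
    expectation: Callable    # percept -> signature looked up in expectation_kb
    handling: Callable       # (percept, signature) -> percept given to process
    process: Callable        # percept -> action
    generate: Callable       # (agent, action) -> None; may update both KBs
    relevance_kb: set = field(default_factory=set)
    expectation_kb: set = field(default_factory=set)

    def evaluate(self, percept):
        """Alg. 2: None if irrelevant, else (percept, None or signature)."""
        if self.relevance(percept) not in self.relevance_kb:
            return None                      # irrelevant percept is discarded
        signature = self.expectation(percept)
        if signature in self.expectation_kb:
            return (percept, None)           # expected percept
        return (percept, signature)          # unexpected: flagged as exception

    def step(self, percept):
        """Body of the loop of Alg. 1 for one percept that was already sensed."""
        internal = self.evaluate(percept)
        if internal is None:
            return None                      # assumption: nothing to process
        percept, signature = internal
        if signature is not None:            # the agent itself starts handling
            percept = self.handling(percept, signature)
        action = self.process(percept)
        self.generate(self, action)          # next relevance/expectation criteria
        return action                        # the caller commits it in the environment


# Usage with message-like percepts (field names are illustrative).
handled = []


def log_and_continue(percept, signature):
    handled.append(signature)
    return percept


agent = Agent(
    relevance=lambda m: m["protocol"],
    expectation=lambda m: (m["protocol"], m["performative"]),
    handling=log_and_continue,
    process=lambda m: ("reply-to", m["performative"]),
    generate=lambda agent, action: None,     # criteria left unchanged here
    relevance_kb={"cnet"},
    expectation_kb={("cnet", "inform")},
)
action = agent.step({"protocol": "cnet", "performative": "propose"})
```

In this sketch the exception is never raised from outside: the percept is an ordinary message, and only the comparison with the agent's own knowledge bases turns it into an exceptional case.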
    Input: action, protocol
    Input/Output: relevanceKB, expectationKB
    if action initiates a new protocol then
        relevanceKB ← relevanceKB ∪ {format(protocol)};
        expectationKB ← expectationKB ∪ {next(action, protocol)};
    else if action terminates a protocol then
        relevanceKB ← relevanceKB \ {format(protocol)};
        expectationKB ← expectationKB \ {format(protocol)};
    else
        expectationKB ← expectationKB \ {current(action, protocol)};
        expectationKB ← expectationKB ∪ {next(action, protocol)};
    end
Algorithm 3: Algorithm for the generation component in the restricted case of interaction protocols
At the stage of this algorithm, the agent has decided on an action in a given protocol (the action is here sending a message). If this action initiates the protocol, the knowledge bases of the agent are updated. relevanceKB receives a formatted criterion stating that ‘any message related to this protocol is relevant’, which can be implemented as a simple predicate rule. expectationKB receives the next message after the action in the protocol. That is, the agent expects to receive from others one of the immediately following messages defined in the protocol. Similarly, the second case updates the knowledge of the agent when the action terminates the protocol. The case of cancellation introduced with the example in section 1.2 is also managed here. Finally, all other actions replace the current expectation of the agent by the next one with respect to the protocol.
Basic environment for event notification. The case study introduced agent exceptions such as the ‘agent death’, along with the difficulty of defining how agents detect the actual death. The environment can be conveniently tailored to support this kind of exception without infringing the agent paradigm, notably agent autonomy and local sensing capabilities. The essential functionality of this support is to notify agents about changes in the environment, such as the termination of heart-beats from one agent, or similar approaches.
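Before the environment is detailed, the generation step of Alg. 3 can be made concrete with a minimal Python sketch. The Protocol class, its fields and the simplified contract-net-like message names are hypothetical and do not reproduce the FIPA specification. Two readings are assumed here: removing format(protocol) from expectationKB on termination is taken to mean dropping every expectation still attached to the protocol, and current(action, protocol) is taken to be the set of expectations pending for that protocol.

```python
from dataclasses import dataclass


@dataclass
class Protocol:
    name: str
    initial: set     # actions that open the protocol
    final: set       # actions that close the protocol
    replies: dict    # action -> set of messages expected in return


def pending(protocol, expectation_kb):
    """Expectations currently attached to this protocol."""
    return {sig for sig in expectation_kb if sig[0] == protocol.name}


def generate(action, protocol, relevance_kb, expectation_kb):
    """Alg. 3: update both knowledge bases in place for a decided action."""
    following = {(protocol.name, m) for m in protocol.replies.get(action, ())}
    if action in protocol.initial:
        relevance_kb.add(protocol.name)          # format(protocol)
        expectation_kb.update(following)         # next(action, protocol)
    elif action in protocol.final:
        relevance_kb.discard(protocol.name)
        expectation_kb.difference_update(pending(protocol, expectation_kb))
    else:
        # Replace the current expectations by the next ones.
        expectation_kb.difference_update(pending(protocol, expectation_kb))
        expectation_kb.update(following)


# Simplified contract-net-like protocol seen from the initiator's side.
cnet = Protocol(
    name="cnet",
    initial={"cfp"},
    final={"reject", "cancel"},
    replies={"cfp": {"propose", "refuse"}, "accept": {"result", "failure"}},
)
relevance_kb, expectation_kb = set(), set()
generate("cfp", cnet, relevance_kb, expectation_kb)     # opens the protocol
generate("accept", cnet, relevance_kb, expectation_kb)  # moves it forward
```

After the two calls, relevance_kb holds the protocol name and expectation_kb holds only the ‘result’ and ‘failure’ signatures, so a later ‘propose’ on the same protocol would be relevant but unexpected.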
The application environment architecture does not explicitly refer to exception mechanisms, owing to the definition of agent exception. However, the role of the environment is essential in the proper notification of events with respect to agent autonomy. Figure 3 focuses on such an application environment. This architecture is inspired by the reference model proposed in the agent community [50].
Actions from agents are collected in the collecting function and passed to the internal mechanisms for processing. Agents can perform three types of actions, namely exchanging messages with other agents (messaging), observing resources and agents (observing), or acting on resources and agents (acting). Depending on the action type, one of the three components is activated. The three action components can access the state of the environment, but only acting can modify it, for example to update the state of the agent in the environment (e.g. heart-beat action). The state of the environment encompasses a variety of data useful to application agents, such as the topology of the system network (if applicable) or the state of pheromones in the case of stigmergic MAS. Once the actions are processed, the resulting events are sent to agents by the effecting function.
[Figure 3: Environment base architecture for exception management. Labels in the figure: Agent; Application Environment; Collecting; Effecting; Internal Mechanisms (Acting, Observing, Messaging); State (e.g. topology); Interface for Distributed Environments; Interface to Deployment Context.]
The interface to deployment context allows agents to exploit and receive notifications from external resources, such as databases and web-services. The internal mechanisms use the deployment context to access some facilities. For example, the handler repository proposed by Klein et al. could be advantageously integrated as a knowledge base accessible by agents in search of specific handlers [23]. All internal mechanisms can send events (e.g. database query) to the deployment context, which reacts by replying through the messaging mechanism toward agents. In addition, the ‘interface to the deployment context’ element maintains data in the environment state, for example concerning time (essential for the timeouts introduced earlier). The last component of the architecture is the interface for distributed environments, which serves to synchronise the global state of the environment, pass events, and support agent mobility when the environment is separated into ‘pieces’ over a distributed infrastructure, e.g. in [29].
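The collecting, processing and effecting cycle described above can be sketched in Python as follows. This is a minimal illustration under several assumptions: the class and method names, the event dictionaries, the explicit now argument standing for the time kept through the deployment context, and the tick method that turns missing heart-beats into death notifications are all hypothetical. Distribution and the deployment context are left out.

```python
from collections import defaultdict


class Environment:
    def __init__(self, timeout):
        self.timeout = timeout
        self.state = {"last_beat": {}}       # environment state (here: liveness)
        self.inboxes = defaultdict(list)     # events delivered to each agent

    def collect(self, sender, kind, payload, now):
        """Collecting: route an agent action to one internal mechanism."""
        mechanism = {
            "messaging": self._messaging,
            "observing": self._observing,
            "acting": self._acting,
        }[kind]
        for receiver, event in mechanism(sender, payload, now):
            self._effect(receiver, event)

    def _messaging(self, sender, payload, now):
        event = {"type": "message", "from": sender, "body": payload["body"]}
        return [(payload["to"], event)]

    def _observing(self, sender, payload, now):
        # Read-only access to the state.
        agents = sorted(self.state["last_beat"])
        return [(sender, {"type": "observation", "agents": agents})]

    def _acting(self, sender, payload, now):
        # Only acting modifies the state, e.g. with a heart-beat action.
        if payload == "heart-beat":
            self.state["last_beat"][sender] = now
        return []

    def _effect(self, receiver, event):
        """Effecting: deliver an event; the receiver decides what it means."""
        self.inboxes[receiver].append(event)

    def tick(self, now):
        """Notify remaining agents about agents whose heart-beats stopped."""
        beats = self.state["last_beat"]
        silent = [a for a, t in beats.items() if now - t > self.timeout]
        for agent in silent:
            del beats[agent]
        for agent in silent:
            for other in beats:
                self._effect(other, {"type": "death", "agent": agent})


env = Environment(timeout=2)
env.collect("C", "acting", "heart-beat", now=0)
env.collect("R", "acting", "heart-beat", now=0)
env.collect("R", "messaging", {"to": "C", "body": "propose"}, now=1)
env.collect("C", "acting", "heart-beat", now=2)
env.tick(now=3)   # R has been silent for longer than the timeout
```

The environment only reports that the heart-beats of R have stopped. Whether this event counts as an exception is left to the evaluation of each notified agent, in line with the first architectural constraint.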
4
Application to the case study and limitations of the architecture
This section revisits the case study presented in section 1.2 and describes how the architecture supports agents when exceptions occur. This description then serves to present the limitations of the architecture and to discuss it relative to other approaches.
4.1
Application to the case study
Section 1.2 refers to two types of agent exceptions in the case study, namely unexpected messages in a CNet and the agent death. Let us consider a consumer agent C and a retailer agent R. C normally initiates CNet, but it also considers relevant any CNet-related message from other agents, i.e. it is willing to take opportunities. Let us assume that R sends a ‘propose’ message to C relative to a fresh protocol P. According to the new architecture and Alg. 1, C will sense the message and evaluate it. At this stage, the knowledge base of C does not contain any reference to P. Under Alg. 2, the message is relevant, as it is an offer directed to C. It is not expected however, so the internal percept of C is deemed an exception. This mechanism has allowed C to identify an exception autonomously, without any external intervention. C can now decide how to handle the situation, which is independent of architectural matters. For example, the handling can be to accept the opportunity and engage in P. C would then generate an action and criteria, according to Alg. 1 and 3 respectively. The action would be sending an ‘accept’ message, and the latter algorithm would generate expectations for a ‘result’ or ‘failure’ message from R.
Let us assume that R dies at this point in the protocol, and that the environment is configured to receive heart-beats from agents and to notify deaths in its topology. When R disappears, the environment notifies agents about the death, including C. The perception of such a message by C is deemed an exception in Alg. 2, so that C can initiate appropriate handling for the termination of the protocol.
4.2
Limitations of the architecture
The case study illustrates how the architectural choices support designers in building exception-ready agents. Agents compliant with the architecture are able to distinguish normal from exceptional situations depending on their knowledge. In other words, the designer is not left with an ad hoc approach. The architecture implements the base mechanisms, and the designer needs to provide application-dependent data and functionalities:
1. Application logic of the agent (Base mechanisms)
2. Base knowledge of the agent (Internal representation)
3. Handlers and extra knowledge (e.g. address of a handler repository)
Items (1) and (2) refer to the code that needs to be provided for a given application. Item (3) refers to the exception-related part of the code, which is reduced to handlers and necessary extra knowledge. The architecture manages the semantics of switching the agent execution to handlers when necessary. However, the support from the architecture cannot help in defining handlers, which depend on design choices and on the application. In the case study, we chose agents that execute protocols, so that handlers can be designed as ‘protocols’, with the essential difference that a handler can lead to internal actions, whereas a protocol is for interactions. An agent design based on planning or other models requires specific care about the form of handlers. We think however that the architecture itself is an appropriate foundation for other types of agents.
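The division of labour between designer and architecture can be sketched in Python on the two exceptions of the case study. The sketch is illustrative only: the function names, the percept fields, the ‘env’ pseudo-protocol used for environment notifications and the dictionary of handlers indexed by signature are assumptions, and handler selection is reduced to a lookup.

```python
# (1) Application logic supplied by the designer (base mechanisms).
def process(percept):
    if percept.get("decision") == "engage":
        return {"performative": "accept", "to": percept["from"]}
    return None                          # no action required


# (2) Base knowledge supplied by the designer (internal representation).
knowledge = {"relevant": {"cnet", "env"}, "expected": set()}


# (3) Handlers supplied by the designer, indexed by exception signature.
def take_opportunity(percept, knowledge):
    return {**percept, "decision": "engage"}


def close_protocol(percept, knowledge):
    knowledge["expected"].clear()        # internal action: stop waiting
    return {**percept, "decision": "abort"}


handlers = {
    ("cnet", "propose"): take_opportunity,
    ("env", "death"): close_protocol,
}


# Part assumed to be provided by the architecture: detection and switching.
def architecture_step(percept):
    if percept["protocol"] not in knowledge["relevant"]:
        return None                      # irrelevant percept
    signature = (percept["protocol"], percept["performative"])
    if signature not in knowledge["expected"]:
        handler = handlers.get(signature)
        if handler is not None:          # switch execution to the handler
            percept = handler(percept, knowledge)
    return process(percept)


# C receives an unsolicited proposal from R and decides to engage.
reply = architecture_step(
    {"protocol": "cnet", "performative": "propose", "from": "R"}
)
```

The handler for the death notification performs an internal action only, which mirrors the remark that a handler, unlike a protocol, need not lead to an interaction.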
Concerning openness, the architecture relies on message-passing techniques, so that interoperability concerns are mostly confined to the choice of message formats. However, the architecture is not as flexible as recent work on open distributed systems, where components and connectors can be dynamically assembled according to configurations. The architectural elements in agent and environment have limited flexibility, as there exist few possible configurations of their execution cycles, e.g. the perception-processing-action cycle for agents. The only flexibility in these elements lies in their detailed descriptions, where different architectures can be adopted and dynamically changed. Most of the flexibility in our architecture relies on the connectors between agents and environment, which are dynamically instantiated as message-passing connectors at the arrival and departure of agents. It is expected that such flexibility is sufficient in practice, similarly to service-oriented architectures. The detailed design of a MAS can introduce more flexibility ‘inside’ agents and environment, but this remains outside the scope of this article.
4.3
Comparison to related work and perspectives
The recurrent issue in related work is the lack of respect for agent encapsulation and autonomy. Most approaches in distributed systems and MAS allow controlling agents from ‘outside’, with the exception of the work on commitment protocols and stigmergy (sections 2.2.3 and 2.2.4). Our architecture is designed to comply with the definition of agent exception, so that agent encapsulation and autonomy are enforced. The work on commitment protocols also complies with the definition, but its present achievements depend on the commitment protocol model, so that the question of engineering this model might be specific to this approach. Our architecture targets a more general concept of exception that can be implemented for various approaches. It is applicable to FIPA-compliant interaction protocols, as illustrated in the case study, and we think it is also applicable to commitment protocols and other agent models, such as the ones based on planning. As for stigmergic systems, our architecture offers a more flexible approach to robustness. The extended flexibility may not be necessary in current stigmergic systems, but we believe it can serve in future cognitive stigmergic models [32]. In relation to the work on sentinels and mobile agents, our architecture takes another approach to exception management. Sentinel-based approaches separate MAS into two distinct sub-MAS: application agents fulfil the functional requirements of the system, and sentinel agents implement the quality requirements. The sentinel ‘layer’ therefore has a global point of view on the system, so that it can deal with ‘local’ and ‘global’ exceptions. Local exceptions pertain to unexpected events local to an agent, e.g. inconsistent knowledge. Global exceptions pertain to unexpected events that impact several agents, such as a circular wait cycle. Mobile agent approaches are often based on mechanisms developed for distributed computing, e.g.
the coordinated atomic action model [55], which specifically target global exceptions. Our architecture does not focus on the difference between local and global exceptions, as the primary purpose was to ensure agent encapsulation and autonomy. It is however more appropriate for local exceptions, since the mechanisms provided by the architecture endow agents with capabilities relative to their accessible context, which is local to the agent. Global exceptions can nevertheless be managed by the architecture. For instance, a circular wait cycle could be handled by providing appropriate handlers to agents implementing the architecture, but experience shows that it would be significantly less efficient than an approach like sentinels or coordinated atomic action. In consequence, a comprehensive model of agent exception should rely on our architecture for preserving encapsulation and autonomy, and also on robust external support for global exceptions. Such a dual approach makes sense when considering MAS where agents are part of a hierarchy. ‘Manager agents’ can detect exceptions and propose a remedy to some global issues. The problem with this view is that external support must appropriately serve agents without assuming that external control is possible. This problem is difficult, as global approaches must be robust to non-collaborative agents. Typically, an autonomous agent can decide that an event presented as an exception is not an issue, so that the agent refuses to participate in any collaborative handling. Alternatives must then be sought dynamically.
5
Conclusions
Exception management is not a novel research question, but the present state of the art in MAS demonstrates the need for further research. The properties of openness, heterogeneity, and agent autonomy require an appropriate model of exception in MAS. We proposed a MAS architecture that fulfils the requirements for the notion of ‘agent exceptions’ defined in this article. Our approach guarantees a loose coupling of agents in the system and respects their autonomy. The architecture supports designers in building exception-ready agents and it avoids ad hoc implementation of exception detection mechanisms. The designer just needs to provide handlers, i.e. how to process an exceptional situation, in addition to the usual application logic and knowledge of the agent. The main limitations of the architecture are that handlers depend on design choices (agents using protocols, or plans, etc.) and that further work is required to improve the management of global exceptions.
Our first endeavour in the study of agent exceptions will be to provide an API for the architecture. In particular, we expect the integration of several agent architectures in this model to extend their capabilities to exception handling. Our present code implements the architecture and algorithms of this article, in the case of agents executing protocols in market simulations. We plan to extract the API from the present experiments. Concerning open issues that should be addressed in future work, we focus on handling strategies on top of our architecture, especially for concurrent and concerted exceptions with a priori non-collaborative agents. In the case of MAS, we are seeking an appropriate approach to allow agents to detect situations where a temporary collaboration is preferable, so that they have an incentive to form coalitions with other parties. The present candidate approach to define such an incentive is to rely on a trust model.
A second challenging issue pertains to unexpected exceptions. The work of Klein et al. and Mallya allows agents to deal with unexpected exceptions to some extent. In the perspective of our architecture, we aim at pursuing these research endeavours.
Acknowledgements
This research is partially supported by the French Ministry of Foreign Affairs under the reference BFE/2006-484446G, Lavoisier grant program. The authors thank José Ghislain Quenum for discussing this work in detail.
References
[1] Budi Arief, Alexei Iliasov, and Alexander Romanovsky. On Using the CAMA Framework for Developing Open Mobile Fault Tolerant Agent Systems. In Software Engineering for Large-Scale Multi-Agent Systems, pages 29–35, 2006.
[2] Alexander Borgida and Takahiro Murata. Tolerating exceptions in workflows: a unified framework for data and processes. In WACC, pages 59–68. ACM, 1999.
[3] Marco Brambilla, Sara Comai, and Christina Tziviskou. Exception management within web applications implementing business processes. In Dony et al. [8], pages 101–120.
[4] Rodney Brooks. Intelligence without representation. Artificial Intelligence, 47(1–3):139–159, 1991.
[5] Sven Brueckner. Return from the Ant – Synthetic Ecosystems for Manufacturing Control. PhD thesis, Humboldt University, Berlin, Germany, 2000.
[6] Karla Damasceno, Nelio Cacho, Alessandro Garcia, Alexander Romanovsky, and Carlos Lucena. Context-Aware Exception Handling in Mobile Agent Systems: The MoCA Case. In Software Engineering for Large-Scale Multi-Agent Systems, pages 37–43, 2006.
[7] Chrysanthos Dellarocas. Toward Exception Handling Infrastructures in Component-based Software. In Proceedings of the International Workshop on Component-based Software Engineering, 1998.
[8] Christophe Dony, Jørgen Lindskov Knudsen, Alexander B. Romanovsky, and Anand Tripathi, editors. Advanced Topics in Exception Handling Techniques, volume 4119 of Lecture Notes in Computer Science. Springer, 2006.
[9] Christophe Dony, Christelle Urtado, and Sylvain Vauttier. Exception Handling and Asynchronous Active Objects: Issues and Proposal. In Dony et al. [8], pages 81–100.
[10] Foundation for Intelligent Physical Agents. FIPA Contract Net Interaction Protocol Specification. http://www.fipa.org/specs/fipa00029/SC00029H.html. Document number SC00029H, Accessed in October 2006.
[11] John B. Goodenough. Exception handling design issues. SIGPLAN Not., 10(7):41–45, 1975.
[12] John B. Goodenough. Exception Handling: Issues and a Proposed Notation. Commun. ACM, 18(12):683–696, 1975.
[13] John B. Goodenough. Structured exception handling. In POPL ’75: Proceedings of the 2nd ACM SIGACT-SIGPLAN symposium on Principles of programming languages, pages 204–224, New York, NY, USA, 1975. ACM Press.
[14] James Gosling, Bill Joy, Guy Steele, and Gilad Bracha, editors. The Java™ Language Specification, Third Edition. Addison-Wesley, 2005.
[15] Staffan Hägg. A Sentinel Approach to Fault Handling in Multi-Agent Systems. In Chengqi Zhang and Dickson Lukose, editors, Distributed AI, volume 1286 of Lecture Notes in Computer Science, pages 181–195. Springer, 1996.
[16] Alexei Iliasov and Alexander Romanovsky. Structured Coordination Spaces for Fault Tolerant Mobile Agents. In Dony et al. [8], pages 181–199.
[17] Valérie Issarny. Concurrent Exception Handling. In Romanovsky et al. [38], pages 111–127.
[18] Valérie Issarny and Jean-Pierre Banâtre. Architecture-based Exception Handling. In Hawaii International Conference on System Sciences, 2001.
[19] Jade agent framework. http://jade.tilab.com/, ver. 2005.
[20] Jason Agent Platform Project. http://jason.sourceforge.net/. Accessed in August 2006.
[21] Antonis C. Kakas, Paolo Mancarella, Fariba Sadri, Kostas Stathis, and Francesca Toni. The KGP model of agency. In Ramon López de Mántaras and Lorenza Saitta, editors, ECAI, pages 33–37. IOS Press, 2004.
[22] Mark Klein and Chrysanthos Dellarocas. Exception handling in agent systems. In Agents, pages 62–68, 1999.
[23] Mark Klein, Juan A. Rodríguez-Aguilar, and Chrysanthos Dellarocas. Using domain-independent exception handling services to enable robust open multi-agent systems: The case of agent death. Autonomous Agents and Multi-Agent Systems, 7(1–2):179–189, 2003.
[24] Nicholas Kushmerick. Software agents and their bodies. Minds and Machines, 7(2):227–247, 1997.
[25] Ashok U. Mallya. Modeling and Enacting Business Processes via Commitment Protocols among Agents. PhD thesis, North Carolina State University, Raleigh, United States, 2005.
[26] Ashok U. Mallya and Munindar P. Singh. Modeling exceptions via commitment protocols. In Autonomous Agents and Multi-Agent Systems, pages 122–129, New York, NY, USA, 2005. ACM Press.
[27] Bertrand Meyer. Disciplined Exceptions. Technical report, Interactive Software Engineering, 1988.
[28] Robert Miller and Anand Tripathi. The Guardian Model and Primitives for Exception Handling in Distributed Systems. IEEE Trans. Software Eng., 30(12):1008–1022, 2004.
[29] Fabio Y. Okuyama, Rafael H. Bordini, and Antônio Carlos da Rocha Costa. Elms: An environment description language for multi-agent simulation. In Danny Weyns, H. Van Dyke Parunak, and Fabien Michel, editors, E4MAS, volume 3374 of Lecture Notes in Computer Science, pages 91–108. Springer, 2004.
[30] David Lorge Parnas and Harald Würges. Response to undesired events in software systems. In International Conference on Software Engineering, pages 437–446, 1976.
[31] H. Van Dyke Parunak. “Go to the Ant”: Engineering Principles from Natural Multi-Agent Systems. Annals of Operations Research, 75:69–101, 1997.
[32] H. Van Dyke Parunak. A survey of environments and mechanisms for human-human stigmergy. In Danny Weyns, H. Van Dyke Parunak, and Fabien Michel, editors, E4MAS, volume 3830 of Lecture Notes in Computer Science, pages 163–186. Springer, 2005.
[33] Eric Platon, Nicolas Sabouret, and Shinichi Honiden. A Definition of Exceptions in Agent-Oriented Computing. In Gregory O’Hare, Michael O’Grady, Oguz Dikenelli, and Alessandro Ricci, editors, Engineering Societies in the Agent World’06, 2006.
[34] Eric Platon, Nicolas Sabouret, and Shinichi Honiden. Challenges for Exception Handling in Multi-agent Systems. In Software Engineering for Large-Scale Multi-Agent Systems, pages 45–50, 2006.
[35] Eric Platon, Nicolas Sabouret, and Shinichi Honiden. Environment Support for Tag Interactions. In Environment for Multi-Agent Systems, 2006.
[36] Anand S. Rao and Michael P. Georgeff. BDI Agents: From Theory to Practice. Technical report, Australian Artificial Intelligence Institute, 1995.
[37] Alexander B. Romanovsky. Exception Handling in Component-Based System Development. In COMPSAC, pages 580–598. IEEE Computer Society, 2001.
[38] Alexander B. Romanovsky, Christophe Dony, Jørgen Lindskov Knudsen, and Anand Tripathi, editors. Advances in Exception Handling Techniques (the book grew out of an ECOOP 2000 workshop), volume 2022 of Lecture Notes in Computer Science. Springer, 2001.
[39] Stuart Russell and Peter Norvig. Artificial Intelligence: A Modern Approach. Prentice Hall, Edition 2003.
[40] Nazaraf Shah, Kuo-Ming Chao, Nick Godwin, and Anne E. James. Exception diagnosis in open multi-agent systems. In Andrzej Skowron, Jean-Paul A. Barthès, Lakhmi C. Jain, Ron Sun, Pierre Morizet-Mahoudeaux, Jiming Liu, and Ning Zhong, editors, IAT, pages 483–486. IEEE Computer Society, 2005.
[41] Nazaraf Shah, Kuo-Ming Chao, Nick Godwin, Anne E. James, and C.-F. Tasi. An empirical evaluation of a sentinel based approach to exception diagnosis in multi-agent systems. In AINA (1), pages 379–386. IEEE Computer Society, 2006.
[42] Nazaraf Shah, Kuo-Ming Chao, Nick Godwin, Muhammad Younas, and Christopher Laing. Exception Diagnosis in Agent-Based Grid Computing. In International Conference on Systems, Man and Cybernetics, pages 3213–3219. IEEE, 2004.
[43] Reid G. Smith. The contract net protocol: High-level communication and control in a distributed problem solver. IEEE Trans. Computers, 29(12):1104–1113, 1980.
[44] Frédéric Souchon, Christophe Dony, Christelle Urtado, and Sylvain Vauttier. Improving Exception Handling in Multi-agent Systems. In Carlos José Pereira de Lucena, Alessandro F. Garcia, Alexander B. Romanovsky, Jaelson Castro, and Paulo S. C. Alencar, editors, SELMAS, volume 2940 of Lecture Notes in Computer Science, pages 167–188. Springer, 2003.
[45] Bjarne Stroustrup. The C++ Programming Language. Addison-Wesley, 2000.
[46] Clemens Szyperski. Component Software. Addison-Wesley, 2002.
[47] Andrew S. Tanenbaum. Distributed Operating Systems. Prentice Hall, 1994.
[48] Anand Tripathi and Robert Miller. Exception handling in agent-oriented systems. In Romanovsky et al. [38], pages 128–146.
[49] Danny Weyns. An Architecture-Centric Approach for Software Engineering with Situated Multiagent Systems. PhD thesis, Katholieke Universiteit Leuven, Leuven, Belgium, October 2006.
[50] Danny Weyns, Andrea Omicini, and James Odell. Environment, First-Order Abstraction in Multiagent Systems. Autonomous Agents and Multi-Agent Systems, 14(1):5–30, February 2007.
[51] Danny Weyns, Elke Steegmans, and Tom Holvoet. Towards Active Perception in Situated Multi-Agent Systems. Special Issue of the Journal on Applied Artificial Intelligence, 18(8–9), 2004.
[52] Jie Xing and Munindar P. Singh. Engineering commitment-based multiagent systems: a temporal logic approach. In AAMAS, pages 891–898. ACM, 2003.
[53] Jie Xing, Feng Wan, Sudhir K. Rustogi, and Munindar P. Singh. Commitment-based interoperation for e-commerce. In ISADS, pages 161–168, 2001.
[54] Jie Xing, Feng Wan, Sudhir Kumar Rustogi, and Munindar P. Singh. A commitment-based approach for business process interoperation. IEICE Transactions on Information and Systems, E84-D(10):1324–1332, 2001.
[55] Jie Xu, Alexander B. Romanovsky, and Brian Randell. Coordinated Exception Handling in Distributed Object Systems: From Model to System Implementation. In ICDCS, pages 12–21, 1998.