INCA: An Agent-based Network Control Architecture

J. Nicklisch, J. Quittek, A. Kind, S. Arao
C&C Research Laboratories, Berlin
NEC Europe Ltd.

Abstract

This paper describes the design and implementation of INCA, an open architecture for the distributed management of multi-service networks and systems applications. The Intelligent Network Control Architecture is populated by stationary and mobile intelligent agents. These agents perform monitoring and control of network and systems components, thereby supporting the integrated management of networks and services. The architecture provides transaction capabilities to control transport and mobility of agents, agent prioritization, and multiple agent code transfer schemes. Managed objects used to access resources on network elements and new system functionalities can be created, distributed, and replaced dynamically. An example INCA application demonstrates that prioritized agents are necessary to support the timely execution of critical tasks. The design of our agent-based network management platform is not bound to a particular programming language or computing environment; the current implementation, however, is based on Java and RMI.

Keywords: Mobile Agents, Network Management Platform

1 Introduction

The centralized or hierarchical approaches to network management, being tightly coupled to the client-server paradigm, show well-known limitations. Given the growing complexity and extent of computer and telecommunication networks, it is commonly agreed that there is a need to study technologies for distributed network management. The goal is to be flexible, scalable, and open to extensions. One of the promising approaches is intelligent software agent technology, which supports not only distributed applications but also application mobility. However, despite the research efforts in this area, agent-based techniques have found little acceptance in the practical network management world (see footnote 1). One of the reasons might be that available agent platforms are general in their design and not customized towards this domain.

As part of the activities at the NEC C&C Research Laboratories in this direction, an open platform for distributed management of multi-service networks, INCA (Intelligent Network Control Architecture), has been developed. The most important difference between INCA and other agent platforms, among several customizations, is reliability. This paper describes the techniques applied in the design of our architecture. INCA is based on software agents which can migrate over the network to perform delegated and/or mobile functions in an intelligent manner. There is also support for open signalling, allowing flexible service creation as well as extensibility and modular replacement of elements, services and functions.

The remainder of this paper is organized as follows: after discussing related work in the following section, we motivate the use of software agents, intelligence, and mobility in management of networks and services in Section 3. The INCA architecture is introduced in Section 4, and the features which make it a reliable platform, well suited for network management and telecommunications applications, are discussed. Measurements demonstrating the usefulness of agent prioritization are presented in Section 5.

2 Related Work

The general advantages of decentralized and agent-based approaches to network management and telecommunications have often been addressed; see, for example, [6, 11, 10, 3, 1]. Currently, several mobile agent environments are available, mostly due to the wide acceptance of the Java environment with its inherent support for code mobility; see [7] for an overview. Four of the major environments for agent-oriented Java programming are Aglets [8], Concordia [19], Odyssey (a Java-based subset of a platform formerly known as Telescript [18]; see footnote 2), and Voyager (see footnote 3). However, all these and many others are general-purpose environments and not customized for network management, where reliability and efficiency are important issues. We argue that high reliability requires transaction capabilities for the interactions of distributed entities. Based on such safe interactions, fault-tolerant control of the itinerary of a mobile agent is possible. For efficiency, agent prioritization and support for different agent code transfer schemes are desirable.

Concrete network management applications focusing on agent intelligence have been described by da Rocha and Westphall [4] and Somers [14], both using stationary agents. Sahai et al. have presented an agent environment customized for network and system management, called Astrolog [13]. They employ mobile agents to support the mobility of the network operator. However, the core network management system is based on a hierarchy of stationary agents. Surprisingly, even in the network management domain there is no flexible environment for mobile intelligent agents offering the means for reliability and efficiency mentioned above.

Footnotes:
1 Exceptions are SNMP and CMIP agents. They are widely used but incorporate a very restricted application of agent technology.
2 Odyssey home page: http://www.genmagic.com/agents/odyssey.html
3 Voyager home page: http://www.objectspace.com/voyager/index.html

3 Network Management with Agents, Intelligence and Mobility

To ensure effective and efficient functioning of large networks as well as the provision of services, and furthermore to address the requirements of network providers, service providers, retailers and users, there is a strong need for a coherent framework supporting automated and more intelligent end-to-end management solutions.

Network management today is typically based on a client-server model with SNMP [2] and CMIP [16] as the de facto standards for monitoring and managing a network of computers and devices. Managed objects, like hosts, routers, bridges, switches, printers etc., provide read/write access to a set of variables through a management process. A management station can then communicate with the management processes in a client-server relationship. Although simple, this centralized approach to network management has some severe drawbacks with regard to scalability, flexibility and performance.

By dividing management functions into mobile, autonomous and intelligent computing entities (i.e. software agents), many of the problems with network and service management can be addressed:

• Scalability is increased, as management is not performed solely by the management station but delegated to distributed management agents.
• Repetitive tasks can be avoided if software agents learn from experience.
• Fault tolerance can be increased through the autonomy and learning capability of management agents.
• Better performance is achieved by moving the management functionality closer to the actual network element, thus reducing network traffic.
• The low-level details of different devices can be hidden behind the agent interface.
• Legacy systems can be integrated by using an agent for inter-operation.

Typical scenarios like service provision across different administrative domains, including accounting and charging, can be handled by a multi-agent approach to distributed network and service management in a more scalable and coherent way than with a centralized approach. This is particularly true for the creation of new services in the telecommunications world. With the advent of high-speed networks based on ATM switching, there is demand for better support when creating and managing new services. Proprietary, low-level interfaces to network devices currently prevent a rapid development of telecommunication applications (see [17]). Control and management of scalable multi-service networks is difficult since the switching software is usually tightly coupled with the individual switching devices. TINA [5], xbind [9] and DCAN [12, 17] focus on providing more coherent and flexible access to multi-service networks. However, control and management of switching devices based on mobile software agents has not been investigated so far.

4 The Architecture

INCA can be described as a mobile agent platform customized for network management and control. This section gives an overview of the architecture and discusses the major design decisions. In particular, we explain the concepts of agent priorities, transaction capabilities, migration control, and multiple agent code transfer schemes, which in combination distinguish INCA from other available agent environments.

4.1 General Architecture

An instance of the INCA platform consists of a set of stations and of services offered locally by a station or globally by the platform.

4.1.1 Local Services

A station is local to a network element and provides the following services:

Concurrent execution of multiple agents. Multiple agents, each with its own thread of control, can be executed concurrently. The agents are scheduled according to their priority attribute by an INCA-specific scheduling scheme, which for example takes into account the current system load. No agent of lower priority will be executed if an agent with higher priority in a 'runnable' state is present at a station.

Loading of agent code. Agent code can be transferred between stations. Three schemes for code transfer are supported: push, pull, and migrate. Pushed code is sent from an agent code repository to a dedicated station, pulled code is loaded by the station from a repository, and migrating code is sent from one (non-repository) station to another. All stations support the same code format, i.e. the same code can be executed at any station. In order to improve performance, caching of agent code is used.

Transfer of agent state. The current state of an agent can be transferred from one station to another. In combination with agent code transfer, this service enables agent migration.

Transaction capabilities. For peer-to-peer communication between stations, a reliable and fault-tolerant layer offering transaction capabilities is used.

Access to local resources. Access to local resources at a station is provided by managed objects. Managed objects represent network elements (or parts of these) that are to be managed. Their interface specifies which resources can be accessed and how. Usually, a station contains at least one instance of a managed object giving access to the resources of the local network element. Additional managed objects can be created dynamically on demand of an agent.

Support for location and status monitoring of agents. The location and status of an agent migrating from station to station may be monitored by a central instance called the locator. For monitored agents, the stations send messages to the locator on arrival and on departure or termination of the agent.

4.1.2 Global Services

Besides the services provided by each single station, there are services offered by the entire platform, or by a set of dedicated stations:

Control. For each INCA platform there is at least one control station. This station executes a stationary control agent offering a control service for launching and monitoring agents. Usually, the location of this station is the console of the network manager.

Repository. Dedicated stations offer a repository service. This service provides access to the code of all agents that can run on the platform. The repository can be accessed via the control service to push agent code to a station, or by any station to pull agent code.

Locator. A unique station runs a stationary locator agent offering the location tracking service. The locator agent receives messages about monitored agents from the stations. Based on this information, the locator agent answers requests from the control agent about the location and status of other agents. Furthermore, the locator agent supervises predefined itineraries (see below).

Itinerary control. The control service allows a mobile agent to be linked with a predefined itinerary. The itinerary becomes part of the agent state, but is not accessed by the agent itself. It is consulted by a station whenever a mobile agent is to be migrated, in order to determine the station to send it to. When an agent with a predefined itinerary is launched by the control agent, a copy of the agent's itinerary is sent to the locator agent, which compares the itinerary with location messages.

Migration control. By combining itinerary control with persistence of agent state, fault tolerance can be increased for the migration of agents.

Inter-agent communication. Communication between agents is offered by a common interface for inter-agent message invocation. This functionality includes a naming service from which agents can get references to other agents.
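To make the location tracking service more concrete, the following is a minimal Java sketch of a locator that records the arrival, departure and termination messages sent by stations and answers queries about an agent. It is an illustration under stated assumptions, not the INCA implementation: the names LocatorSketch, LocationRecord and AgentStatus, the status values, and the use of string identifiers are all hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical status values; the paper does not enumerate them. */
enum AgentStatus { RUNNING, IN_TRANSIT, TERMINATED }

/** Last reported station and status of one monitored agent. */
final class LocationRecord {
    final String station;
    final AgentStatus status;

    LocationRecord(String station, AgentStatus status) {
        this.station = station;
        this.status = status;
    }
}

/** Sketch of a locator that stores the latest message per monitored agent. */
final class LocatorSketch {
    private final Map<String, LocationRecord> records = new HashMap<>();

    /** A station reports that a monitored agent has arrived. */
    void onArrival(String agentId, String station) {
        records.put(agentId, new LocationRecord(station, AgentStatus.RUNNING));
    }

    /** A station reports that a monitored agent has left it. */
    void onDeparture(String agentId, String station) {
        records.put(agentId, new LocationRecord(station, AgentStatus.IN_TRANSIT));
    }

    /** A station reports that a monitored agent has terminated. */
    void onTermination(String agentId, String station) {
        records.put(agentId, new LocationRecord(station, AgentStatus.TERMINATED));
    }

    /** Answers a query from the control agent; null means the agent is unknown. */
    LocationRecord query(String agentId) {
        return records.get(agentId);
    }

    public static void main(String[] args) {
        LocatorSketch locator = new LocatorSketch();
        locator.onArrival("maintenance-1", "station-A");
        locator.onDeparture("maintenance-1", "station-A");
        LocationRecord record = locator.query("maintenance-1");
        System.out.println(record.station + " " + record.status);
    }
}
```

In this sketch a departure message leaves the agent marked as in transit from the station it left until the next arrival message is received, which is the window in which a locator could detect a lost agent.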

4.2 Priorities

Concurrent execution of multiple agents not only increases the functionality of an agent platform, it also introduces the problem of scheduling agents. We observed that a fair and equal scheduling of all agents in a system does not match the requirements of network management applications. The groups of agents listed below illustrate these requirements.

• Network Control Agents support the network manager, e.g. in configuration and fault management of the network. The tasks they pursue are demand driven, perhaps triggered by network faults, and often have to be carried out as fast as possible.
• Service Management Agents support the service provider, e.g. by a service subscription agent. They use the network environment together with the platform services to install and establish end user services, or to reconfigure them.
• Network Maintenance Agents perform recurring management tasks, e.g. the gathering of data from selected network elements or fault analysis. Their tasks are carried out repeatedly, only rarely requiring adjustments by the network management instance.
• Network Monitoring Agents perform routine tasks such as the filtering of raw data collected from network elements. Often, such agents are stationary, since their task is to monitor a concrete element of the network, or a link.

Usually, network control agents should be executed fast, because the network manager is waiting for the result. In contrast, monitoring agents typically run for a long time and should be interrupted when one of the other kinds of agents arrives to be executed. We use agent priorities to satisfy these requirements. An arriving agent of higher priority interrupts an agent of lower priority, and no agent of lower priority is executed while an agent of higher priority is present and not blocked. The problem above can thus be solved by giving network control agents the highest priority, service management agents and network maintenance agents a medium priority, and monitoring agents the lowest priority. An application described in Section 5 demonstrates the usefulness of the priority-based approach.
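The selection rule stated above (no agent of lower priority runs while an agent of higher priority is present and not blocked) can be sketched in a few lines of Java. This is only a simplified illustration: the class names, the numeric priority levels and the first-come tie-breaking are assumptions, and the sketch ignores the system load that the INCA scheduling scheme also takes into account.

```java
import java.util.ArrayList;
import java.util.List;

/** An agent as seen by the scheduler; larger values are more urgent (assumption). */
final class ScheduledAgent {
    final String name;
    final int priority;
    boolean blocked = false; // a blocked agent is not runnable

    ScheduledAgent(String name, int priority) {
        this.name = name;
        this.priority = priority;
    }
}

/** Sketch of strict priority selection at a single station. */
final class PriorityStationSketch {
    // Hypothetical levels following the grouping of agents in the text.
    static final int MONITORING = 1;
    static final int SERVICE_OR_MAINTENANCE = 2;
    static final int CONTROL = 3;

    private final List<ScheduledAgent> agents = new ArrayList<>();

    void admit(ScheduledAgent agent) { agents.add(agent); }

    void remove(ScheduledAgent agent) { agents.remove(agent); }

    /**
     * Returns the runnable agent with the highest priority, or null if no
     * agent is runnable. Among equal priorities the agent admitted first wins.
     */
    ScheduledAgent pickNext() {
        ScheduledAgent best = null;
        for (ScheduledAgent agent : agents) {
            if (!agent.blocked && (best == null || agent.priority > best.priority)) {
                best = agent;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        PriorityStationSketch station = new PriorityStationSketch();
        station.admit(new ScheduledAgent("monitor-1", MONITORING));
        station.admit(new ScheduledAgent("control-1", CONTROL));
        System.out.println("next: " + station.pickNext().name);
    }
}
```

Calling pickNext() after every arrival, departure or change of the blocked flag yields the preemptive behaviour described in the text: an arriving control agent is selected ahead of any monitoring agent, and the monitoring agents resume once it blocks or leaves.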

4.3 Transaction Capabilities

Most interactions in INCA are peer-to-peer communications between stations. In the platform we have a clear separation of different layers of communication facilities. INCA is designed to run on top of an object-oriented distributed middleware providing a naming service and transparent access to objects regardless of their physical location. Between the middleware and the application, INCA contains another layer, called transaction capabilities, which provides reliable, fault-tolerant interactions between two stations. The design of this layer is based on the SS7 (Common Channel Signalling System No. 7) transaction capabilities [15]. SS7 transaction capabilities use distributed transaction monitors to ensure the reliable exchange of messages between two communicating peers. In the INCA platform we used this technique to support asynchronous as well as synchronous exchange of messages. On the application level these two are sufficient to express all interactions of two stations conveniently. Retransmission attempts of messages, time-outs, health checks, and further means provide reliable interactions.
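As a rough illustration of the retransmission and time-out mechanisms mentioned above, the following Java sketch shows a synchronous exchange that resends a message until a reply arrives or a bounded number of attempts is exhausted. It is not an implementation of the SS7 transaction capabilities and not the INCA layer itself; PeerTransport, ReliableExchangeSketch and LossyTransport are hypothetical names, and the convention that a null reply signals a time-out is an assumption made for brevity.

```java
/** Hypothetical transport to a peer station; returns null if no reply arrives in time. */
interface PeerTransport {
    String sendAndAwaitReply(String message, long timeoutMillis);
}

/** Sketch of a synchronous exchange with bounded retransmission. */
final class ReliableExchangeSketch {
    private final PeerTransport transport;
    private final int maxAttempts;
    private final long timeoutMillis;

    ReliableExchangeSketch(PeerTransport transport, int maxAttempts, long timeoutMillis) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        this.transport = transport;
        this.maxAttempts = maxAttempts;
        this.timeoutMillis = timeoutMillis;
    }

    /** Retransmits the message until a reply arrives or the attempts run out. */
    String request(String message) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            String reply = transport.sendAndAwaitReply(message, timeoutMillis);
            if (reply != null) {
                return reply;
            }
        }
        throw new IllegalStateException("no reply after " + maxAttempts + " attempts");
    }

    public static void main(String[] args) {
        LossyTransport link = new LossyTransport(2);
        ReliableExchangeSketch exchange = new ReliableExchangeSketch(link, 3, 100L);
        System.out.println(exchange.request("status?") + " after " + link.sendCount + " sends");
    }
}

/** Stand-in for an unreliable link that loses the first few messages. */
final class LossyTransport implements PeerTransport {
    private int lossesLeft;
    int sendCount = 0;

    LossyTransport(int losses) {
        this.lossesLeft = losses;
    }

    @Override
    public String sendAndAwaitReply(String message, long timeoutMillis) {
        sendCount++;
        if (lossesLeft > 0) {
            lossesLeft--;
            return null; // simulated time-out
        }
        return "ack:" + message;
    }
}
```

A caller that catches the final IllegalStateException can treat the peer station as failed, which is the point at which a recovery procedure such as migration control would take over.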

4.4 Migration Control

When an agent with a predefined itinerary is started, a copy of the itinerary is stored at the locator. A failure at the station currently hosting the mobile agent does not necessarily lead to a breakdown of the agent's task, since an up-to-date copy of the agent's itinerary is kept, and a new agent can be created in order to visit the remaining stations of the itinerary. Of course, this fault-tolerant procedure should only be applied to agents which have been designed for it. In contrast to this concept, mobile agents might not make use of such an itinerary at all, but instead determine the hosts to be visited dynamically, at runtime. It depends on the particular application which of the two methods should be employed. We observe that applications of mobile agents which require high reliability, as is the case for many network management tasks, can make good use of the itinerary concept.

To understand why that is the case, consider the following example. Suppose that a mobile maintenance agent is used to perform a routine task at all the network elements, or a fixed subset of them. An example could be an agent which examines the network elements and checks the current memory usage and the disk space used. In case of an alarming situation at a network element, it sends a message to the network manager; otherwise it proceeds along its itinerary. In such an example there is actually no state involved in the agent migration, apart from the information contained in the itinerary. Now consider a breakdown at one of the intermediate stations. Without any observing instance, the work done by the mobile agent, i.e. the confirmation of a certain system state at the stations visited, would be lost. The platform would have to instantiate another agent and start from scratch, because it cannot know how far the agent had proceeded with its task. And in case certain adjustments were made by the agent at the network elements, these would perhaps have to be overwritten. Since the system employs the locator, however, it can recover from such simple error cases and let another instance of the maintenance agent continue the task from the station following the one which experienced the breakdown. Excessive system and network load can be avoided, because this second instance does not need to repeat its predecessor's work.

Clearly, keeping a copy of an agent's itinerary is not the full story, but only the simplest case of a more general recovery scheme. When the agent state actually matters, as is the case for more complicated or more intelligent agents, knowledge of the itinerary alone is insufficient. Instead, the whole agent state should be preserved at the source station of every agent migration. Then, if the destination station experiences a fault, the locator instance times out while waiting for the message that confirms the agent's arrival. It can therefore take further action and restore the agent from the state previously cached at the source station.
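The simple recovery case for stateless agents can be illustrated with a short Java sketch. The locator keeps its copy of the itinerary, counts the arrival messages that match it, and, after a breakdown of the station currently hosting the agent, computes the stations a replacement agent still has to visit. The class and method names are hypothetical, stations are identified by plain strings for illustration, and the sketch does not cover restoring cached agent state.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Sketch of the locator's bookkeeping for one agent with a predefined itinerary. */
final class ItineraryRecoverySketch {
    private final List<String> itinerary; // the locator's copy of the itinerary
    private int visited = 0;              // number of confirmed arrivals

    ItineraryRecoverySketch(List<String> itinerary) {
        this.itinerary = new ArrayList<>(itinerary);
    }

    /** Records an arrival message; returns false if it does not match the itinerary. */
    boolean confirmArrival(String station) {
        if (visited < itinerary.size() && itinerary.get(visited).equals(station)) {
            visited++;
            return true;
        }
        return false;
    }

    /** The station currently hosting the agent, or null before the first arrival. */
    String currentStation() {
        return visited == 0 ? null : itinerary.get(visited - 1);
    }

    /** Stations a replacement agent should visit after the current host has broken down. */
    List<String> remainingAfterBreakdown() {
        return new ArrayList<>(itinerary.subList(visited, itinerary.size()));
    }

    public static void main(String[] args) {
        ItineraryRecoverySketch locator =
                new ItineraryRecoverySketch(Arrays.asList("A", "B", "C", "D"));
        locator.confirmArrival("A");
        locator.confirmArrival("B");
        System.out.println("host " + locator.currentStation()
                + " failed, replacement visits " + locator.remainingAfterBreakdown());
    }
}
```

The replacement agent starts at the station following the failed one, so the stations already confirmed are not visited again.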

4.5 Multiple Agent Code Transfer Schemes

INCA uses mobile code for all agents. For stationary agents the agent code has to be transferred to a single station only, whereas mobile agents require the code on all visited stations. When migrating, the state of a mobile agent has to be transferred from the station the agent is leaving to the station the agent is going to visit next. But since the code of the agent is independent of the current agent instance and its state, it can be transferred in different ways, and it can be cached at stations. INCA supports three schemes for code distribution: pull, push, and migrate.

Figure 1: Push type agent code distribution

Push type agent code distribution. This scheme is known from Internet services like PointCast (see footnote 4) or Marimba's Castanet (see footnote 5). While these services push user data to a list of subscribers, the push scheme in INCA pushes agent code to a list of stations. Usually, the push scheme is used in conjunction with predefined itineraries. When the agent is launched, the code is pushed to all stations of the itinerary. So, when the agent instance arrives at a station, the corresponding code is already there and it can start execution immediately. The push scheme is realized by sending a message to the repository station, which pushes the code to the stations.

Footnotes:
4 PointCast Network: http://www.pointcast.com
5 Marimba's Castanet: http://www.marimba.com/new

Figure 2: Pull type agent code distribution

Pull type agent code distribution. The pull scheme is probably the most widespread one, since it is used by web browsers to download Java applets and scripts. When this scheme is used, a station downloads the corresponding agent code from a repository station after an agent instance has arrived. This scheme can be applied to mobile agents with flexible itineraries. These agents may decide dynamically which station to visit next. The drawback of this scheme is a decrease in performance. When an agent arrives at a station, its execution has to be delayed until the agent code has been downloaded. Agent code caching can avoid this delay for subsequent arrivals of agents of the same type. An alternative would be pushing the code of the agent to all stations, but this solution reduces scalability, produces unnecessary load on the network, and might increase the memory requirements of the stations.

Figure 3: Migrate type agent code distribution

Migrate type agent code distribution. This scheme is the most obvious one to use for mobile agents. Code is moved together with the agent instance from station to station. Usually, the performance is lower than that of the push scheme, but higher than that of the pull scheme. Compared to the push scheme, the migration time is longer, because both agent instance and code have to be transferred. Compared to the pull scheme, communication with a repository station is only required when launching an agent. After launch the agent is more flexible, because it is independent of the availability of a repository.

Since all three distribution schemes have advantages as well as shortcomings, it depends on the application, and on the type of task the mobile agent has to work on, which scheme is preferable.

The availability of more than one scheme also gives more flexibility for cache management. All stations provide caching of agent code, in order to improve agent performance and to reduce communication load on the network. But since the cache size is limited, cached code has to be dropped from time to time. Now, if push type agent code has been loaded, this code may be dropped by the cache manager, if required, even before the agent which is going to use it has arrived. In this case, the agent code is loaded according to the pull scheme when the agent arrives. There are good reasons to consider dropping pushed agent code from the cache before the agent has arrived. The agent might be delayed because of congestion in the network or because of its low priority, or it might have terminated irregularly. In the latter case it will never arrive at some stations, and it would be wrong to keep the code in the cache indefinitely.
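The interplay between pushed code, a bounded cache and the pull fallback can be sketched as follows. This is a minimal Java illustration under assumptions: the paper does not state the cache replacement policy, so a least-recently-used policy built on java.util.LinkedHashMap is used here as a stand-in, and the names CodeRepository and AgentCodeCacheSketch as well as the representation of agent code as a byte array are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical repository interface: returns the code for an agent type. */
interface CodeRepository {
    byte[] fetch(String agentType);
}

/** Sketch of a station-side code cache with pull fallback. */
final class AgentCodeCacheSketch {
    private final CodeRepository repository;
    private final Map<String, byte[]> cache;
    int pulls = 0; // number of downloads from the repository, kept for illustration

    AgentCodeCacheSketch(final int capacity, CodeRepository repository) {
        this.repository = repository;
        // Access-ordered map that drops the least recently used entry (assumed policy).
        this.cache = new LinkedHashMap<String, byte[]>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, byte[]> eldest) {
                return size() > capacity;
            }
        };
    }

    /** Push scheme: code arrives ahead of the agent and is cached. */
    void acceptPushedCode(String agentType, byte[] code) {
        cache.put(agentType, code);
    }

    /** On agent arrival: use cached code if present, otherwise pull it. */
    byte[] codeForArrivingAgent(String agentType) {
        byte[] code = cache.get(agentType);
        if (code == null) {
            code = repository.fetch(agentType);
            pulls++;
            cache.put(agentType, code);
        }
        return code;
    }

    public static void main(String[] args) {
        AgentCodeCacheSketch station =
                new AgentCodeCacheSketch(1, agentType -> new byte[] {42});
        station.acceptPushedCode("monitor", new byte[] {1});
        station.acceptPushedCode("control", new byte[] {2}); // drops "monitor"
        station.codeForArrivingAgent("monitor");             // falls back to pull
        System.out.println("pulls: " + station.pulls);
    }
}
```

With a capacity of one, pushing a second agent type drops the first, so the late-arriving agent of the first type triggers a single pull, which mirrors the situation described in the last paragraph of this section.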

4.6 Implementation

The current implementation of INCA is based on the Java language. Java already supports code mobility and dynamic class loading. However, we replaced the Java class loader in order to be able to load code from the repository station and to support multiple code mobility schemes. As communication infrastructure we use Java RMI, but CORBA is also supported.

Agents are implemented as Java threads, with agent priorities translated into Java thread priorities. This mapping follows the INCA system policy imposed on the various agent types (see Section 4.2), and takes into account the load level of the current station (e.g. the memory usage). As such it is more powerful than Java's priority-based scheduling.

Agent instance transfer is based on Java object serialization. The state of an agent is represented by an object which can be sent from station to station in its serialized representation.

A set of managed objects is provided as a library. Network elements like workstations or servers which can host an INCA station are supported by managed objects with direct access to the network element. Network elements without the capability to run a Java virtual machine, like routers and ATM switches, are supported by proxy managed objects accessing these elements via the local area network. However, our experience with INCA showed that for many applications co-development of agents and managed objects is desirable.
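Agent instance transfer by object serialization can be illustrated with the standard java.io API. The following sketch turns a state object into a byte array, as a sending station might do, and restores it, as a receiving station might do. The AgentStateSketch class and its fields are invented for illustration; the actual layout of INCA agent state is not described in this paper, and the transport of the bytes (RMI or CORBA) is left out.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical agent state; the fields are invented for illustration. */
final class AgentStateSketch implements Serializable {
    private static final long serialVersionUID = 1L;

    final String agentId;
    final int priority;
    final List<String> findings = new ArrayList<>();

    AgentStateSketch(String agentId, int priority) {
        this.agentId = agentId;
        this.priority = priority;
    }
}

/** Sketch of state transfer: serialize at the source, deserialize at the destination. */
final class StateTransferSketch {
    static byte[] toBytes(AgentStateSketch state) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(state);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray(); // read after the stream has been closed and flushed
    }

    static AgentStateSketch fromBytes(byte[] bytes) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            return (AgentStateSketch) in.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            // Corresponds to the agent code not yet being available at the station.
            throw new IllegalStateException("agent class not available", e);
        }
    }

    public static void main(String[] args) {
        AgentStateSketch state = new AgentStateSketch("maintenance-1", 2);
        state.findings.add("station-A: memory usage checked");
        AgentStateSketch copy = fromBytes(toBytes(state));
        System.out.println(copy.agentId + " " + copy.findings);
    }
}
```

Deserialization only succeeds if the class of the state object can be loaded at the receiving station, which is why state transfer has to be combined with one of the code transfer schemes of Section 4.5.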

5 Impact of Prioritization

In order to demonstrate the usefulness of agent prioritization for network management applications, we chose a simple scenario. It consists of a set of connected LANs. At each LAN there is one network element hosting an INCA station. Each INCA station provides a set of managed objects allowing agents to access the network elements in the LAN. Furthermore, we consider a management station launching agents for monitoring and control. Monitoring agents are stationary and monitor dedicated network elements of a LAN. Thus, there is typically more than one monitoring agent at a station. In contrast to these continuously running monitoring agents, control agents are launched interactively by a network operator to perform a usually short task. For these agents, instantaneous task completion is desired.

Figure 5a) shows measurements with three INCA stations, each of them hosting the same variable number of monitoring agents. These agents all have the same priority. Then a control agent is launched to visit each of these stations once to perform its task. If this agent has the same priority as the monitoring agents (dashed line), the time required for its completion increases linearly with the number of threads per station. If, instead, it has a higher priority than the monitoring agents (solid line), the time required for its execution is almost independent of the number of monitoring agents, and the reaction to the operator is always as fast as without any monitoring agents.

Figure 5b) shows a similar measurement. Here the number of threads per station is a constant of two for all stations, but the number of stations is varied. Again, the advantage of high priorities for control agents can be observed. The measurements were conducted on Windows NT 4.0 workstations with Pentium 166 processors, connected by 10BaseT Ethernet.

Figure 4: Measurement Setup

6 Conclusion

INCA, the Intelligent Network Control and Management Architecture, is an infrastructure to support applications in the integrated network and service management and telecommunication areas. INCA is especially targeted at distributed management applications based on intelligent mobile agents. This technology overcomes several of the inherent difficulties of centralized, client-server based management systems. The architecture provides prioritized agents, transaction capabilities, migration control, and multiple agent code transfer schemes to support the creation of highly reliable and efficient applications. The impact of agent prioritization has been demonstrated for an example network management application.

7 Future Work

So far, INCA agents are implemented in a straightforward manner, with the algorithms coded directly into them. Although this approach is sufficient for smaller applications and quick prototyping, we are going to extend the INCA platform with libraries for agent intelligence. Hence, our main work is currently focused on choosing a proper goal representation and problem-solving formalism. We are also aware of the importance of security in agent systems, in particular in network management. Various security mechanisms have already been proposed, but it is still subject to research which of them is best suited for mobile agents in network management.

Figure 5: Execution time of a control agent (execution time in seconds, plotted against a) threads per station and b) number of stations)

8 Acknowledgments

This work was carried out as part of our efforts towards maintainable high-speed networks for multimedia communications, at the NEC C&C Research Laboratories in Berlin. We are grateful to S. Iwasaki and N. Elshiewy for valuable feedback and comments, and we would like to thank S. Robidou for implementing INCA applications and performing the measurements used in this paper.

References

[1] Baldi, M., Gai, S., and Picco, G. P. Exploiting code mobility in decentralized and flexible network management. In Proc. First Int. Workshop on Mobile Agents (Berlin, 1997), K. Rothermel and R. Popescu-Zeletin, Eds., vol. 1219 of Lecture Notes in Computer Science, Springer-Verlag, Berlin, pp. 13–26.
[2] Case, J., Fedor, M., Schoffstall, M., and Davin, J. A simple network management protocol (SNMP); RFC-1157. Internet Request for Comments, 1157 (May 1990).
[3] Charlton, P., Chen, Y., Mamdani, E., Olsson, O., Pitt, J., Somers, F., and Wearn, A. An open agent architecture for integrating multimedia services. In Proceedings of the 1st International Conference on Autonomous Agents (New York, February 5–8 1997), W. L. Johnson and B. Hayes-Roth, Eds., ACM Press, pp. 522–523.
[4] da Rocha, M. A., and Westphall, C. B. Proactive management of computer networks using artificial intelligence agents and techniques. In The Fifth IFIP/IEEE International Symposium on Integrated Network Management (1997), Chapman & Hall, pp. 610–621.
[5] de la Fuente, L. A., Pavon, J., and Singer, N. Application of TINA-C architecture to management services. Lecture Notes in Computer Science 851.
[6] Goldszmidt, G., and Yemini, Y. Distributed management by delegation. In 15th International Conference on Distributed Computing Systems (1995), IEEE Computer Society.
[7] Kiniry, J., and Zimmerman, D. A hands-on look at Java mobile agents. IEEE Internet Computing 1, 4 (1997), 21–30.
[8] Lange, D. B., Oshima, M., Karjoth, G., and Kosaka, K. Aglets: Programming mobile agents in Java. In 1st Int'l Conf. on Worldwide Computing and Its Applications '97 (WWCA97) (Mar. 1997), T. Masuda, Y. Masunaga, and M. Tsukamoto, Eds., LNCS, SV.
[9] Lazar, A. A. Programming telecommunication networks. IEEE Network (1997), 8–18.
[10] Magedanz, T., Rothermel, K., and Krause, S. Intelligent agents: An emerging technology for next generation telecommunications? In IEEE INFOCOM 1996 (1996).
[11] Meyer, K., Erlinger, M., Betser, J., Sunshine, C., Goldszmidt, G., and Yemini, Y. Decentralizing control and intelligence in network management. In Integrated Network Management IV (1995), A. S. Sethi, Y. Raynaud, and F. Faure-Vincent, Eds., Chapman & Hall, pp. 4–16.
[12] Rooney, S. The Hollowman: an innovative ATM control architecture. In IM (1997).
[13] Sahai, A., Billiart, S., and Morin, C. Astrolog: A distributed and dynamic environment for network and system management. In Proceedings of the 1st European Information Infrastructure User Conference (February 1997).
[14] Somers, F. Hybrid: Intelligent agents for distributed ATM network management. IATA workshop at ECAI'96, Budapest, August 1996.
[15] Signalling System No.7 – Functional Description of Transaction Capabilities, March 1993. ITU-T Recommendation Q.771.
[16] Stallings, W. SNMP, SNMPv2 and CMIP: The Practical Guide to Network-Management Standards. Addison-Wesley, Reading, 1993.
[17] van der Merve, K., and Leslie, I. M. Switchlets and dynamic virtual ATM networks. In IM (1997).
[18] White, J. E. Telescript technology: The foundation for the electronic marketplace. White paper, General Magic, Inc., 2465 Latham Street, Mountain View, CA 94040, 1994.
[19] Wong, D., et al. Concordia: An Infrastructure for Collaborating Mobile Agents. In Proc. First Int. Workshop on Mobile Agents (Berlin, 1997), K. Rothermel and R. Popescu-Zeletin, Eds., vol. 1219 of Lecture Notes in Computer Science, Springer-Verlag, Berlin, pp. 86–97.
