Decomposition Patterns for Mobile-Code-based Management
Antonio Liotta, Graham Knight
Department of Computer Science, University College London
Gower Street, London WC1E 6BT - UK
Phone: +44-171-419-3679; Fax: +44-171-387-1397
E-mail: [email protected]

ABSTRACT. The general aim of this paper is to discuss the feasibility of Mobile-code-based Network and Systems Management from a manager's perspective. We consider a decomposition process that starts off with the description of management tasks and leads to the deployment of the corresponding Mobile Code. We identify and discuss the main issues related to this complex process, highlight some of their hurdles, and present our perspective on them.

Keywords: Mobile Code, Mobile Agents, Distributed Network and Systems Management, Task Decomposition.
1 Introduction
The "Management by Delegation" paradigm, introduced by Yemini and Goldszmidt in 1991 [Y. Yemini, G. G. Goldszmidt, S. Yemini, Network Management by Delegation. Integrated Network Management II, Amsterdam 1991], has sparked widespread interest in the integration of two different research areas: "Distributed Network and System Management" and "Code Mobility". Many other researchers have described major limitations of the current centralised management systems - e.g., lack of flexibility and scalability [M-A. Mountzia, A Distributed Management Approach Based on Flexible Agents. Interoperable Communication Networks, Baltzer Science Publishers, Volume I/I, January 1998. ISSN 13859501; J-CH Gregoire, Models and Support Mechanisms for Distributed Management. Integrated Network Management IV. New York: Chapman & Hall, 1995; T. Magedanz, On the impacts of Intelligent Agent Concepts on Future Telecommunication Environments. In Proc. of the 3rd International Conference on Intelligence in Broadband Services and Networks IS&N 1995, Heraklion, Crete, Greece, October 16-20 1995] - and are now looking to the latest Mobile Agent (see footnote 1) technologies as a potential means of improving management through decentralisation. However, merging Management with Code Mobility still poses a number of interesting and stimulating, albeit hard, problems [M. Straßer, M. Schwehm, A Performance Model for Mobile Agent Systems. Proc. of the International Conference on Parallel and Distributed Processing Techniques and Applications (PDPTA'97), Volume II, Editor H. R. Arabnia, Las Vegas 1997, Pages 1132-1140; M. Baldi, G.P. Picco, Evaluating the Tradeoffs of Mobile Code Design Paradigms in Network Management Applications. Proc. of the 20th International Conference on Software Engineering (ICSE'98), R. Kemmerer and K. Futatsugi, eds., Kyoto, Japan, April 19-25, 1998]. In fact, mobile-code-based management systems are likely to be quite complex to design, implement and maintain, in contrast to the relative simplicity of the current centralised models. Besides, it is still unclear whether or how mobility of code can be effectively used for management purposes. Some researchers have expressed doubts about the actual advantages of code mobility; in fact, no comparative study of the different management models has been carried out so far. Finally, although some work has been carried out on Agent Design [Yariv Aridor and Danny B. Lange, Agent Design Patterns: Elements of Agent Applications Design. Second International Conference on Autonomous Agents (Agents '98), May 1998] and on Architectural Styles for Agent Distribution [C. Weir, Architectural Styles for Distribution, Using macro-patterns for system design. Second European Conference on Pattern Languages of Programming, EuroPLoP'97. June 1997], the application of these design techniques to the field of Management has not been investigated so far.

Footnote 1: For simplicity, in this paper we use the term "Mobile Agent" (MA) as a synonym for "Mobile Code" (MC).

The general aim of this paper is to discuss the feasibility of mobile-code-based management from a manager's perspective. We consider a decomposition process that starts off with the description of management tasks and leads to the deployment of the corresponding Mobile Code. We identify and discuss the main issues related to this complex process, highlight some of their hurdles, and present our perspective on them. Given the complexity of the matter, we focus our discussion on the specification and decomposition of monitoring tasks for network and system management.

The paper consists of three parts. First, we consider the specification of management tasks in a delegation-aware fashion - i.e., in such a way that tasks can be decomposed into sub-tasks that are meant to be delegated to remote locations. We envisage a task description language for this purpose, describing as an example a method for defining delegatable monitoring tasks.
In the second part we describe some of the decomposition ingredients, that is the characteristics that have to be taken into account during the decomposition process. We describe different patterns of code mobility and show how overheads introduced by mobility can be accounted for. Finally, we outline the actual decomposition process of a sample task discussing the tradeoffs.
2 Specifying Monitoring Tasks
We may distinguish between two different levels in a task-specification language: a high-level task-specification language and a low-level task-specification language. The former can be used by a manager to specify a monitoring task regardless of the way the task itself is then implemented. For instance, a manager may specify a monitoring operation, a set of target locations to be monitored, and a set of monitoring constraints - e.g., maximum admitted error, maximum response time, etc. An example may be: "find out the most congested network interface within a certain internet domain; maximum response time: 1 minute". Decomposing high-level-specified tasks such as this one into a set of sub-tasks (to be implemented with different segments of MC) may be extremely complicated and is beyond our current aims. In contrast, we have chosen to specify our tasks by means of a lower-level specification language that is intended to make possible the automatic decomposition of tasks and their implementation with MC. In the remaining part of this section we give a qualitative and graphical description of the low-level task specification language. A task can be specified as a graph (Figure 1) where at each node data values can be processed by a pair of functions, F{frow,i; fcol,i} (we also refer to it as the "F operator"), and filtered by a logical operator, Li. In this example objects "a", "b" and "c" are polled by blocks 1, 4 and 6 respectively. Data values are processed by the respective functions and results are stored in objects 1, 4 and 6 respectively. Similarly, object 1 is polled by block 2, object 4 by block 5, and so on.
Figure 1. Task Specification Graph. (Nodes: monitored objects a, b and c; blocks F1,L1 to F6,L6 and Fm,Lm.)
A more detailed functional architecture of the basic operational block is shown in Figure 2. Each block communicates with its adjacent blocks according to three different models: polling, notification and alarm generation. Each block has three input and three output ports, depending on the communication model adopted. Input data may be collected by polling a set of blocks. In this case a sampling period, Sp, must be specified. Alternatively, a set of blocks may provide a periodic notification of input data. In this case a notification period, Np, must be specified. Finally, a block may receive alarms generated by other blocks. In the latter two cases the block has to subscribe to the other blocks in order to receive either their notifications or their alarms.
Figure 2. The basic functional block. (Input and output ports: polls, notifications, alarms; parameters: Sp, Np, Ap, Op; operators: frow,i, fcol,i, Li.)
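The three communication models of the block can be illustrated with a minimal Python sketch. The class name `Block`, its methods and the message tuples are hypothetical, introduced only for illustration; the timing aspects (Sp and Np) are deliberately left out, so that only the pull model (polling) and the two subscription-based push models (notifications and alarms) are shown.

```python
class Block:
    """Hypothetical functional block exposing the three communication models."""

    def __init__(self, name):
        self.name = name
        self.value = None      # latest result, available to pollers
        self.received = []     # notifications and alarms pushed by other blocks
        self._subscribers = {"notification": [], "alarm": []}

    def poll(self):
        """Polling: the caller pulls the latest value."""
        return self.value

    def subscribe(self, subscriber, kind):
        """Register another block for 'notification' or 'alarm' messages."""
        self._subscribers[kind].append(subscriber)

    def publish(self, value, alarm=False):
        """Store a new result, notify subscribers, and raise an alarm if asked."""
        self.value = value
        for subscriber in self._subscribers["notification"]:
            subscriber.received.append(("notification", self.name, value))
        if alarm:
            for subscriber in self._subscribers["alarm"]:
                subscriber.received.append(("alarm", self.name, value))


source, listener, watchdog = Block("source"), Block("listener"), Block("watchdog")
source.subscribe(listener, "notification")
source.subscribe(watchdog, "alarm")
source.publish(1.5)
source.publish(3.0, alarm=True)
```

In this sketch `listener` receives both values, `watchdog` receives only the alarm, and any block can still obtain the latest value through `source.poll()` without subscribing.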
Another important parameter is the Analysing Period, Ap, that is the interval between two successive executions of the block operators (frow,i, fcol,i and Li). The input data set is a two-dimensional matrix, having Ap/Sp columns and a number of rows equal to the number of monitored objects. The two functions (frow,i and fcol,i) can be as simple as an arithmetical operator or can be more sophisticated mathematical functions. They can also be small scripts or applications - e.g., a UNIX shell script or a Java application. frow,i uses the matrix rows as input operands; it is applied to each of the rows, producing a one-dimensional array. Then, fcol,i uses this array as operand to generate the output. We can have different cases leading to either a single-value or a multiple-value output, depending on the type of operands. The last parameter characterising a block is the Observation Period, Op, that is basically its lifetime. Thus, Ap/Sp is the number of input samples that are processed by frow,i for each of the input objects, whereas Op/Ap is the total number of times the operators are re-executed. The basic monitoring parameters - i.e., Sp, Np, Ap and Op - usually satisfy the following relation:

Op >= Ap >= Sp ≈ Np

Some of the special cases that can be encountered are:
• Sp = Ap. The F operator is evaluated every Sp seconds. Only one sample is used as input value for frow. The operator is executed Op/Sp times.
• Op = Ap. The F operator is only applied once - i.e., to the first set of input samples.
• Op = Ap = Sp. As in the previous case, but here only one input value is used. The operator has a very short life.
• Op >> Ap >> Sp. This represents the most general case. The first relation indicates that the functional block lives for a relatively long time. The second shows that the operator collects a great number of samples before processing them.
• L = 0. No logical condition is applied to F. Hence no events are checked for and no alarms are generated. In other words, either the polling mechanism or the notification one is enabled and no event-driven communication is supported.
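The way a block processes its input matrix can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `block_schedule` and `run_block` are hypothetical, the sample values are invented, and the case L = 0 is modelled by passing no condition at all.

```python
from statistics import mean


def block_schedule(s_p, a_p, o_p):
    """Return (samples per object per execution, number of executions)."""
    return a_p // s_p, o_p // a_p


def run_block(samples, f_row, f_col, condition=None):
    """Evaluate the block operators once, over one analysing period.

    samples holds one row per monitored object and Ap/Sp columns.
    """
    row_results = [f_row(row) for row in samples]  # one value per object
    output = f_col(row_results)
    # L = 0 (no logical condition) is modelled here as condition=None.
    alarm = condition(output) if condition is not None else False
    return output, alarm


# Hypothetical data: two monitored objects, Ap/Sp = 3 samples each.
samples = [[1.0, 2.0, 3.0], [4.0, 4.0, 4.0]]
output, alarm = run_block(
    samples,
    f_row=mean,                                # smoothing per object
    f_col=lambda values: values,               # pass the array through
    condition=lambda values: max(values) > 2,  # logical operator L
)
```

With these inputs the block yields one smoothed value per object and raises an alarm because the largest of them exceeds 2. `block_schedule(1, 10, 100)` gives 10 samples per row and 10 executions, which corresponds to the general case Op >> Ap >> Sp on a small scale.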
2.1 Sample block
A simple polling operator with additional smoothing may be defined as a block having frow=AVG and fcol=1. In the example of Figure 3 a sample is collected every second from four monitored objects (Sp=1) and 10 samples are buffered (Ap/Sp=10). The matrix size is then 4x10. Four data values are generated by F (one per monitored object), each containing the average of ten polled samples. The F operator is applied 10 times (Op=100 and each cycle involves 10 polls). Moreover, the max of the four output data values is evaluated and, if this max is greater than 2 (L{max>2}), an alarm is notified to each of the objects that have registered for it (only Od in this example).

Figure 3. A simple polling operator with smoothing and events generation. (Recoverable labels: Sp=1, Ap=10, Op=100; polled inputs 1 to 4; operators AVG, 1 and max>2; alarm output to Od.)

2.2 A monitoring task
An example monitoring task in the field of System Management may be expressed as follows:

TASK: Provide a snapshot of the total average CPU load for a given set of networked computers (generally belonging to different Internet domains). Notify an alarm to the manager if one of the following conditions occurs:
• the average load in any of the computers exceeds threshold T1;
• the overall average load exceeds threshold T2;
• the overall maximum load exceeds threshold T3.

This is a simplified example of a performance monitoring task. The system load average provides a convenient way to summarise the activity on a system. In a UNIX environment it may be defined as the average number of processes in the kernel's run queue during an interval [M. Loukides, System Performance Tuning. O'Reilly & Associates, Inc, 1990. ISBN 0-937175-60-9]. The UNIX command uptime provides an estimate of a computer system's load average, whereas the command ruptime provides the same estimate for a Local Area Network (LAN). ruptime is usually disabled in large LANs because of the heavy traffic it may introduce. An MC-based version of ruptime can have three major advantages. First, it may provide a load estimate for systems spanning different LANs. Second, it may lead to a significant reduction in the network load introduced by this procedure. Finally, it may lead to a reduction in the overall response time of the procedure. The latter two features are mainly due to the task distribution.

A graphical specification of the above monitoring task is represented in Figure 4. There are three main blocks. Each of them can generate alarms; the results from the second and third one can also be retrieved via polling.

Figure 4. A simple performance monitoring task. (Recoverable labels: blocks uptime, avg and max, with alarm conditions >T1, >T2 and >T3; period P and lifetime Op; Managed Domain.)

The task is executed every P seconds and lives for Op seconds. Thus, it is executed Op/P times in total. Notice that in this example we have Np=Ap=Sp.
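One execution of the CPU-load task can be sketched as follows. The function name `evaluate_load_task`, the alarm labels, the host names and the threshold values are all hypothetical; the load values stand in for what `uptime` would report on each host, and the sketch says nothing about how the work is distributed among MAs.

```python
from statistics import mean


def evaluate_load_task(host_loads, t1, t2, t3):
    """Check the three alarm conditions for one execution of the task.

    host_loads maps a host name to its load average, e.g. as reported
    by `uptime` on that host.
    """
    loads = list(host_loads.values())
    overall_avg = mean(loads)
    overall_max = max(loads)
    # Condition 1: the load of any single computer exceeds T1.
    alarms = [("host_load", host)
              for host, load in sorted(host_loads.items()) if load > t1]
    # Condition 2: the overall average load exceeds T2.
    if overall_avg > t2:
        alarms.append(("overall_avg", overall_avg))
    # Condition 3: the overall maximum load exceeds T3.
    if overall_max > t3:
        alarms.append(("overall_max", overall_max))
    return overall_avg, alarms


# Hypothetical loads and thresholds.
snapshot, alarms = evaluate_load_task({"host-a": 1.0, "host-b": 3.0},
                                      t1=2.5, t2=1.5, t3=4.0)
```

Here the snapshot is the overall average load, and two alarms are raised: one for host-b, whose load exceeds T1, and one because the overall average exceeds T2. In the task as specified this evaluation would be repeated every P seconds, Op/P times in total.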
3 Decomposition Ingredients
Decomposing a task is not easy, as there are several ingredients to be taken into account (Figure 5). First of all it is essential to extract as much information as possible from the task specification. The optimisation strategy and the constraints might be crucial. Response time and bandwidth conservation are two typical optimisation strategies that can lead to totally different decompositions. For example, if we want to minimise response time, a reasonable solution might be to maximise the number of MCs implementing the given task. In contrast, if we want to minimise bandwidth consumption this solution might not be acceptable, especially if Op is relatively small.

Figure 5. Process leading to MC generation and configuration.
SPECIFICATION: 1. Task Graph; 2. Optimisation Strategies; 3. Constraints
DECOMPOSITION: 1. Aspects of Mobility; 2. MC Paradigms; 3. Patterns of Mobility; 4. Cost of Mobility; 5. MC Infrastructure
DEPLOYMENT: 1. MC Configuration; 2. Operational Pattern; 3. Re-configuration Period
In the following sub-sections we discuss the other decomposition ingredients listed in Figure 5.

3.1 Aspects of Mobility
Task decomposition is aimed at an MC-based (or MA-based) implementation. The main aspects related to code mobility should then be taken into account for the task decomposition. Some of these are discussed below.

Heterogeneity. A decomposition of a task leads to a number of corresponding sub-tasks. In some cases a task can be decomposed into identical sub-tasks. In other cases sub-tasks are heterogeneous.

Migration Autonomy. Sub-tasks are meant to be implemented with separate pieces of code - i.e., MAs. Depending on the implementation trade-off choices - i.e., either sophisticated but large MAs or simple and lightweight MAs - we may have the following four solutions:
• internally-triggered migration;
• externally-triggered migration;
• dynamically defined itinerary;
• pre-defined itinerary.

Inter-Agent Autonomy. Again, depending on the implementation trade-off choices we may have the following solutions:
• synchronously co-operating MAs;
• asynchronously co-operating MAs;
• non-co-operating (independent) MAs.

Inter-Agent Communication Model. MAs can communicate by polling, synchronous messaging and/or asynchronous messaging. An MA may make its results available for polling by other MAs. Messaging can be used instead for the propagation of notifications or alarms.

Agent life-time. Depending on the task specification, the corresponding MAs can be once-off, triggered or continuous. In the first case a set of MAs is executed only once after being deployed. In the second, they are deployed and executed on demand. Finally, continuous MAs have an indefinite life-time. MAs can also be executed a predefined number of times.

Operations. We can have two different kinds of MAs: Monitoring MAs and Controlling MAs. The first type can only gather information from the hosting system, whereas the second can also change the state of the system. In this paper we only consider Monitoring MAs.

3.2 MC Paradigms
The main MC paradigms, as described in [6], are summarised below.

Remote Evaluation (REV). The client sends a request to the server including the code component needed to perform the service. This code is then executed by the server, which has local access to the required resources and ships back the results.

Code on Demand (COD). The client already has access to the resources needed by the service execution. However, it lacks the corresponding code component, which is available on the server. Hence, it requests the code from the server and, after the code has been received, executes the service locally.

Mobile Agents (MAs). The client knows how to perform the service but lacks the resources. Unlike REV, the whole computational component is migrated to the site where resources reside. The component has local access to resources and can therefore execute the service.

3.3 Patterns of Mobility
Travelling is the essence of MAs. We can specify some typical patterns of mobility for specialised MAs - e.g., MAs aimed at monitoring operations. Different patterns are characterised by different costs of mobility (see Section 3.4, Cost of Mobility) and therefore fit different contexts. In general, after a task has been decomposed into sub-tasks, a set of MAs implementing these sub-tasks is generated. After that, MAs have to be configured - i.e., deployed to their initial locations. Then, they start their execution. Finally, MAs may need to migrate for several different reasons - e.g., to access different resources or to comply with optimisation strategies. Both MA configuration and migration may apply a pattern of mobility. Some of them are described below as examples.
3.3.1 Flat Broadcast
If a given task can be implemented by deploying one or more sets of identical MAs, we can adopt a broadcasting mechanism. One or more MA templates are generated at a central location and corresponding clones are sent to a set of target locations. These clones are then executed remotely. Finally, either they return to the original starting location or they send messages with the results (Figure 6). We name this pattern "flat broadcast" because no hierarchical cloning is allowed - i.e., a clone cannot clone other MAs.

Figure 6. MAs flat broadcast pattern. (MA templates A and B, each with its own clones.)
This pattern can be particularly useful to deploy a set of MAs when the links involved in the deployment are relatively uniform in bandwidth and latency. After broadcasting MAs, further local migrations can be enforced in order to implement optimisation strategies.
Figure 7. MAs hierarchical broadcast pattern. (An MA template and its clones.)

Figure 8. The itinerary pattern. (An MA template and its clones.)
3.3.2 Hierarchical Broadcast
The hierarchical broadcast pattern differs from the flat one in the deployment mechanism. In hierarchical broadcast an MA clone can in turn generate and distribute other clones in a hierarchical fashion (Figure 7). This pattern can be useful when the links involved in the deployment are not uniform in bandwidth and latency. In this case, and in order to reduce the bandwidth consumption during MA deployment, the initial distribution of MAs can be organised in a hierarchical fashion. An example is a scenario where the MA template location is connected to the managed system through a highly congested link. In this case flat broadcast is not advisable. Instead, a first set of MA clones can be sent over the congested link. Other clones are then generated remotely and do not involve the congested link any more.

3.3.3 Pre-defined Itinerary
The itinerary pattern is described in [Yariv Aridor and Danny B. Lange, Agent Design Patterns: Elements of Agent Applications Design. Second International Conference on Autonomous Agents (Agents '98), May 1998]. We recall here some concepts. This travelling pattern is concerned with routing among multiple destinations. An itinerary maintains a list of destinations, defines a routing scheme, handles special cases (such as what to do if a destination does not exist), and always knows where to go next (Figure 8). In a pre-defined itinerary pattern the locations to be visited are known from the start - i.e., in the starting location.

This migration pattern may be useful when the task is sequential and can be implemented with one or more highly selective MAs. Alternatively, a sequential task could be implemented with multiple MAs. However, if this approach is adopted, MAs would have to be synchronised and this, in turn, might result in additional complexity and inter-MA communication overhead. As far as selectivity is concerned, we have discussed this feature in [7]. We may define it as the ratio between the data bytes processed and the data bytes produced by an MA. If an MA can filter out much information it is highly selective.
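A pre-defined itinerary of the kind recalled above can be sketched as a small Python class. This is a simplified illustration, not the design given by Aridor and Lange: the class name `Itinerary`, the `is_reachable` callback and the policy of skipping unreachable destinations are assumptions made here, and the host names are invented.

```python
class Itinerary:
    """Hypothetical pre-defined itinerary: the destination list is fixed at the start."""

    def __init__(self, destinations):
        self._pending = list(destinations)  # routing scheme: visit in list order
        self.skipped = []                   # destinations found to be unreachable

    def next_destination(self, is_reachable):
        """Return the next reachable destination, or None when the trip is over."""
        while self._pending:
            candidate = self._pending.pop(0)
            if is_reachable(candidate):
                return candidate
            # Special case (assumed policy): skip a missing destination.
            self.skipped.append(candidate)
        return None


itinerary = Itinerary(["host-a", "host-b", "host-c"])
reachable = {"host-a", "host-c"}

visited = []
destination = itinerary.next_destination(lambda host: host in reachable)
while destination is not None:
    visited.append(destination)  # the MA would migrate and execute here
    destination = itinerary.next_destination(lambda host: host in reachable)
```

In this run the MA visits host-a and host-c and records host-b as skipped. An autonomous itinerary would differ in that the list of destinations is not supplied up front but computed by the MA as it travels.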
3.3.4 Autonomous Itinerary
Unlike the pre-defined itinerary, in the autonomous itinerary pattern the locations to be visited are not determined in the starting location. Instead, the capability to determine the MA's next location is coded into the MA itself. Therefore, this migration pattern is adopted when the itinerary cannot be determined a priori and is, instead, determined during the task execution. In this case MAs are highly autonomous.

3.3.5 Composite
A composite migration pattern can be specified as a combination of the patterns defined above. An example is an itinerary with flat broadcast cloning pattern (Figure 9).

Figure 9. Itinerary with flat broadcast pattern. (An MA template and its clones.)

3.4 Cost of Mobility
Mobile code may introduce a significant overhead in terms of traffic and delay, in particular during the MAs
deployment process. MA re-configuration through migration also adds overheads. Simple models accounting for the costs of code mobility are introduced in [M. Straßer, M. Schwehm, A Performance Model for Mobile Agent Systems. Proc. of the International Conference on Parallel and Distributed Processing Techniques and Applications (PDPTA'97), Volume II, Editor H. R. Arabnia, Las Vegas 1997, Pages 1132-1140] and [M. Baldi, G.P. Picco, Evaluating the Tradeoffs of Mobile Code Design Paradigms in Network Management Applications. Proc. of the 20th International Conference on Software Engineering (ICSE'98), R. Kemmerer and K. Futatsugi, eds., Kyoto, Japan, April 19-25, 1998]. In [A. Liotta, G. Knight, G. Pavlou, Modelling Network and System Monitoring over the Internet with Mobile Agents. Proceedings of IEEE/IFIP NOMS'98, Network Operations and Management Symposium. New Orleans, LA, 15-20 February 1998] we introduce the concept of Cost Functions that can help in configuring MAs by taking into account three main parameters: preciousness of links, life-time of tasks, and selectivity of the MAs implementing the task.

The preciousness of a link may be estimated in terms of its network characteristics - e.g., latency and bandwidth. For example, if at a certain time a link has limited resources available - i.e., low bandwidth and high latency - we can say that the link is very precious at that time. Therefore, transporting data over that link at that precise time would be very expensive. Thus, we weight data transfer rate with preciousness in order to get the actual cost that has to be paid to transfer data over a link.

The task life-time is also important in determining the cost of mobility. It is usually worth decomposing a long-living task at the smallest possible level of granularity in order to implement it with the largest possible number of MAs. On the other hand, very short-living tasks - e.g., once-off tasks - can perform better with little or no decomposition.

MA selectivity is a crucial factor as well. If we want to compare the cost of mobility for two MAs of identical size but different selectivity, we can say that the cheapest solution would be to move the more selective MA.

Cost function expressions also depend on the adopted pattern of mobility. As an example we now introduce a cost function accounting for the cost of mobility in terms of bandwidth consumption. We consider the "flat broadcast pattern of mobility" case. Similar functions can be defined for the remaining patterns of mobility. In general a cost function can be expressed as

$$C_{conf} = C_{depl} + C_{coll} + C_{deliv} \qquad \text{(Equation 1)}$$

where the three terms represent the cost of deploying MAs, collecting data from the monitored locations to the MAs, and delivering data (pre-filtered by the MAs) to the management locations respectively. In the following sub-sections we describe a mathematical model of these three terms for the flat broadcast pattern of mobility. We use the monitoring parameters defined in Section 2 (Specifying Monitoring Tasks). We also introduce cost
coefficients, ki,j, to weight the network load injected between locations Li and Lj by a given monitoring task [A. Liotta, G. Knight, G. Pavlou, Modelling Network and System Monitoring over the Internet with Mobile Agents. Proceedings of IEEE/IFIP NOMS'98, Network Operations and Management Symposium. New Orleans, LA, 15-20 February 1998]. These coefficients are, thus, indicators of the preciousness of links. Finally, we assume that a given task is decomposed into m sub-tasks.

3.4.1 Deployment Cost
If we assume that xj is the target location for a generic MA, MAj, and BMAj is its size in bytes, we can express the cost to deploy m MAs implementing the given task as:

$$C_{depl} = \sum_{j=1}^{m} k_{0,x_j} \, B_{MA_j} \qquad \text{(Equation 2)}$$

Notice that m is evaluated at configuration time, that is when sub-tasks are allocated to MAs.

3.4.2 Collection Cost
This cost depends on the kind of monitoring operation. For polling-based operations it can be expressed as follows:

$$C_{coll,poll} = \sum_{j=1}^{m} \sum_{i \in I(j)} k_{x_j,x_i} \, \frac{b_{poll,i}}{S_{p,i}} \, O_{p,i} \qquad \text{(Equation 3a)}$$

where I(j) is the set of objects monitored by MAj, xi is the location of the generic object i, and bpoll,i is the number of bits used to code each sample. For event-based operations the collection cost can be expressed as follows:

$$C_{coll,event} = \sum_{j=1}^{m} \sum_{i \in I(j)} k_{x_j,x_i} \, P_{event,i} \, b_{event,i} \, O_{p,i} \qquad \text{(Equation 3b)}$$

where Pevent,i represents the probability that the object i generates an event and bevent,i is the size in bytes of that event.

3.4.3 Delivery Cost
The delivery cost also depends on the kind of monitoring operation carried out by the MAs. In general, it can be expressed as

$$C_{deliv} = \sum_{j=1}^{m} k_{x_j,0} \, \sigma_j \, B_{coll,x_j} \qquad \text{(Equation 4)}$$
where Bcoll,xj represents the total amount of monitoring data collected by MAj. σj is its "selectivity", that is the ratio between Bcoll,xj and the amount of data MAj delivers to the manager, Bdeliv,xj - i.e., the traffic between Lxj and L0. The terms of σj may vary considerably, according to the kind of monitoring operation. The following sub-sections consider how σj can be evaluated for three different kinds of operations: logging, polling-based alarm (or event) detection, and event-based alarm (or event) detection.

3.4.3.1 Logging operation
For a "logging" activity a manager may require a report of Bsumm,xj bytes, summarising the data monitored by MAj, every Ap,j seconds, so
$$B_{coll,x_j} = \sum_{i \in I(j)} \frac{b_{poll,i}}{S_{p,i}} \, O_{p,i} \qquad \text{(Equation 5a)}$$

$$B_{deliv,x_j} = \frac{B_{summ,x_j}}{A_{p,j}} \, O_{p,j} \qquad \text{(Equation 5b)}$$

Notice that Op,i and Op,j refer to the observation period of objects and MAs respectively, though MAs can also be interpreted as a special case of monitored objects in this context.

3.4.3.2 Polling-based alarm or event detection
For a "polling-based alarm or event detection" activity the MA polls the relevant entities and filters the collected data in order to generate events. Events of size bevent,j are generated with probability Pevent,j, then

$$B_{coll,x_j} = \sum_{i \in I(j)} \frac{b_{poll,i}}{S_{p,i}} \, O_{p,i} \qquad \text{(Equation 6a)}$$

$$C_{deliv} = \sum_{j=1}^{m} k_{x_j,0} \, \sigma_j \, B_{coll,x_j} \qquad \text{(Equation 6b)}$$

3.4.3.3 Event-based alarm or event detection
For an "event-based alarm or event detection" activity the MA collects events of size bevent,i and probability Pevent,i and generates alarms as in the previous case, then

$$B_{coll,x_j} = \sum_{i \in I(j)} P_{event,i} \, b_{event,i} \qquad \text{(Equation 7a)}$$

$$C_{deliv} = \sum_{j=1}^{m} k_{x_j,0} \, \sigma_j \, B_{coll,x_j} \qquad \text{(Equation 7b)}$$

3.5 MC Infrastructure
A manager relies on information provided by the mobile code infrastructure in order to decompose, deploy and maintain a given task. Managers should be continuously notified about the number, the location and the state of MA-servers - that is, the execution environments for MAs [A. Liotta, G. Knight, G. Pavlou, Modelling Network and System Monitoring over the Internet with Mobile Agents. Proceedings of IEEE/IFIP NOMS'98, Network Operations and Management Symposium. New Orleans, LA, 15-20 February 1998]. Besides, MA-servers should provide managers with information about the preciousness of the relevant inter-MA links (weighting coefficients). Finally, the infrastructure should provide the support for MA migration. A further discussion of the functionality of an MA platform for management is reported in [A. Liotta, G. Knight, G. Pavlou, Modelling Network and System Monitoring over the Internet with Mobile Agents. Proceedings of IEEE/IFIP NOMS'98, Network Operations and Management Symposium. New Orleans, LA, 15-20 February 1998].

4 Decomposing Monitoring Tasks
In the previous section we have discussed some of the basic ingredients for a task decomposition process. These may become numerous and even complicated if we try to include a wide range of possible monitoring tasks. Nevertheless, the aim of this paper is not to provide an exhaustive description of patterns. Instead we would like to discuss the decomposition problem from a perspective which is more related to the software engineering field. In this view, task decomposition is strongly related to task design - e.g., its specification - as well as to the information that can be obtained from the MA infrastructure.

Figure 10. Sample task. (Recoverable labels: blocks uptime and max, with alarm conditions >T1 and >T3; period P and lifetime Op.)
Thus, the first issue is whether it is possible to recognise patterns in the structure of a monitoring task. It is then interesting to see whether, for each pattern, it is possible to identify a suitable decomposition process. These observations suggest that what is really needed is a methodology for task design as well as a methodological approach to task decomposition. Ideally, we would like to be able to identify decomposition patterns that can be expressed as functions of the above decomposition ingredients. In the following sub-sections we discuss an approach to task decomposition by using the example of Figure 10, a simplified version of the example defined in section "A monitoring task".

4.1 Task features of the sample task
A characterisation of the sample task, expressed in terms of the decomposition ingredients described in section "Decomposition Ingredients", is sketched below. The example task is homogeneous, as it can be decomposed into a number of identical sub-tasks. Thus, starting from a template MA, we can then clone and distribute other identical MAs with the aim of increasing the degree of distribution of the task. The blocks in Figure 4 can be seen as the basic task pattern (i.e., the template MA).

Migration autonomy depends on other decomposition and implementation choices. We may want to implement the task using an itinerary pattern of mobility. In this case we would inject a limited number of MAs, each of them having a predefined itinerary. Of course, the overall effect must be that all target objects are monitored. Alternative choices could be Flat or Hierarchical Broadcast. If we assume that all monitored objects reside in the same LAN, the flat broadcast pattern of mobility can be suitable. As far as inter-agent autonomy is concerned, in this task we have both synchronous and asynchronous co-operation. MAs co-operate synchronously with each other and asynchronously with the manager - i.e., through events. Similarly, both the notification and the asynchronous communication models are adopted. Finally, MAs are executed in a continuous mode, though they have a limited lifetime, Op. Migration after deployment is advisable if Op/P is relatively large.

4.2 A sample decomposition procedure
We describe here a four-step approach to the decomposition of the sample task defined above. First, the possible decomposition patterns are identified. Then cost functions are used to estimate which pattern is best. A set of MAs matching this pattern is generated and deployed. Finally, MAs can be periodically reconfigured through migration if necessary.
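The first three steps of the four-step procedure can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the MA naming scheme and the toy cost functions are hypothetical and are not taken from the paper; the real cost functions are the ones discussed in section 4.4. The fourth step (periodic reconfiguration) is not shown here.

```python
from typing import Callable, Dict, List, Tuple

# A cost function maps m (the number of MAs) to an estimated cost.
CostFn = Callable[[int], float]


def choose_configuration(cost_fns: Dict[str, CostFn], max_mas: int) -> Tuple[str, int, float]:
    """Steps 1-2: enumerate every (pattern, m) candidate and keep the cheapest."""
    if not cost_fns or max_mas < 1:
        raise ValueError("need at least one pattern and max_mas >= 1")
    best = None
    for pattern, cost_fn in cost_fns.items():
        for m in range(1, max_mas + 1):
            candidate = (cost_fn(m), pattern, m)
            if best is None or candidate < best:
                best = candidate
    cost, pattern, m = best
    return pattern, m, cost


def generate_mas(pattern: str, m: int) -> List[str]:
    """Step 3 (stand-in): produce m identical MAs cloned from one template."""
    return [f"{pattern}-MA-{i}" for i in range(1, m + 1)]


# Toy cost functions with made-up numbers, for illustration only.
TOY_COSTS: Dict[str, CostFn] = {
    "vertical": lambda m: 100 + 10 * m,
    "horizontal": lambda m: 80 + 15 * m,
}

if __name__ == "__main__":
    pattern, m, cost = choose_configuration(TOY_COSTS, max_mas=4)
    print(pattern, m, cost, generate_mas(pattern, m))
```

An exhaustive search over m is affordable here because m is bounded by the number of available MA-servers.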
Figure 12. Horizontal Decomposition.
4.4 Cost of Mobility in terms of bandwidth
The cost of mobility can be estimated using the equations of section "Cost of Mobility". If we assume that all target objects are within the same LAN, we can also assume that all links have a comparable and stable preciousness. This means that $k_{i,j} = K$ for each link $(i,j)$ in the LAN, and that $K$ does not change significantly over time. Thus, in the following we can neglect these weighting coefficients.

4.4.1 Cost for vertical decomposition
The three terms of the cost function can be expressed as follows:
$$C_{depl} = m \cdot (B_{MA} + B_{MA,1} + B_{MA,2})$$ (Equation 8a)

where $B_{MA}$ is the size of an MA regardless of the task it implements; $B_{MA,1}$ represents the additional bytes needed to code the "uptime" sub-task; $B_{MA,2}$ represents the additional bytes needed to code the "max" sub-task; and $m$ is the number of MAs.

$$C_{coll} = (N_{obj} - m) \cdot b_c \cdot \frac{O_p}{S_p}$$ (Equation 8b)

where $N_{obj}$ is the number of monitored objects and $b_c$ is the size of a single poll.
Figure 11. Vertical Decomposition.
4.3 Sample Decomposition patterns
We can recognise two decomposition patterns for the above sample task: a vertical pattern and a horizontal one. The vertical pattern regards the two blocks (uptime and max) as a whole (Figure 11). Essentially, the number of processes collecting the raw data has to be determined. There are three main objectives at this level: first, to determine how many sub-domains - portions of the managed domain - we want to generate; second, to decide how to assign the target objects to the different sub-domains; finally, to decide how many MAs will monitor these objects and from which locations. The horizontal decomposition pattern (Figure 12) is concerned with the decomposition of the two blocks. Here we have to decide which functions will be implemented within each of the MAs. The cost functions for the above patterns are discussed in the next sub-sections.
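The three objectives of vertical decomposition (number of sub-domains, assignment of target objects, placement of MAs) can be illustrated with a small Python sketch. The even, contiguous split and the rule of hosting each MA on the first object of its sub-domain are assumptions made purely for illustration; the paper does not prescribe a particular assignment or placement policy, and the function names are hypothetical.

```python
from typing import Dict, List


def assign_sub_domains(targets: List[str], n_sub_domains: int) -> Dict[int, List[str]]:
    """Split the target objects into n_sub_domains contiguous, near-equal groups."""
    if not 1 <= n_sub_domains <= len(targets):
        raise ValueError("n_sub_domains must be between 1 and len(targets)")
    size, extra = divmod(len(targets), n_sub_domains)
    sub_domains: Dict[int, List[str]] = {}
    start = 0
    for i in range(n_sub_domains):
        # The first `extra` sub-domains take one additional object each.
        end = start + size + (1 if i < extra else 0)
        sub_domains[i] = targets[start:end]
        start = end
    return sub_domains


def place_mas(sub_domains: Dict[int, List[str]]) -> Dict[int, str]:
    """Hypothetical placement: one MA per sub-domain, hosted on its first object."""
    return {i: members[0] for i, members in sub_domains.items()}


if __name__ == "__main__":
    domains = assign_sub_domains([f"n{i}" for i in range(1, 8)], 3)
    print(domains)
    print(place_mas(domains))
```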
$$C_{deliv} = m \cdot b_d \cdot \frac{O_p}{N_p}$$ (Equation 8c)
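The three terms of the vertical-decomposition cost (Equations 8a-8c) can be evaluated directly. The Python sketch below is a minimal illustration: the field names mirror the symbols of the equations, the symbols Sp, Np and bd are used exactly as they appear there (their definitions are given elsewhere in the paper), and the numeric values in `EXAMPLE` are made up. Treating the overall cost as the plain sum of the three terms is an assumption of this sketch.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class TaskParams:
    n_obj: int    # Nobj: number of monitored objects
    b_ma: float   # BMA: size of an MA regardless of its task
    b_ma1: float  # BMA,1: extra bytes for the "uptime" sub-task
    b_ma2: float  # BMA,2: extra bytes for the "max" sub-task
    b_c: float    # bc: size of a single poll
    b_d: float    # bd: as used in Equation 8c
    o_p: float    # Op
    s_p: float    # Sp
    n_p: float    # Np


def vertical_cost_terms(m: int, p: TaskParams) -> Tuple[float, float, float]:
    """Return (C_depl, C_coll, C_deliv) for m MAs, following Equations 8a-8c."""
    c_depl = m * (p.b_ma + p.b_ma1 + p.b_ma2)
    c_coll = (p.n_obj - m) * p.b_c * p.o_p / p.s_p
    c_deliv = m * p.b_d * p.o_p / p.n_p
    return c_depl, c_coll, c_deliv


# Made-up parameter values, for illustration only.
EXAMPLE = TaskParams(n_obj=10, b_ma=1000, b_ma1=200, b_ma2=100,
                     b_c=50, b_d=20, o_p=100, s_p=10, n_p=20)

if __name__ == "__main__":
    terms = vertical_cost_terms(2, EXAMPLE)
    # Assumption: the overall cost is the sum of the three terms.
    print(terms, sum(terms))
```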
4.4.2 Cost for horizontal decomposition
The three terms of the cost function can be expressed as follows:

$$C_{depl} = m \cdot (B_{MA} + B_{MA,1})$$ (Equation 9a)

$$C_{coll} = (N_{obj} - m) \cdot b_c \cdot \frac{O_p}{S_p}$$ (Equation 9b)

$$C_{deliv} = N_{obj} \cdot b_d \cdot \frac{O_p}{N_p}$$ (Equation 9c)
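As a worked step, and assuming that the overall cost is simply the sum of the three terms of Equations 9a-9c (an assumption of this sketch), the horizontal-decomposition cost can be rearranged to make its dependence on m explicit:

```latex
\begin{align*}
C_{horiz}(m) &= m\,(B_{MA} + B_{MA,1})
              + (N_{obj} - m)\, b_c\, \frac{O_p}{S_p}
              + N_{obj}\, b_d\, \frac{O_p}{N_p} \\
             &= m \left[ (B_{MA} + B_{MA,1}) - b_c\, \frac{O_p}{S_p} \right]
              + N_{obj} \left[ b_c\, \frac{O_p}{S_p} + b_d\, \frac{O_p}{N_p} \right]
\end{align*}
```

Under this assumption the cost is linear in m, and the sign of its slope is decided by comparing the deployed size $B_{MA} + B_{MA,1}$ with the per-MA polling saving $b_c \, O_p / S_p$: a relatively large MA and a relatively small $O_p/S_p$ give a positive slope.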
4.5 MAs Initial Configuration
By analysing equations 8 and 9 it is possible to choose which decomposition pattern is more suitable and to decide the number of MAs to be deployed. If we plot both cost functions against the variable m, we may distinguish among three different cases (see the three charts below):
1. Both slopes are positive. The horizontal decomposition with one MA implementing the whole task is the cheapest solution. This case usually occurs when Op/Sp is relatively small and BMA is relatively large.
2. The slope of the horizontal decomposition function is positive and the other one is negative. In this case there might be more than one solution: either horizontal decomposition with one MA, or vertical decomposition with a number of MAs equal to the maximum number of MA-servers.
3. Both slopes are negative. In this case the type of decomposition depends on the maximum number of possible MAs. In the example shown below, horizontal decomposition is cheaper if m<28 and vertical decomposition is cheaper if m>28. In both situations the maximum allowed number of MAs should be injected. This case usually occurs when Op/Sp is relatively large and BMA is relatively small.

[Charts: costs of horizontal and vertical decomposition against m (number of MAs), one chart per case: "Both costs have positive slopes", "One pos. and one neg. slope", "Both cost lines with negative slopes".]

In Figure 13 the difference between the costs of the two decomposition patterns is plotted against the number of MAs. Different lines, corresponding to different values of Op/Sp, are plotted. Notice that vertical decomposition tends to be more convenient if we can afford to deploy a relatively large number of MAs. As the task lifetime increases, the threshold value for m increases as well. For example, if Op/Sp=1 then vertical decomposition is more convenient only if we can deploy more than 3 MAs, whereas if Op/Sp=5 the threshold is m=13.

[Plot: Cvert - Choriz against m (number of MAs), with one line each for Op/Sp = 1, 3 and 5; the plot marks the area where horizontal decomposition is cheaper and the area where vertical decomposition is cheaper.]
Figure 13. Difference between costs.

4.6 MAs Periodic Re-configuration
In the above example we have neglected the weighting coefficients, $k_{i,j}$, and as a result we have obtained a fairly stable system. In this case there is no need for a periodic reconfiguration of the MAs. In contrast, real systems can be characterised by fast dynamics. Weighting coefficients may vary - e.g., because of network congestion - and so costs will vary accordingly. Consequently, the choice of decomposition pattern and the number of MAs may change over time as well.

We can follow two different directions in order to adapt to network changes:
1. Periodic re-configuration of the deployed MAs through migration;
2. Periodic re-execution of the whole decomposition procedure.
The second approach may be required in cases where, due to dramatic changes of the network state, a different decomposition pattern and a different number of MAs are required. In other cases, when the number of MAs has to be changed (but not their decomposition pattern), we can use mechanisms to control the MA population - e.g., through MA cloning.
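The choice between the adaptation options of section 4.6 can be sketched as a simple decision rule: compare the configuration currently deployed with the one that the cost functions would select under the present network state. The Python fragment below is an illustrative sketch only; the `Configuration` type, the function name and the returned labels are hypothetical, and mapping any change of pattern to a full re-execution is a simplifying assumption.

```python
from typing import NamedTuple


class Configuration(NamedTuple):
    pattern: str   # e.g. "vertical" or "horizontal"
    num_mas: int   # m, the number of deployed MAs


def adaptation_action(current: Configuration, best: Configuration) -> str:
    """Pick an adaptation step given the deployed and the currently cheapest setup."""
    if best.pattern != current.pattern:
        # A different decomposition pattern is needed: redo the whole procedure.
        return "re-execute decomposition"
    if best.num_mas != current.num_mas:
        # Same pattern, different population: e.g. clone MAs.
        return "adjust MA population"
    # Pattern and population still fit: migrate deployed MAs if necessary.
    return "reconfigure through migration"


if __name__ == "__main__":
    deployed = Configuration("horizontal", 1)
    print(adaptation_action(deployed, Configuration("vertical", 8)))
```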
5
Conclusions
Our investigation starts from the following basic question: assuming that an MA infrastructure is available, how can we use it for management purposes? We have discussed the feasibility of mobile-code-based management and considered some of the issues involved in a task decomposition process. We think that the selection of a decomposition pattern can be made on the basis of two sets of data: data extracted from the specified task, and data extracted from the underlying network - e.g., latency and congestion among candidate target locations for the mobile code. We have described some basic decomposition patterns, illustrating them with examples.
6
Future Work
Our current and future work includes a further study of the decomposition process in the context of network and system monitoring. From the preliminary study presented here we believe that a further study of design techniques for delegatable tasks is essential. We think that properly designed tasks can be implemented with MAs and can, in several cases, lead to a significant reduction in network
bandwidth waste and response time. Therefore our investigation will be focused on the study of MA configuration algorithms and MA re-configuration algorithms through migration. Migration can be exploited in order to adapt to network dynamics.

Acknowledgements
We are grateful to Hewlett-Packard for their encouragement and sponsorship in relation to this work. We would like to thank the researchers at HP Laboratories Bristol (HPLB) for providing interesting and stimulating ideas. In particular we acknowledge Keith Harrison for his constructive criticism and constant encouragement. Helpful suggestions also came from members of the Department of Computer Science at UCL.

References
[1] M. Loukides, System Performance Tuning. O’Reilly & Associates, Inc, 1990. ISBN 0-937175-60-9.
[2] Y. Aridor, D. B. Lange, Agent Design Patterns: Elements of Agent Applications Design. Second International Conference on Autonomous Agents (Agents ’98), May 1998.
[3] C. Weir, Architectural Styles for Distribution, Using macro-patterns for system design. Second European Conference on Pattern Languages of Programming, EuroPLoP’97, June 1997.
[4] M. Straßer, M. Schwehm, A Performance Model for Mobile Agent Systems. Proc. of the International Conference on Parallel and Distributed Processing Techniques and Applications (PDPTA'97), Volume II, Editor H. R. Arabnia, Las Vegas 1997, Pages 1132-1140.
[5] G. Pavlou, G. Mykoniatis, J. Sanchez-P, Distributed Intelligent Monitoring and Reporting Facilities. The British Computer Society, The Institution of Electrical Engineers and IOP Publishing. Distrib. Syst. Eng. 3, pp. 124-135, 1996.
[6] M. Baldi, G.P. Picco, Evaluating the Tradeoffs of Mobile Code Design Paradigms in Network Management Applications. Proc. of the 20th International Conference on Software Engineering (ICSE'98), R. Kemmerer and K. Futatsugi, eds., Kyoto, Japan, April 19-25, 1998.
[7] A. Liotta, G. Knight, G. Pavlou, Modelling Network and System Monitoring over the Internet with Mobile Agents. Proceedings of IEEE/IFIP NOMS'98, Network Operations and Management Symposium. New Orleans, LA, 15-20 February 1998.
[8] Y. Yemini, G. G. Goldszmidt, S. Yemini, Network Management by Delegation. Integrated Network Management II, Amsterdam 1991.
[9] M-A. Mountzia, A Distributed Management Approach Based on Flexible Agents. Interoperable Communication Networks, Baltzer Science Publishers, Volume I/I, January 1998. ISSN 13859501.
[10] J-CH Gregoire, Models and Support Mechanisms for Distributed Management. Integrated Network Management IV. New York: Chapman & Hall, 1995.
[11] T. Magedanz, On the impacts of Intelligent Agent Concepts on Future Telecommunication Environments. In Proc. of the 3rd International Conference on Intelligence in Broadband Services and Networks IS&N 1995, Heraklion, Crete, Greece, October 16-20 1995.