
Autonomic Adaptation in Distributed Systems and Networks∗
Preliminary Version

Erol Gelenbe
School of Electrical Engineering & Computer Science
University of Central Florida
Orlando, FL 32816
[email protected]

∗ This research was supported by U.S. Army PEOSTRI via NAWC under Contract No. N61339-02-C0117, and by NSF under Grants No. EIA0086251 and EIA0203446.

Abstract

There is now considerable interest in industry and academia in "autonomic" or self-adaptive networks and distributed systems. Large-scale applications, such as simulation, typically run on large distributed systems in which it is impossible to guarantee at each instant the full reliability and availability of all processing nodes and of all communication links. Nevertheless such systems have to accomplish their mission and provide a high level of dependability and the best possible performance to critical applications. Self-adaptation will therefore be an intrinsic feature of large distributed systems and networks, allowing them to move tasks and files in response to changes in subsystem dependability and system load, and to provide Quality-of-Service (QoS) and dependable connections in the presence of fluctuating workloads and unknown system behavior. In this paper we review our work on the design of adaptive on-line task management algorithms for distributed systems, and on the control of Cognitive Packet Networks (CPN) to offer user-specified QoS.

1 Adaptive control

Recalling L.A. Zadeh's (1963) definition [81], consider a system subjected to a family {S_γ} of time functions, and let P_γ be the performance of the system under S_γ. Let W be a family of acceptable performances for the system. We say that the system is adaptive with respect to {S_γ} if P_γ ∈ W for each γ. This definition says what the system must do, but not how it must do it. In more pragmatic terms, an adaptive system is one which adjusts its parameters, as its environment and its own characteristics change, so as to maintain "good" (if not optimal) operation according to some criteria which have been specified or previously agreed upon. Neural networks have an intrinsic capability to adapt through learning; therefore they are strong candidates for being components of adaptive controllers. Adaptive control originated with the ideas of Drenick and Shahbender [14], Bellman and Kalaba [7] and others. In its simplest and theoretically most developed form, it is an extension of the classical theory of linear system control to mildly non-linear and slowly time-varying systems: Åström and Wittenmark (1989), Gawthrop (1987, 1990), Kokotovic (1991), Narendra and Annaswamy (1989), Sastry and Bodson (1989). An adaptive system is often composed of three elements: an identifier carrying out on-line estimation of parameters of interest, a controller providing decisions according to the estimated parameters, and a plant, which is the system being controlled. There is an abundant literature on real-time parameter identification or estimation: Ljung (1987), Söderström and Stoica (1989), Goodwin and Sin (1984). The algorithms used assume a model that is linear in the parameters, which allows the application of recursive least squares and its refinements. Under the assumption of linearity and Gaussian noise it can be shown (e.g. Goodwin and Sin (1984)) that the estimates converge to a unique value, provided the essential dynamics of the system are sufficiently excited. This important issue, called persistence of excitation (see e.g. Sastry and Bodson (1989)), is connected with an inherent trade-off of adaptive control: on the one hand the estimation error should be kept small (control objective), while on the other hand the control input must sufficiently excite the


signal so that it can be effective (identification objective). Although estimation is an essential part of an adaptive system, the system's purpose is control. Approaches vary widely: PID autotuning (Åström and Hägglund (1988)), pole placement (Åström and Wittenmark (1990)), and stochastic schemes (Åström (1970)). Stochastic schemes involve minimum variance (Clarke and Gawthrop (1975), Gawthrop (1987)), LQG (Hunt (1989), Kárný et al. (1985)) and dual control (Åström and Wittenmark (1989)). A related but different approach is that of predictive control (Clarke et al. (1987a, 1987b), Demircioglu and Gawthrop (1992), Bitmead et al. (1990)). Robustness of the controls is an important issue: Praly (1986) and Anderson et al. (1986), as well as the method of averaging discussed by Ljung (1977) and Kumar and Varaiya (1986). Stability of the resulting controlled system is also important: see e.g. Narendra and Annaswamy (1989). There are two main approaches to adaptive control: direct adaptive control updates the controller parameters directly from measured or estimated data (these schemes are sometimes called implicit). When the control parameters are obtained indirectly via a model of the plant whose parameters need to be estimated, the control is said to be indirect. Many problems with adaptive control systems remain open, in particular in a stochastic environment. The goals of adaptive control are still far from being realized, and there is widespread agreement that the field is still in its infancy. Yet adaptive control is necessary in many applications, because the parameters of a system undergo significant changes or cannot be measured with sufficient accuracy, rendering the application of classical control methods either unreliable or impractical. This paper specifically addresses system adaptation in the framework of distributed systems and packet networks, in the light of work we have conducted on load balancing in distributed systems and QoS-driven routing in packet networks.

1.1 Contents of this paper

In this paper we first present a framework for the control of a distributed computer system where the purpose is to balance system load to improve performance. We review past work on load balancing and then summarize our own results on a direct adaptive control approach to dynamically balancing the load in a distributed system. The direct adaptive control is implemented and compared experimentally with reasonable heuristic methods. We then discuss a second application we have implemented: a networking system that routes packets dynamically in a network to offer user-defined QoS. This system uses smart packets to search for the best routes in view of the QoS requested by the user, while the actual user traffic is transported by "dumb" packets over the routes which have been found by the smart packets. Direct adaptive control is applied to find routes for the smart packets. The control algorithm in this application is based on reinforcement learning, which is invoked each time a smart packet must search for the next hop towards its destination.

2 Neural networks for adaptive control

The intrinsically self-adaptive nature of neural networks, and the successful theoretical advances recently accomplished in this area, encourage research on, and applications of, neural networks for adaptive control. Descriptions of recent work in this area include [62, 61, 63]. Again one may consider direct and indirect approaches.
Direct adaptive control by neural networks. Here the feedback control law is parametrized directly, without using a parametrization of the plant. The feedback law is usually a neural network propagation function; it is therefore a parametric model of the control, which has to be determined through a penalization procedure. The feedback control learning rules usually use gradient methods to penalize the control error (observed in terms of tracking error) through modification of the network parameters.
Indirect adaptive control by neural networks. Most neural network techniques for control address indirect control problems, in which the parameters of the plant are estimated at each instant, and the parameters of the controller (in this case, the weights of the neural network controller) are adjusted assuming that the estimated parameters of the plant represent the true values. This requires either a learning stage (identification of the plant), or a neurally expressed model of the plant, or a combination of the two. The most popular neural network method for indirect control is Back-propagation Through Time [77], which does not require any knowledge about the plant, but identifies the plant with a neural network and evaluates the Jacobian by back-propagation. An example can be found in [80], where a neural network controller is trained by propagating its output through two networks: the forward model and the feedforward controller. Back-propagation Through Time has been applied to various problems, such as the cart-pole system [79] and the truck backer-upper [80]. In [62], a similar indirect control method is applied to the identification and control of nonlinear problems. This algorithm does not require any prior knowledge of the plant. Training two different networks, the second one depending on the accuracy of the first, is time consuming, and there is no guarantee that the plant dynamics will not evolve during the (very long) learning phase. For identification, the problem is to find a neural network able to approximate any continuous function (i.e. the dynamics of the controlled process) and its derivative, defined over a compact set. A key theoretical result is due to Sontag (1990) [76], who showed that the class of two-hidden-layer networks contains all feedback laws that will stabilize a system over a compact set (i.e. all trajectories of the system starting from an initial condition in a compact set converge asymptotically to an equilibrium point). This is an existence result which does not provide a constructive procedure. Sontag has also shown, by counterexample, that one-hidden-layer nets do not have this property. Adaptive algorithms either follow a steady-state or fixed-point objective, or they follow a trajectory; the latter is impossible for feedforward networks. Since trajectory learning is of particular interest from the control viewpoint, both for identification and control, significant adaptive neural control can only be achieved with recurrent (i.e. feedback) neural networks.

2.1 The Random Neural Network (RNN) Model

In the random neural network model (Gelenbe (1989, 90) [21, 22]) signals in the form of spikes of unit amplitude circulate among a finite set of neurons. Positive signals represent excitation and negative signals represent inhibition. Each neuron's state is a non-negative integer called its potential, which increases when an excitation signal arrives at it, and decreases when an inhibition signal arrives. In this model, the neural potential also decreases when the neuron fires. Thus an excitatory spike is interpreted as a +1 signal at a receiving neuron, while an inhibitory spike is interpreted as a −1 signal. A neuron i emitting a spike, whether it be an excitation or an inhibition, loses one unit of potential, going from some state l to the state l − 1. The state of the n-neuron network at time t is represented by the vector of non-negative integers k(t) = (k_1(t), ..., k_n(t)), where k_i(t) is the potential or integer state of neuron i. We will denote by k and k_i arbitrary values of the state vector and of the i-th neuron's state. Neuron i will "fire" (i.e. become excited and send out spikes) if its potential is positive. The spikes are then sent out at a rate r_i, with independent, identically and exponentially distributed interspike intervals. Spikes go out to some neuron j with probability P_{ij}^+ as excitatory signals, or with probability P_{ij}^- as inhibitory signals. A neuron may also send signals out of the network with probability d(i), and:

d(i) + \sum_{j=1}^{n} [P_{ij}^+ + P_{ij}^-] = 1.

For convenience we write ω_{ij}^+ = r_i P_{ij}^+ and ω_{ij}^- = r_i P_{ij}^-. Here the "omegas" play a role similar to that of the synaptic weights in connectionist models, though here they specifically represent rates of excitatory and inhibitory spike emission. Exogenous (i.e. coming from the "outside world") excitatory and inhibitory signals also arrive at neuron i at rates Λ_i and λ_i, respectively. Note that we are dealing here with a recurrent network of arbitrary topology. The "calculus" associated with the RNN model is based on the probability distribution of the network state p(k, t) = Pr[k(t) = k], or on the marginal probability that neuron i is excited, q_i(t) = Pr[k_i(t) > 0]. As a consequence, the time-dependent behaviour of the model is described by an infinite system of Chapman-Kolmogorov equations for discrete state-space continuous-time Markovian systems. For the RNN they are:

p(k, t) = \sum_{i} \left[ \Lambda_i \, p(k_i^-, t) + (\lambda_i + r_i d(i)) \, p(k_i^+, t) \right] + \sum_{ij} \left[ p(k_{ij}^{+-}, t) \, \omega_{ij}^+ + p(k_{ij}^{++}, t) \, \omega_{ij}^- \right]    (1)

where the vectors used have the following meaning:

k_i^- = (k_1, ..., k_i - 1, ..., k_n),
k_i^+ = (k_1, ..., k_i + 1, ..., k_n),
k_{ij}^{+-} = (k_1, ..., k_i + 1, ..., k_j - 1, ..., k_n),
k_{ij}^{++} = (k_1, ..., k_i + 1, ..., k_j + 1, ..., k_n).

Each neuron i, when it is excited, will send spikes to neuron j at a total frequency ω_{ij} = ω_{ij}^+ + ω_{ij}^-. These spikes will be emitted at exponentially distributed random intervals. In turn, each neuron behaves as a non-linear frequency demodulator, since it transforms the rates of the incoming excitatory and inhibitory spike trains into an "amplitude" q_i(t), the probability that neuron i is excited at time t. Intuitively speaking, each neuron of this model is also a frequency modulator, since neuron i sends out excitatory and inhibitory spikes at rates (or frequencies) q_i(t) r_i P_{ij}^+ and q_i(t) r_i P_{ij}^- to any neuron j. Often we will use the stationary probability distributions associated with the model. These are denoted:

p(k) = \lim_{t \to \infty} p(k, t), \qquad q_i = \lim_{t \to \infty} q_i(t), \quad i = 1, ..., n.    (2)

Theorem (Gelenbe 1989, 90) [21, 22] The following is the basic mathematical property of the RNN model: the stationary probability distributions associated with the RNN – when they exist – are given by the solution of the following system of non-linear fixed-point equations:

q_i = \frac{\Lambda_i + \sum_{j=1}^{N} q_j \omega_{ji}^+}{r_i + \lambda_i + \sum_{j=1}^{N} q_j \omega_{ji}^-}, \quad i = 1, ..., N.    (3)

Furthermore: p(k) = \prod_{i=1}^{N} q_i^{k_i} (1 - q_i). The quantities which appear in the numerator and denominator of (3) are the frequencies at which excitation or inhibition spikes arrive at the i-th neuron, as well as the characteristic firing frequency r_i of the neuron. For a complete theoretical development, the reader can consult [21, 22]. Indeed, (3) is also a "sigmoidal" relation. The computational algorithm for the q_i's is a fixed-point iteration:

q_i^{m+1} = \frac{\Lambda_i + \sum_{j=1}^{N} q_j^m \omega_{ji}^+}{r_i + \lambda_i + \sum_{j=1}^{N} q_j^m \omega_{ji}^-}, \quad i = 1, ..., N.    (4)

This equation is a sigmoidal relation between the states 0 ≤ q_i^m ≤ 1 of the neurons at "step" m and at step m + 1 of the computation. The learning algorithm for the model, as well as stability conditions, are discussed in [25], where the existence, uniqueness and stability of the solutions to the random neural network are also developed.
Theorem (Gelenbe 1993) [25] The random neural network is stable, i.e. there is a well defined stationary probability distribution p(k) given above for the network state, if and only if the solution q_i, i = 1, ..., n to the system of fixed-point equations (3) is such that 0 ≤ q_i < 1 for all i = 1, ..., n. Furthermore, by virtue of the uniqueness of solutions to the underlying Chapman-Kolmogorov equations, the stationary probability distribution is unique.
Thus the RNN has the following nice property for a "recurrent" network: if the non-linear fixed-point equations which describe the state of each neuron have a solution for which 0 ≤ q_i < 1, so that the q_i can be interpreted as probabilities, then the network is stable and the solution obtained is unique. Otherwise the network is not stable. Thus the numerical solution of the fixed-point system (3) can be used to test for stability. More stringent sufficient stability conditions, which can be verified by inspection, have also been proved (in particular for the "hyperstable" networks discussed in Gelenbe (1990) [22], for which the incoming excitatory signal flows to each neuron are always strictly smaller than the neuron's firing rate). The RNN model has been applied to a wide variety of problems including image texture analysis [19, 20], associative memory and pattern recognition [23], and combinatorial optimization [26].
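To make the computation concrete, here is a minimal Python sketch (ours, not code from the original papers) that iterates equation (4) until convergence and applies the stability test of the theorem; all inputs are illustrative assumptions supplied by the caller as NumPy arrays.

```python
import numpy as np

def rnn_fixed_point(W_plus, W_minus, Lambda, lam, r, tol=1e-9, max_iter=10_000):
    """Solve the fixed-point system (3) by iterating equation (4).

    W_plus[j, i] and W_minus[j, i] hold the rates omega_ji^+ and omega_ji^-;
    Lambda and lam are the exogenous excitatory/inhibitory arrival rates;
    r holds the firing rates r_i. Returns the excitation probabilities q
    and the stability flag of the theorem (all q_i < 1)."""
    q = np.zeros(len(r))
    q_new = q
    for _ in range(max_iter):
        q_new = (Lambda + q @ W_plus) / (r + lam + q @ W_minus)  # eq. (4)
        if np.max(np.abs(q_new - q)) < tol:
            break
        q = q_new
    return q_new, bool(np.all(q_new < 1.0))
```

The product-form probability p(k) of the theorem can then be evaluated directly from the returned q.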

2.1.1 Input-Output relation of the RNN

One can consider that (3) computes a non-linear input-output relation. The input variables are:
• the rates of the external excitation signals to the n neurons, Λ = (Λ_1, ..., Λ_n);
• the rates of the external inhibition signals to the n neurons, λ = (λ_1, ..., λ_n).
The output variables are the state variables q = (q_1, ..., q_n), q_i ∈ [0, 1], or functions composed using them.
Learning with the Random Neural Network Model. We have developed a gradient-based learning algorithm for the RNN model (Gelenbe (1993) [25]) which computes the excitation and inhibition weights from training data: desired outputs y = (y_i), y_i ∈ [0, 1], i = 1, ..., n, are presented to the network for training. If C(q, y) is a continuously differentiable penalty or cost function, then the computation of the gradient ∂C/∂ω for the set of all weights ω = {ω_{ij}^+, ω_{ij}^-} is of complexity O(n^3). In fact, this computation is based on calculating ∂q/∂ω, which is obtained by inverting a matrix [I − W], where W = {w_{ij}} and, for each i, j, w_{ij} = w_{ij}^+ − w_{ij}^- q(x). For sparse networks (i.e. those which do not need to interconnect all neurons), and for planar networks which have non-crossing interconnections in the plane, the complexity is substantially lower. In the two following sub-sections we briefly discuss two rather different applications of the RNN.
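Before turning to those applications, the following hypothetical fragment cross-checks the gradient ∂C/∂ω^+ by finite differences, reusing rnn_fixed_point from the sketch above; it is deliberately brute-force and is not the O(n^3) analytical algorithm of [25].

```python
import numpy as np

def grad_C_fd(W_plus, W_minus, Lambda, lam, r, y, eps=1e-6):
    """Finite-difference estimate of dC/dw+ for C(q, y) = 0.5 * sum((q - y)^2).

    The algorithm of [25] obtains this gradient analytically in O(n^3) by
    inverting [I - W]; this brute-force version is only a numerical
    cross-check of that computation."""
    def C(Wp):
        q, _ = rnn_fixed_point(Wp, W_minus, Lambda, lam, r)
        return 0.5 * np.sum((q - y) ** 2)
    base, n = C(W_plus), len(r)
    g = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            Wp = W_plus.copy()
            Wp[i, j] += eps              # perturb one weight omega_ij^+
            g[i, j] = (C(Wp) - base) / eps
    return g
```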

2.2 Application of the RNN to the Traveling Salesman Problem (TSP)

In order to illustrate the use of the RNN for static, rather than adaptive, optimization we briefly discuss its application to the well known TSP. To do so, we create an artificial dynamical system we call the Dynamic Random Neural Network (DRNN) (Gelenbe et al. (1993) [26]), which constructs a Cohen-Grossberg type [94] dynamical behaviour. The DRNN assumes an additional time-delayed feedback defined by a Cohen-Grossberg equation [94]. For the undetermined parameters, and any q_i, we have:

\frac{dv_i}{d\tau} = A(q_i) \left[ B(q_i) - \frac{\partial F(q)}{\partial q_i} \right], \quad i = 1, ..., N    (5)

F(q) is a penalty function which will be optimized. Clearly, a good candidate for F(q) is the cost C defined above, or some related cost function. The gain function A(q_i) is the control parameter for the convergence rate, and the decay function B(q_i) allows us to place the attractors in appropriate positions of the state space. These functions constitute the dynamic parameters of the DRNN, whose performance is sensitive to their choice. We have applied the DRNN approach to a static optimization problem, the travelling salesman problem (TSP) [26], and have obtained very encouraging results on 10 and 20 city problems. On more than 1000 instances of the TSP, we were able to find optimum solutions in more than 70% of the cases, and we could do better than the nearest neighbour heuristic in more than 70% of the cases. The DRNN was observed to be superior to other neural methods such as those of Aiyer et al. [87, 88] and of Abe [85]. Thus we believe that it can provide an interesting extension of the RNN to dynamic, as well as static, optimization problems.

3 Distributed system behavior

In order to describe the direct adaptive control approach to load balancing in a distributed system, consider an M-node distributed system in which all nodes are interconnected via a high-speed network. Each node is a processing facility for a large scale distributed application such as a distributed simulation. The state s_i(t) of any node i at time t can have one of two values: 1 ("up") if it is in normal operating mode, and 0 ("down") if it has failed. The nodes are interconnected by a network which we do not represent in detail in terms of physical links or topology, but we assume that it is possible to send information between any two nodes via some network path if it is in normal operation mode. We denote by K(i, j) the network path from node i to node j, and represent its state by the variable e(i, j, t) = 1 if the path is "up", and e(i, j, t) = 0 if it is "down". If path K(i, j) is down, then messages or tasks cannot be sent from i to j. Note that K(i, j) and K(j, i) are, in general, distinct paths. It is important to know whether a node can be reached by, or can reach, other nodes. For instance, if several outgoing or ingoing paths of a node are "down", this may prevent the node from functioning properly. Thus we define the outgoing E_{O,i}(t) and ingoing E_{I,i}(t) communication states of node i by:

E_{O,i}(t) = \frac{\sum_{j=1, j \neq i}^{M} e(i, j, t)}{M - 1}, \qquad E_{I,i}(t) = \frac{\sum_{j=1, j \neq i}^{M} e(j, i, t)}{M - 1}.

Clearly, under normal operating conditions these quantities will be 1 for all nodes. The "communication state" of node i will be denoted by E_i(t) = E_{I,i}(t) \cdot E_{O,i}(t); if its value is less than 1, this implies that at least one of node i's ingoing or outgoing communication paths is "down".
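As a small illustration of these definitions (a sketch we add here, not code from the system), the communication states can be computed from a path-state matrix:

```python
import numpy as np

def communication_states(e):
    """Compute E_O,i(t), E_I,i(t) and E_i(t) from the path-state matrix e,
    where e[i, j] = 1 if path K(i, j) is "up" and 0 otherwise."""
    M = e.shape[0]
    off_diag = ~np.eye(M, dtype=bool)             # exclude the j = i terms
    E_O = (e * off_diag).sum(axis=1) / (M - 1)    # outgoing state of node i
    E_I = (e * off_diag).sum(axis=0) / (M - 1)    # ingoing state of node i
    return E_O, E_I, E_O * E_I                    # E_i = E_I,i * E_O,i
```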

Node and path states may change at any time. Each node i has a set of tasks running or waiting to run; the total load at time t due to these tasks is denoted by L_i(t). As they run, these tasks may need to communicate with each other via the paths which connect their respective nodes (or directly within the node, for tasks which reside on the same node). New tasks can be generated at nodes, and tasks can also "die", i.e. finish their execution and remove themselves from the system. Tasks and files can be moved dynamically from one node to another for specific reasons, such as:
• Dependability. If a node fails, or if some of its communication paths go down, its tasks and active files should be moved, if possible, to some other currently reliable node.
• Load Balancing. If for some reason some node is idle or lightly loaded, while other nodes are very heavily loaded, it makes sense to redistribute the workload to achieve better balance.
• Better Access to Files and Data. When large data sets reside at some node, it may be more efficient to move the tasks which need them to that same node, rather than to move the data.
The purpose of this research is to design, test and evaluate adaptive direct controllers using recurrent neural networks¹, which will manage the tasks of a large scale distributed application as the workload in the system fluctuates, and as some nodes and links fail. The algorithms we design will be tested on a distributed system platform consisting of high-end workstations (servers) interconnected by a local area network and by a high-speed switch.²

3.1 The Control Objective

In the distributed systems we consider, the state variables L_i(t) (the load at node i at time t) are directly observable quantities if a node is "up" and if its communication links are also "up". These quantities can also be known for nodes which are "down", if a global table of all created and non-terminated tasks is maintained at all nodes. s_i(t) (a node's reliability state) is directly observable with some delay: a priority message can be sent to a node, and if it does not respond by some time one concludes that it is not "up". e(i, j, t) (the reliability state of link K(i, j)) is observable if at least one of nodes i and j is up, and all communication links with a node which has failed are also considered to have failed. The deviation of node i with respect to the load balancing objective is measured by:

K_{LB}^{i}(t) = [L_i(t) - L^*(t)]^2,    (6)

where L^*(t) is the average load on nodes which are "up":

L^*(t) = \frac{\sum_{i=1}^{M} L_i(t) s_i(t)}{\sum_{i=1}^{M} s_i(t)}.    (7)

Let K_{LB}(t) be the mean square deviation of the load at all "up" nodes with respect to their average load:

K_{LB}(t) = \sum_{i=1}^{M} [L_i(t) s_i(t) - L^*(t)]^2.    (8)

Another performance measure of interest, K_d(t), evaluates the workload which remains at those nodes which are "down" or which are unable to communicate fully:

K_d(t) = \sum_{i=1}^{M} L_i(t) (1 - s_i(t)) \, 1[E_i(t) < 1],    (9)

where 1[.] is the "characteristic function" which takes the value 1 if its argument is true, and 0 otherwise. K_{LB}^{i}(t), K_{LB}(t) and K_d(t) are cost functions to be minimized by a properly functioning system. Thus the global cost function to be minimized could have the form:

K(t) = K_d(t) + \alpha K_{LB}(t)    (10)

¹ The basic tool used in these recurrent nets will be the "random neural network model" which we have developed (Gelenbe (1989, 90, 93) [21, 22, 25]), and its extensions which are proposed below.
² The target platform will be composed of IBM RS6000/360 servers interconnected by a high-speed (30 MB/s) low-latency (66 ns) switch.

The natural control variable is N_T(t), the node to which task T is currently (at time t) assigned for execution. It translates directly into the cost functions discussed above, since:

L_i(t) = \sum_{T} 1[N_T(t) = i] \, E[W_T],

where E[W_T] is the expected (average) work associated with task T; we will always write E[.] to denote the expectation. Because of the intrinsic uncertainty associated with the system's behaviour we may prefer to consider probabilistic counterparts, in which case we would deal with:

E[L_i(t)] = \sum_{T} Pr[N_T(t) = i] \, E[W_T],

where Pr[N_T(t) = i] is the probability that task T is assigned to node i at time t. Similarly we would have:

E[K_d(t)] = \sum_{i=1}^{M} E[L_i(t)] (1 - Pr[s_i(t) = 1]),

and the other measures can also be couched in probabilistic terms. It would be difficult – if not impossible – to fully characterize mathematically the system or "plant" being controlled, though the system state can be readily observed (with some obvious but relatively short time delays) in current distributed operating systems. A queueing network model of the system (Gelenbe and Mitrani (1980) [18]) could be constructed, but it would be difficult to parametrise and validate. Thus direct adaptive control, which does not use a model of the "plant", would be a better choice than indirect control for such systems. However, neural networks can be used for indirect adaptive control of distributed systems, since they can "learn" the plant's behaviour through observation and self-adaptation.
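As a minimal illustration (our own sketch, with illustrative names), the cost functions above can be evaluated directly from the observable state:

```python
import numpy as np

def global_cost(L, s, E, alpha=1.0):
    """Evaluate the cost functions (7)-(10) from observed system state.

    L : array of loads L_i(t);  s : node states s_i(t) in {0, 1};
    E : communication states E_i(t);  alpha : relative weight in (10)."""
    L_star = (L * s).sum() / s.sum()          # (7): average load of "up" nodes
    K_LB = ((L * s - L_star) ** 2).sum()      # (8): load-balancing deviation
    K_d = (L * (1 - s) * (E < 1)).sum()       # (9): work stranded at failed nodes
    return K_d + alpha * K_LB                 # (10): global cost K(t)
```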

3.2 Research on load balancing

There is an abundant and substantial literature on load sharing and balancing policies for distributed systems. However, it deals mostly – in fact quasi-exclusively – with systems which do not fail. Most of this work uses analytical modeling and simulation, and there are relatively few implementations and experimental evaluations. In the early formulations [101] load balancing was viewed as a file allocation problem. Models which study the file assignment problem are presented in [102]. Most of this work considers static load balancing algorithms to optimally place files at different sites in a distributed system. Soon after, researchers turned their attention to dynamic and adaptive algorithms for file placement, as in [103] where file storage cost and file availability are optimized with the possibility of duplicating files, for a known maximum number of file copies. In distributed databases, transaction response time is the measure of interest; it is optimized by efficient data allocation and appropriate transaction routing [122]. The typical goal of a static task allocation strategy is to deterministically allocate tasks to processors so that the total time to process all tasks is minimized [104, 105]. Tantawi and Towsley developed a technique for static optimal probabilistic assignment [106]. In [108] it is said that "in a network of nodes, there is a very high probability that at least one node is idle while jobs are queued at some other nodes," which motivates interest in the design of adaptive strategies for job allocation. Several adaptive load sharing strategies [109, 110, 111, 112, 113] for process migration have been proposed. Comparative and comprehensive studies include [114] and [110], which point to the potential benefits of adaptive load sharing and compare different policies, concluding that very simple adaptive load sharing policies, which collect a small amount of system state information and use this information in simple ways, can yield dramatic performance improvement. In distributed programming environments, where a program can be represented as parallel tasks or modules, the optimum module-to-processor assignment problem is a static load balancing problem [119, 120, 121]. Load balancing has been analyzed by modeling the system as a set of M parallel queues which represent the resources, and a central dispatcher which distributes load among the queues [110, 123, 124, 125, 126]. These studies include the analysis of static and dynamic policies. Simulation has also been extensively used [128, 129, 130]. Load balancing has also been investigated in other distributed resource environments. In computer networks, routing is an instance of load balancing [115, 116, 117, 118]. More recently, [131] presents a general method for quantitative and qualitative analysis of load balancing schemes. Most of these studies deal only with task or process movement; they assume that jobs and data are one information unit and can migrate to any site.

4 Direct adaptive control of distributed systems

We review here our work on direct adaptive control for load balancing in distributed systems. In our work the control is distributed, the load balancing policies are adaptive, and they utilize the current system state to make decisions. Our approach covers both process and file movement and distribution among nodes. We consider a message-passing, distributed-memory system in which both processes and data (files) can move among the various sites or nodes. These sites are completely interconnected by a network. Any task or process in the distributed system is created at some node, and it may require data (one or more files) for execution. Files reside normally in some node's storage, which is referred to as the file's "host node". During the execution of processes in the system, new files are created and old ones deleted. Initially, a file is always created at the local node and the other nodes are informed about its creation. We suggest that after a fixed time interval t_D, if files are not distributed equitably, file redistribution occurs so that each node has roughly the same occupied space. Files can also be deleted to make sure that the file-system is not full. File deletion is carried out only when 1) a delete process is invoked at a host node whose file system is full, and 2) the file is free. When a file is busy at a node, the node forwards the file request to some foreign node which holds the file-copy. Any subsequent requests for the same file must wait at the host node until the file copy reaches the requesting node. In the meantime, the node services the other requests in FCFS order. File placement and job execution are transparent to the users. A user submits a task at the node where it is logged in (the local node), but the task receives service at some arbitrary node determined by the load balancing algorithm. Consider four simple strategies to move tasks:
• NO-PROCESS-MIGRATION (NP): In this strategy only file-copies are allowed to move and all processes are executed at the local node. This strategy is not adaptive, and every file request is forwarded to the host node.
• ADAPTIVE-ALL-NODES (AD-ALL): This policy is direct adaptive. It chooses between executing the job at the local node or migrating it to any other node.
• ADAPTIVE-TWO-NODES (AD-TWO): Instead of choosing among all nodes, this direct adaptive algorithm only considers the local and the host node for process execution, leading to lower overhead than AD-ALL.
• RANDOM-TWO (RD-TWO): Here a random selection, with equal probability, is made between the host node and the local node for process execution. This non-adaptive policy is used as a yardstick.
In order to maintain a complete view of the system, load information about all nodes is maintained and periodically updated at each site. The load index L is based on total load information, using the number of processes assigned to the node and their estimated service time. It also includes all other load, such as the number of messages and file-copies which need processing or transfer. Every t_B time units, each node broadcasts its load to all other nodes, and each node maintains a load vector which is updated whenever information arrives from other nodes. It updates the load of other sites when it receives this information, and refreshes its own entry when a new process starts service. The overhead due to broadcast can be controlled by varying t_B, and a balance is struck between overhead and performance improvement. L includes the following components.
• For any task created at a node, t_E is the "pure" execution or run time of the process.
• t_{FT} is the time it takes to transfer a particular file from the host to the local node, including the file-busy overhead. This quantity depends on the size of the file.
• t_M is the time it takes to move a process to a remote site.
• L_l is the current total load at the local node, expressed as the estimated execution time of all waiting processes, messages, etc.
• L_r is the total load at the remote node, expressed in the same way.
The decision or control variable is π, the probability of deciding that the process will be executed at the local node, where it was created. It is computed by the algorithms discussed here. If 0.5 ≤ π, the decision should be not to move the process, though in certain cases a decision threshold larger than 0.5 may be used. π is updated by gradient descent:

\pi_{z+1} \Leftarrow \pi_z - \eta \left. \frac{\partial K}{\partial \pi} \right|_z    (11)

where z is the index of the update step we are considering, and K is an appropriate cost function obtained from the quantities defined above and evaluated at each update step. The gradient descent rule is guaranteed to reduce the cost at each step. η is the speed of the gradient descent, and the algorithm is stopped whenever two successive values of the cost function differ by less than some "level of diminishing returns". We now outline how a meaningful cost K can be chosen. The costs of executing the process locally, W_l, or of executing it remotely, W_r, are computed at the local node from the load information and job characteristics. The cost K is then the average value of the total execution time of the process under the policy π:

K = \pi W_l + (1 - \pi) W_r,    (12)

from which the gradient used in (11) can be computed. Let π_i denote the probability that the local node is chosen over some other node i. Then the node i* with the smallest value of π_i will be chosen for remote execution if the corresponding probability is less than 0.5; otherwise the execution will be local.
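For concreteness, the sketch below (our own minimal illustration, not the implemented controller) performs one update of (11) on the cost (12); since K = πW_l + (1 − π)W_r, the gradient ∂K/∂π is simply W_l − W_r, and the clamping of π to [0, 1] is an added safeguard.

```python
def update_pi(pi, W_local, W_remote, eta=0.05):
    """One gradient-descent step (11) on the cost K of equation (12).

    For K = pi * W_l + (1 - pi) * W_r the gradient is dK/dpi = W_l - W_r,
    so pi decreases (favoring remote execution) when local execution is
    the more expensive option, and vice versa."""
    grad = W_local - W_remote          # dK/dpi
    pi = pi - eta * grad               # equation (11)
    return min(max(pi, 0.0), 1.0)      # clamp: pi is a probability

# Decision rule: execute locally when pi >= 0.5 (or a stricter threshold).
```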

4.1 Experimental Results

We have experimentally evaluated our algorithms on a 5-node distributed system. The performance measures considered include the total load at a node, the average node response time R_n, and the average process response time R_p. The ADAPTIVE-TWO-NODES strategy compares favorably with the ADAPTIVE-ALL-NODES strategy, since the results it yields are slightly better while its overhead is much lower. For the ADAPTIVE-ALL-NODES policy, we noticed that only a very small fraction (1%) of the processes were transferred to foreign nodes. In Figures 1 and 2 we present measurements over time (in milliseconds) of the total load at a node. We compare the system running under AD-TWO with the system running under the NP policy in Figure 1, and with the system running under the RD policy in Figure 2. Clearly, the AD-TWO policy is very effective. In Figure 3 we compare the four policies with respect to the average process response time R_p. The abscissa represents the average process execution time. Each point on the curves corresponds to the average process response time measured over 1000 process executions, for a fixed value C of the average process execution time. All times are given in multiples of 128 microseconds ("ticks"), corresponding to the experimental platform's (nCube) time unit. The measurements are taken under the following load conditions: each node creates a new process as soon as the previous process it has created has completed execution. When the average process execution time is small, the four policies are roughly equivalent, though AD-TWO still remains the best. However, when the average process execution time is large, there is a great benefit in choosing the best load balancing policy, which is AD-TWO. The improvement (reduction) in response time is of the order of 50% for AD-TWO with respect to no load balancing (NP).

4.2 Proposed Direct Neural Control

In order to develop a neural controller for the unreliable distributed system, we may construct a neural network as follows. Our description concentrates on the case where nodes can fail but all links are reliable; the case of communication link failures can be handled in a similar manner. Recall that s_i(t), L_i(t), e(i, j, t) are all quantities which can be observed in the system, and which will be used to provide inputs to the controlling neural network. In the sequel we only develop the neural network which handles load balancing and failing nodes, in order to simplify the presentation. However the effect of failing communication paths will also be considered in our research. The neural network's neurons will in turn provide decision variables to move tasks around the system. All neurons are denoted ν(x), where the argument x will be chosen to signify the role of the neuron; its excitation probability will be denoted q_{[x]}. For each pair of nodes (i, j) there will be a neuron ν(i, j) whose role is to decide whether to move tasks from i to j. It will be excited by an external input L_i(t).m and inhibited by L_j(t).d, so as to move tasks from heavily loaded nodes towards more lightly loaded nodes. Furthermore, if c(i, j) is the cost (say in terms of time delay) of moving a task from i to j, neuron ν(i, j) will receive an external inhibitory signal of rate c(i, j).r to signify the cost of task movement. In addition, ν(i, j) receives external excitatory signals s_j(t).E and (1 − s_i(t)).F, so as to encourage rapid movement of tasks from i to j if node i is "down" and j is "up". ν(i, j) and ν(l, i), for any other node l, will mutually inhibit each other at a rate I so as to avoid useless motion of tasks back and forth. The choice of the network parameters m, d, r, E, F, I is then a major issue. Assuming given values for all of these parameters, the equations (3) for the network can be written and the network state computed using the following system of equations:

q_{[i,j]} = \frac{L_i(t) m + s_j(t) E + (1 - s_i(t)) F}{r_{[ij]} + L_j(t) d + e(i, j, t) r + \sum_{l \neq i} I q_{[l,i]}}, \quad i, j = 1, ..., M    (13)

where r_{[ij]} is the sum of all outgoing inhibitory weights of neuron ν(i, j). The neural controller should provide decisions which minimize a cost function obtained from the discussion in Section 3.1, with the following form:

C(t) = K_{LB}(t) + \alpha K_d(t)    (14)

where α is a positive constant which establishes the relative importance of not leaving tasks at nodes which have failed, with respect to the load balancing objective. The neural network will decide upon task movement from node i to node j if:

q_{[ij]} > q_{[ji]} + \theta

where θ is some positive threshold sufficiently large to justify the decision.
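A hypothetical sketch of this decision procedure follows: it iterates equation (13) to a fixed point and then applies the threshold rule above. The parameter names E_p and F_p (for E and F) and the treatment of r_{[ij]} as a single constant r_out are simplifying assumptions of this illustration, not part of the original design.

```python
import numpy as np

def movement_decisions(L, s, e, m, d, r_w, E_p, F_p, I, r_out, theta, iters=200):
    """Iterate equation (13) for every neuron nu(i, j), then apply the
    decision rule q[i,j] > q[j,i] + theta."""
    M = len(L)
    q = np.zeros((M, M))
    for _ in range(iters):                       # fixed-point iteration
        for i in range(M):
            for j in range(M):
                if i == j:
                    continue
                num = L[i] * m + s[j] * E_p + (1 - s[i]) * F_p
                den = (r_out + L[j] * d + e[i, j] * r_w
                       + I * (q[:, i].sum() - q[i, i]))
                q[i, j] = num / den
    # move tasks from i to j when neuron (i, j) clearly dominates (j, i)
    return [(i, j) for i in range(M) for j in range(M)
            if i != j and q[i, j] > q[j, i] + theta]
```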

4.3 Parameter choice via learning

Having set up the network described above, or a similar but more complex network which could take into account other aspects such as link failures, a first issue is to choose its "free" parameters, such as m, d, r, E, F, I(i, l), so as to get it to make the "best" decisions. These decisions should lead to small, if not minimal, values of the performance objective C. We propose to investigate gradient algorithms of the form:

v_{z+1} \Leftarrow v_z - \eta(v) \left. \frac{\partial C}{\partial v} \right|_z    (15)

where v is each one of the parameters m, d, r, E, F, I(i, l).

4.4 Updating the Decision Variables

The algorithms we will investigate will therefore periodically update the unspecified parameters as indicated above, each time the state of the distributed system is broadcast to all nodes. Then the decision variables q_{[ij]} will also be updated using a gradient rule:

q_{[ij]}^{z+1} \Leftarrow q_{[ij]}^{z} - \eta([ij]) \left. \frac{\partial C}{\partial q_{[ij]}} \right|_z    (16)

The updates will be stopped using a standard condition, and the resulting q[ij] can then be used to decide about task movement as discussed above.

5 Network architecture and routing

The "Cognitive Packet Network (CPN)" project uses on-line network control via self-awareness and self-adaptation [40, 41] to allow individual users to dynamically set up connections that address user-defined QoS goals. CPN Accepts Direction, by taking as input Goals prescribed by users. It exploits Self-Observation, with the help of smart packets (SPs), so as to be aware of its state, including the connectivity of fixed or mobile nodes, power levels at mobile nodes, topology, paths and path QoS. It performs Self-Improvement, and Learns from the experience of smart packets, using neural networks and genetic algorithms to determine routing schemes with better QoS. It will Deduce hitherto unknown routes by combining or modifying paths which have been previously learned, so as to improve QoS and robustness.


Figure 1: Comparison of the AD-TWO and NP policies over a long period of time


Figure 2: Comparison of the AD-TWO and RD policies over a long period of time


Figure 3: Average process response time as a function of average process execution time for the four policies AD-TWO, AD-ALL, RD and NP

CPN makes use of three types of packets: smart packets (SPs) for discovery; source-routed dumb packets (DPs) to carry payload; and acknowledgments (ACKs) to bring back information that has been discovered by SPs and DPs. Conventional IP packets tunnel through CPN, so that mixed IP and CPN networks operate seamlessly. SPs are generated by a user (1) requesting that a path having some QoS value be created to some CPN node, or (2) requesting to discover parts of the network state, including the location of certain fixed or mobile nodes, power levels at nodes, topology, paths and their QoS. SPs exploit the experience of other packets using random neural network (RNN) based Reinforcement Learning (RL) [44, 40, 41]. RL is carried out using a Goal which is specified by the user who generated the request for a connection. The decisional weights of an RNN are increased or decreased based on the observed success or failure of subsequent SPs to achieve the Goal. Thus RL will tend to prefer better routing schemes, more reliable access paths to data objects, and better QoS. Secondly, the system can deduce new paths to users, nodes and data objects by combining previously discovered paths and, using the estimated or measured QoS values of the new paths, select the best ones. This is conceptually similar to a genetic algorithm which generates new entities by combination or mutation of existing entities, and then selects the best among them using a fitness function. These new paths are then tested by forwarding packets along them, so that the actual QoS or success can be evaluated. When an SP arrives at its destination, an ACK is generated and heads back to the source of the request. It updates mailboxes (MBs) in the CPN nodes it visits with the information which has been discovered, and provides the source node with the successful path to the destination. All packets have a life-time constraint based on the number of nodes visited, to avoid overburdening the system with unsuccessful requests. A node in the CPN acts as a storage area for packets and mailboxes (MBs). It also stores and executes the code used to route smart packets. It has an input buffer for packets arriving from the input links, a set of mailboxes, and a set of output buffers which are associated with output links. The CPN software is integrated into the Linux kernel 2.2.x, providing a single application program interface (API) for the programmer to access CPN. CPN routing algorithms also run seamlessly over ad-hoc wireless and wired connections, without specific dependence on the nature (wired or wireless) of the links, using QoS awareness to optimize behavior across different connection technologies and wireless protocols.

Smart packet routing, outlined above, is carried out by code stored in each router, whose parameters are updated at the router. For each successive smart packet, the router computes the appropriate outgoing link based on the outcome of this computation. A recurrent RNN, with as many "neurons" as there are possible outgoing links, is used in the computation. The weights of the RNN are updated so that decision outcomes are reinforced or weakened depending on how they have contributed to the success of the QoS goal. In the RNN [25] the state q_i of the i-th neuron in the network is the probability that the i-th neuron is excited. Each neuron i is associated with a distinct outgoing link at a node. The q_i satisfy the system of non-linear equations:

q_i = \lambda^+(i) / [r(i) + \lambda^-(i)],    (17)

where

\lambda^+(i) = \sum_{j} q_j w_{ji}^+ + \Lambda_i, \qquad \lambda^-(i) = \sum_{j} q_j w_{ji}^- + \lambda_i,    (18)

w_{ji}^+ is the rate at which neuron j sends "excitation spikes" to neuron i when j is excited, w_{ji}^- is the rate at which neuron j sends "inhibition spikes" to neuron i when j is excited, and r(i) is the total firing rate of neuron i. For an n-neuron network, the network parameters are the n by n "weight matrices" W^+ = {w^+(i, j)} and W^- = {w^-(i, j)}, which need to be "learned" from input data. RL is used in CPN as follows. Each node stores a specific RNN for each active source-destination pair and each QoS class. The number of neurons of the RNN is specific to the router, since (as indicated earlier) each RNN neuron represents the decision to choose a given output link for a smart packet. Decisions are taken by selecting the output link j for which the corresponding neuron is the most excited, i.e. q_i ≤ q_j for all i = 1, ..., n. Each QoS class for each source-destination pair has a QoS Goal G, which expresses a function to be minimized, e.g. transit delay, probability of loss, jitter, or a weighted combination of these. The reward R which is used in the RL algorithm is simply the inverse of the goal: R = G^{-1}. Successive measured values of R are denoted by R_l, l = 1, 2, ...; these are first used to compute the current value of the decision threshold:

T_l = a T_{l-1} + (1 - a) R_l,    (19)

where 0 < a < 1 is a constant, typically close to 1. Suppose we have now taken the l-th decision, which corresponds to neuron j, and that we have measured the l-th reward R_l. We first determine whether the most recent value of the reward is larger than the previous value of the threshold, T_{l-1}. If that is the case, then we increase very significantly the excitatory weights going into the neuron that was the previous winner (in order to reward it for its new success), and make a small increase in the inhibitory weights leading to the other neurons. If the new reward is not greater than the previous threshold, then we simply increase moderately all excitatory weights leading to all neurons except the previous winner, and increase significantly the inhibitory weights leading to the previous winning neuron (in order to punish it for not being very successful this time). Let us denote by r_i the firing rates of the neurons before the update takes place:

r_i = \sum_{m=1}^{n} [w^+(i, m) + w^-(i, m)].    (20)

We first compute T_{l-1} and then update the network weights as follows, for all neurons i ≠ j:
• If T_{l-1} ≤ R_l:
  – w^+(i, j) ← w^+(i, j) + R_l,
  – w^-(i, k) ← w^-(i, k) + R_l / (n - 2), if k ≠ j.
• Else:
  – w^+(i, k) ← w^+(i, k) + R_l / (n - 2), k ≠ j,
  – w^-(i, j) ← w^-(i, j) + R_l.

Since it is the relative size of the weights, rather than their absolute values, which determines the state of the neural network, we then re-normalize all the weights by carrying out the following operations. First, for each i we compute:

r_i^* = \sum_{m=1}^{n} [w^+(i, m) + w^-(i, m)],    (21)

and then re-normalize the weights with:

w^+(i, j) ← w^+(i, j) \cdot \frac{r_i}{r_i^*}, \qquad w^-(i, j) ← w^-(i, j) \cdot \frac{r_i}{r_i^*}.

Finally, the probabilities qi are computed using the non-linear iterations (17), (18). The largest of the qi ’s is again chosen to select the new output link used to send the smart packet forward. This procedure is repeated for each smart packet for each QoS class and each source-destination pair.
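The following Python sketch (our own illustration, not the CPN implementation) puts steps (19)-(21) together for one RNN; reading the divisor n − 2 as excluding both neuron j and the updating neuron itself is our interpretation.

```python
import numpy as np

def rl_update(Wp, Wm, j, R, T_prev, a=0.9):
    """One CPN reinforcement-learning step for the RNN of one
    source-destination pair and QoS class.

    Wp, Wm : weight matrices W+ and W-;  j : previous winning neuron;
    R : measured reward G^-1;  T_prev : previous threshold T_{l-1}."""
    n = Wp.shape[0]
    T = a * T_prev + (1 - a) * R                 # (19) smoothed threshold
    r_before = (Wp + Wm).sum(axis=1)             # (20) firing rates r_i
    for i in range(n):
        if i == j:
            continue
        if T_prev <= R:                          # success: reward neuron j
            Wp[i, j] += R
            for k in range(n):
                if k != j and k != i:
                    Wm[i, k] += R / (n - 2)
        else:                                    # failure: punish neuron j
            for k in range(n):
                if k != j and k != i:
                    Wp[i, k] += R / (n - 2)
            Wm[i, j] += R
    r_after = (Wp + Wm).sum(axis=1)              # (21) rates r_i^*
    scale = (r_before / r_after)[:, None]        # renormalize each row
    return Wp * scale, Wm * scale, T
```

The q_i would then be recomputed from (17)-(18) and the next output link chosen as the neuron with the largest q_i.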

6 Cold Start Set-Up Time Measurements

One of the major requirements of a CPN is that it should be able to start itself with no initial information, by first randomly searching, and then progressively improving its behaviour through experience. Since the major function of a network is to transfer packets from some source S to some destination D, it is vital that the CPN be able to establish a path from S to D even when there is no prior information available in the network. The network topology we have used in these experiments is shown in Figure 4, with the source and destination nodes marked at the left and right ends of the diagram. The network contains 24 nodes, and each node is connected to 4 neighbours. Because of the possibility of repeatedly visiting the same node on a path, the network contains an unlimited number of paths from S to D. The fact that SPs are destroyed after they visit 30 nodes does limit this number, though it still leaves a huge number of possible paths. In this set of experiments, the network is always started with empty mailboxes, i.e. with no prior information about which output link is to be used from a node, and with the neural network weights set to identical values, so that the neural network decision algorithm at the nodes initially produces a random choice. Each point shown on the curves of Figures 5 and 6 is the result of 100 repetitions of the experiment under identical starting conditions. Let us first comment on the left-hand curve of Figure 5. An abscissa value of 10 indicates that the number of SPs used was 10, and – assuming that the experiment resulted in an ACK packet coming back to the source – the ordinate gives the average time (over the 100 experiments) that elapses between the instant the first SP is sent out and the instant the first ACK comes back. Note that the first ACK will be coming back from the correct destination node, and that it will be bringing back a valid forward path that can be used by the subsequent useful traffic. We notice that the average set-up time decreases significantly when we go from a few SPs to about 10, and after that the average set-up time does not improve appreciably. Its value, somewhere between 10 and 20 milliseconds, actually corresponds to the round-trip transit time through the hops. This does not mean that it suffices to have a small number of SPs at the beginning, simply because the average set-up time is only being measured for the SPs which are successful; unsuccessful SPs are destroyed after 30 hops. Thus the right-hand curve of Figure 5 is needed to obtain a more complete understanding of what is happening. Again, for an x-axis value of over 10 packets, we see that the probability of successfully setting up a path is 1, while with a very small number of packets this figure drops to about 0.65. These probabilities must of course be understood as the empirically observed fraction of the 100 tests which result in a successful connection. The conclusion from these two sets of data is that, to be safe when starting with an empty system, a fairly small number of SPs, in the range of 20 to 100, will provide almost guaranteed set-up of the connection, and the minimum average set-up time. The third curve (Figure 6) provides some insight into the dynamics of the path set-up. Inserting SPs into the network is not instantaneous; they are fed into the network sequentially by the source.
The rate at which they are fed in is determined by the processing time per packet at the source, and also by the link speeds. Since the link speed is 100 Mb/s and SPs are only some 200 bytes long at most, we think that the limiting factor here is the source node's processing time. Since, on the other hand, the previous curves show that connections are almost always established, we would expect to see that the connection is generally established before all the SPs have been sent out by the source. This is exactly what we observe on this third curve. The x-axis shows the number of SPs sent into the network, while the y-axis shows the average number sent in (over the 100 experiments) before the first ACK is received. For small numbers of SPs, up to a value of 10 or so, the relationship is linear. However, as the number of SPs being inserted into the network increases, we see that after (on average) 13 packets or so have been sent out, the connection is already established (i.e. the first ACK has returned to the source). This again indicates that a fairly small number of SPs suffices to establish a connection.

Figure 4: CPN Network Topology for Cold Start Experiments


Figure 5: Average Network Set-Up Time (Left) and Probability of Successful Connection (Right) from Cold Start, as a Function of the Initial Number of Smart Packets


Figure 6: Average Number of Smart Packets Needed to Obtain a Valid Path from Cold Start, as a Function of the Number of Smart Packets Successively Sent into the Network

REFERENCES

[1] B.D.O. Anderson, R.R. Bitmead, C.R. Johnson, P.V. Kokotovic, R.L. Kosut, I.M.Y. Mareels, L. Praly and B.D. Riedle (1986) Stability of Adaptive Systems: Passivity and Averaging Analysis, MIT Press.
[2] K.J. Åström (1970) Introduction to Stochastic Control Theory, Academic Press.

[3] … Conf. on Decision and Control, pp. 982-987.
[4] K.J. Åström and T. Hägglund (1988) Automatic Tuning of PID Regulators, Instrument Society of America.
[5] K.J. Åström and B. Wittenmark (1989) Adaptive Control, Addison-Wesley.
[6] K.J. Åström and B. Wittenmark (1990) Computer Controlled Systems, Second Edition, Prentice-Hall.
[7] R. Bellman and R. Kalaba (1959) "On adaptive control processes", IRE Transactions on Automatic Control, 4, pp. 1-9.
[8] R.R. Bitmead, M. Gevers and V. Wertz (1990) Adaptive Optimal Control: The Thinking Man's GPC, Prentice-Hall.
[9] D.W. Clarke and P.J. Gawthrop (1975) "Self-tuning Controller", Proceedings IEE, 122, pp. 929-934.
[10] D.W. Clarke and P.J. Gawthrop (1979) "Self-tuning Control", Proceedings IEE, 126, pp. 633-640.
[11] D.W. Clarke, C. Mohtadi and P.S. Tuffs (1987a) "Generalized Predictive Control – Part I. The Basic Algorithm", Automatica, 23, pp. 137-148.
[12] D.W. Clarke, C. Mohtadi and P.S. Tuffs (1987b) "Generalized Predictive Control – Part II. Extensions and Interpretations", Automatica, 23, pp. 149-160.
[13] H. Demircioglu and P.J. Gawthrop (1992) "Multivariable Continuous-time Generalized Predictive Control", Automatica, 28, 4, pp. 697-713.
[14] R.F. Drenick and R.A. Shahbender (1957) "Adaptive servomechanisms", AIEE Transactions, 76, pp. 286-292.
[15] P.J. Gawthrop (1980) "Hybrid Self-tuning Control", Proceedings IEE, 127, pp. 229-236.
[16] P.J. Gawthrop (1987) Continuous-time Self-tuning Control. Vol. 1: Design, Research Studies Press.
[17] P.J. Gawthrop (1990) Continuous-time Self-tuning Control. Vol. 2: Implementation, Research Studies Press.
[18] E. Gelenbe and I. Mitrani (1980) Analysis and Synthesis of Computer Systems, Academic Press, New York and London. Published in Japanese translation by Ohm-Sha Publishing Co., Tokyo (1988).
[19] V. Atalay, E. Gelenbe and N. Yalabık (1992) "The random neural network model for texture generation", International Journal of Pattern Recognition and Artificial Intelligence, Vol. 6, No. 1, pp. 131-141.
[20] V. Atalay and E. Gelenbe (1992) "Parallel algorithm for colour texture generation using the random neural network model", International Journal of Pattern Recognition and Artificial Intelligence, Vol. 6, No. 2-3, pp. 437-446.
[21] E. Gelenbe (1989) "Random neural networks with negative and positive signals and product form solution", Neural Computation, Vol. 1, No. 4, pp. 502-511.
[22] E. Gelenbe (1990) "Stability of the random neural network model", Neural Computation, Vol. 2, No. 2, pp. 239-247.
[23] E. Gelenbe, A. Stafylopatis and A. Likas (1991) "An extended random network model with associative memory capabilities", Proc. International Conference on Artificial Neural Networks (ICANN'91), Helsinki, June 1991.
[24] E. Gelenbe and A. Stafylopatis (1991) "Global behaviour of homogeneous random neural systems", Applied Math. Modelling, 15, pp. 535-541.

[25] E. Gelenbe, "Learning in the recurrent random neural network", Neural Computation, Vol. 5, No. 1, pp. 154-164, 1993.
[26] E. Gelenbe, V. Koubi and F. Pekergin, "Dynamical random neural approach to the traveling salesman problem", ELEKTRIK, Vol. 2, No. 2, pp. 1-10, 1994.
[27] E. Gelenbe and N. Hernandez, "Virus tests to maximize availability of software systems", Theoretical Computer Science, Vol. 125, pp. 131-147, 1994.
[28] E. Gelenbe, "G-networks: A unifying model for queueing networks and neural networks", Annals of Operations Research, Vol. 48, No. 1-4, pp. 433-461, 1994.
[29] E. Gelenbe, T. Feng and K.R.R. Krishnan, "Neural network methods for volumetric magnetic resonance imaging of the human brain", Proceedings of the IEEE, Vol. 84, No. 10, pp. 1488-1496, October 1996.
[30] C. Cramer, E. Gelenbe and H. Bakircioglu, "Low bit rate video compression with neural networks and temporal subsampling", Proceedings of the IEEE, Vol. 84, No. 10, pp. 1529-1543, October 1996.
[31] E. Gelenbe, A. Ghanwani and V. Srinivasan, "Improved neural heuristics for multicast routing", IEEE Journal on Selected Areas in Communications, Vol. 15, No. 2, pp. 147-155, February 1997.
[32] E. Gelenbe, K. Harmanci and J. Krolik, "Learning neural networks for detection and classification of synchronous recurrent transient signals", Signal Processing, Vol. 64, No. 3, pp. 233-247, 1998.
[33] H. Bakircioglu, E. Gelenbe and T. Kocak, "Image processing with the Random Neural Network model", ELEKTRIK, Vol. 5, No. 1, pp. 65-77, 1998.
[34] E. Gelenbe and C. Cramer, "Oscillatory corticothalamic response to somatosensory input", Biosystems, Vol. 48, No. 1-3, pp. 67-75, 1998.
[35] E. Gelenbe, Z.-H. Mao and Y.-D. Li, "Function approximation with spiked random networks", IEEE Trans. on Neural Networks, Vol. 10, No. 1, pp. 3-9, 1999.
[36] Y. Feng and E. Gelenbe, "Adaptive object tracking and video compression", Network and Information Systems Journal, Vol. 1, No. 4-5, pp. 371-400, 1999.
[37] E. Gelenbe and J.M. Fourneau, "Random neural networks with multiple classes of signals", Neural Computation, Vol. 11, No. 4, pp. 953-963, 1999.
[38] E. Gelenbe and T. Kocak, "Area-based results for mine detection", IEEE Trans. on Geoscience and Remote Sensing, Vol. 38, No. 1, pp. 1-14, January 2000.
[39] C. Cramer and E. Gelenbe, "Video quality and traffic QoS in learning-based subsampled and receiver-interpolated video sequences", IEEE Journal on Selected Areas in Communications, Vol. 18, No. 2, pp. 150-167, February 2000.
[40] E. Gelenbe, R. Lent and Z. Xu, "Measurement and performance of a cognitive packet network", Computer Networks, Vol. 37, pp. 691-791, 2001.
[41] E. Gelenbe, R. Lent and Z. Xu, "Design and performance of cognitive packet networks", Performance Evaluation, Vol. 46, pp. 155-176, 2001.
[42] E. Gelenbe and K. Hussain, "Learning in the multiple class random neural network", IEEE Trans. on Neural Networks, Vol. 13, No. 6, pp. 1257-1267, 2002.
[43] G.C. Goodwin and K.S. Sin (1984) Adaptive Filtering Prediction and Control, Prentice-Hall.
[44] U. Halici, "Reinforcement learning with internal expectation for the random neural network", European Journal of Operational Research, Vol. 126, No. 2, pp. 288-307, 2000.
[45] K.J. Hunt (1989) Stochastic Optimal Control Theory with Application in Self-tuning Control, Springer-Verlag.

[46] … tuning Controllers, Ed. K. Warwick, Peter Peregrinus.
[47] K.J. Hunt and G.R. Worship (1990) "Expert Systems for Self-tuning Control", in Knowledge-based Systems for Industrial Control, Eds. J. McGhee, M.J. Grimble and P. Mowforth, Peter Peregrinus.
[48] K.J. Hunt and D.G. Sbarbaro (1991b) "Neural Networks for Non-linear Internal Model Control", Proceedings IEE Part D, 138, pp. 431-438.
[49] K.J. Hunt and D.G. Sbarbaro (1991c) "Neural Networks in Process Control", Process Engineering, pp. 59-63.
[50] K.J. Hunt, D.G. Sbarbaro, R. Żbikowski and P.J. Gawthrop (1992) "Neural Networks for Control Systems – A Survey", Automatica, 28, pp. 1083-1112.
[51] P. Ioannou and P.V. Kokotovic (1983) Adaptive Systems with Reduced Models, Springer-Verlag.
[52] A.U. Levin and K.S. Narendra (1993) "Control of Non-linear Dynamical Systems using Neural Networks: Controllability and Stabilization", IEEE Trans. on Neural Networks, Vol. 4, No. 2, pp. 192-206.
[53] M. Kárný, A. Halousková, J. Böhm, R. Kulhavý and P. Nedoma (1985) "Design of Linear Quadratic Adaptive Control: Theory and Algorithms for Practice", Supplement to Kybernetika, 21, pp. 3-97.
[54] P.V. Kokotovic (Ed.) (1991) Foundations of Adaptive Control, Springer-Verlag.
[55] P.R. Kumar and P. Varaiya (1986) Identification and Adaptive Control, Prentice-Hall.
[56] L. Ljung (1977) "Analysis of Recursive Stochastic Algorithms", IEEE Trans. on Automatic Control, AC-22, pp. 551-575.
[57] L. Ljung (1987) System Identification – Theory for the User, Prentice-Hall.
[58] W.T. Miller, R.S. Sutton and P.J. Werbos (Eds.) (1991) Neural Networks for Control, MIT Press.
[59] K.S. Narendra and A.M. Annaswamy (1989) Stable Adaptive Systems, Prentice-Hall.
[60] K.S. Narendra and J.H. Taylor (1973) Frequency Domain Criteria for Absolute Stability, Academic Press.
[61] K.S. Narendra and S. Mukhopadhyay, "Intelligent control using neural networks", IEEE Control Systems Magazine, pp. 11-18, April 1992.
[62] K.S. Narendra and K. Parthasarathy, "Identification and control of dynamical systems using neural networks", IEEE Trans. on Neural Networks, Vol. 1, No. 1, pp. 4-29, 1990.
[63] O. Nerrand, P. Roussel-Ragot, L. Personnaz, G. Dreyfus, S. Marcos, O. Macchi and C. Vignat, "Neural network training schemes for non-linear adaptive filtering and modelling", Proc. IJCNN-91, Seattle, 1991.
[64] P.C. Parks (1966) "Lyapunov Redesign of Model Reference Adaptive Control Systems", IEEE Trans. on Automatic Control, AC-11, pp. 362-367.
[65] L. Personnaz, O. Nerrand and G. Dreyfus, "Apprentissage et mise en oeuvre des réseaux de neurones bouclés" [Training and implementation of recurrent neural networks], Journées Internationales des Sciences Informatiques, Tunis, 1990.
[66] L. Praly (1986) "Global Stability of a Direct Adaptive Control Scheme with Respect to a Graph Topology", in Adaptive and Learning Systems – Theory and Applications, Ed. K.S. Narendra, Plenum Press.
[67] S.Z. Qin, H.G. Su and T.J. McAvoy, "Comparison of four neural net learning methods for system identification", IEEE Trans. on Neural Networks, Vol. 3, No. 1, pp. 122-130, January 1992.
[68] C.E. Rohrs, L.S. Valavani, M. Athans and G. Stein (1985) "Robustness of Continuous-time Adaptive Control in the Presence of Unmodeled Dynamics", IEEE Trans. on Automatic Control, AC-30, pp. 881-889.

[69] … 1986.
[70] R.M. Sanner and J.J. Slotine, "Gaussian networks for direct adaptive control", IEEE Trans. on Neural Networks, Vol. 3, 1992.
[71] S.S. Sastry and M. Bodson (1989) Adaptive Control: Stability, Convergence, and Robustness, Prentice-Hall.
[72] D.G. Sbarbaro and P.J. Gawthrop (1990) "Learning complex mappings by stochastic approximation", Proc. Int. Joint Conference on Neural Networks (IJCNN-90), Vol. I, pp. 569-572.
[73] D.G. Sbarbaro and P.J. Gawthrop (1991) "Self-organization and Adaptation in Gaussian Networks", Proc. 9th IFAC/IFORS Symposium on Identification and System Parameter Estimation, Budapest, Hungary.
[74] N. Seube, "Construction of learning rule in neural networks that can find viable regulation laws to control problems by self-organization", International Neural Networks Conference, Vol. 1, pp. 209-212, Paris, 1990.
[75] J.J. Slotine and S.S. Sastry, "Tracking control of non-linear systems using sliding surfaces, with application to robot manipulators", Int. J. of Control, Vol. 38, No. 2, pp. 465-492, 1983.
[76] E.D. Sontag, "Feedback stabilization using two-hidden-layer nets", Report SYCON-90-11, Rutgers Center for Systems and Control, 1990.
[77] P.J. Werbos, "Backpropagation through time: What it does and how to do it", Proceedings of the IEEE, Vol. 78, No. 10, October 1990.
[78] A.W. Westerman, "Neural network control of a robotic manipulator arm for undersea applications", Neural Networks for Ocean Engineering Workshop, pp. 161-168, Washington DC, August 1991.
[79] B. Widrow, "The original adaptive broom balancer", IEEE Conf. on Circuits and Systems, 1987.
[80] B. Widrow and D. Nguyen, "The truck backer-upper: An example of self-learning in neural networks", International Joint Conference on Neural Networks, 1989.
[81] L.A. Zadeh, "On the definition of adaptivity", Proceedings of the IEEE, Vol. 51, pp. 569-570, March 1963.
[82] E.H.L. Aarts and J.H.M. Korst, "Boltzmann machines for traveling salesman problems", European Journal of Operational Research, No. 39, pp. 79-95, North-Holland, 1989.
[83] S. Abe, "Theories on the Hopfield neural networks", Proc. International Joint Conference on Neural Networks (IJCNN'89), Washington D.C., Vol. I, pp. 557-564, June 1989.
[84] S. Abe, "Determining weights of the Hopfield neural networks", Proc. International Conference on Artificial Neural Networks (ICANN'91), Helsinki, pp. 1507-1510, June 1991.
[85] S. Abe, "Global convergence and suppression of spurious states of the Hopfield neural networks", IEEE Trans. Circuits & Systems I, in press.
[86] D.H. Ackley, G.E. Hinton and T.J. Sejnowski, "A learning algorithm for Boltzmann machines", Cognitive Science, No. 9, pp. 147-169, 1985.
[87] S.V.B. Aiyer, M. Niranjan and F. Fallside, "A Theoretical Investigation into the Performance of the Hopfield Model", IEEE Transactions on Neural Networks, Vol. 1, No. 2, pp. 204-215, June 1990.
[88] S.V.B. Aiyer, M. Niranjan and F. Fallside, "On the optimization properties of the Hopfield model", Proc. International Conference on Neural Networks (ICNN'90), pp. 245-249, 1990.
[89] Y. Akiyama, A. Yamashita, M. Kajiura and H. Aiso, "Combinatorial Optimization with Gaussian Machines", Proc. International Joint Conference on Neural Networks (IJCNN'89), Washington D.C., Vol. I, pp. 533-540, 1989.

[90] … Systems, Man and Cybernetics, Vol. 19, pp. 1264-1274, 1989.
[91] R.W. Conners and C.A. Harlow, "A theoretical comparison of texture algorithms", IEEE Trans. Pattern Anal. Machine Intell., Vol. 2, pp. 204-222, 1980.
[92] G. Cross and R. Jain, "Markov random field texture models", IEEE Trans. Pattern Anal. Machine Intell., Vol. 5, pp. 25-39, 1983.
[93] H. Derin and H. Elliott, "Modeling and segmentation of noisy and textured images using Gibbs random fields", IEEE Trans. Pattern Anal. Machine Intell., Vol. 9, pp. 39-55, 1987.
[94] M.A. Cohen and S. Grossberg, "Absolute stability of global pattern formation and parallel memory storage by competitive neural networks", IEEE Trans. Sys. Man Cyber., Vol. 13, No. 5, 1983.
[95] J.J. Hopfield and D.W. Tank, "Neural computation of decisions in optimization problems", Biological Cybernetics, No. 52, pp. 141-152, 1985.
[96] A. Joppe, H.R.A. Cardon and J.C. Bioch, "A neural network for solving the traveling salesman problem on the basis of city adjacency in the tour", Proc. International Joint Conference on Neural Networks (IJCNN'90), San Diego, California, Vol. III, pp. 961-964, June 1990.
[97] E.L. Lawler, J.K. Lenstra, A.H.G. Rinnooy Kan and D.B. Shmoys, The Traveling Salesman Problem: A Guided Tour of Combinatorial Optimization, John Wiley and Sons, 481 pp., 1985.
[98] F. Pekergin, "Optimisation combinatoire par le calcul neuronal et parallélisme optimal" [Combinatorial optimization by neural computation and optimal parallelism], Doctoral Dissertation, Université René Descartes, March 1992.
[99] F. Pekergin, "Combinatorial optimization by random neural network model – application to the independent set problem", Proc. International Conference on Artificial Neural Networks (ICANN'92), Brighton, U.K., Sept. 1992.
[100] G.V. Wilson and G.S. Pawley, "On the stability of the traveling salesman problem algorithm of Hopfield and Tank", Biological Cybernetics, No. 58, pp. 63-70, 1988.
[101] P. Chen, "Optimal file allocation in Multi-level Storage Systems", Proc. AFIPS National Computer Conference, Vol. 42, pp. 277-282, 1973.
[102] L. Dowdy and D. Foster, "Comparative Models of the File Assignment Problem", ACM Computing Surveys, Vol. 14, pp. 287-313, June 1982.
[103] L. Laning and M. Leonard, "File allocation in a Distributed Computer and Communication Network", IEEE Trans. Computers, Vol. C-32, pp. 232-244, 1983.
[104] A. Thomasian and P. Bay, "Analytical queueing network models for parallel processing of task systems", IEEE Transactions on Computers, Vol. C-35, pp. 1045-1056, December 1986.
[105] D. Menasce and L. Barroso, "A methodology for performance evaluation of parallel applications on multiprocessors", Journal of Parallel and Distributed Computing, Vol. 14, pp. 1-14, 1992.
[106] A. Tantawi and D. Towsley, "Optimal Static Load Balancing in Distributed Computer Systems", Journal of the ACM, Vol. 32, pp. 445-465, April 1985.
[107] E. Silva and M. Gerla, "Queueing Network Models for Load Balancing in Distributed Systems", Journal of Parallel and Distributed Computing, Vol. 12, pp. 24-38, 1991.
[108] M. Livny and M. Melman, "Load balancing in homogeneous broadcast distributed systems", Proc. ACM Computer Network Performance Symposium, pp. 47-55, 1982.
[109] T. Liu, "Dynamic load balancing algorithm in homogeneous distributed systems", Proc. of the Sixth Int. Conf. on Distributed Computing Systems, pp. 216-222, May 1986.
[110] D. Eager, E. Lazowska and J. Zahorjan, "Adaptive Load Sharing in Homogeneous Distributed Systems", IEEE Transactions on Software Engineering, Vol. 12, pp. 662-676, 1986.

[111] … tems", Proc. of the Ninth International Conference on Distributed Computing Systems, Newport Beach, California, pp. 298-306, June 1989.
[112] H.-C. Lin, G.-M. Chiu and C. Raghavendra, "Performance Study of Dynamic Load Balancing Policies for Distributed Systems with Service Interruptions", IEEE INFOCOM'91 Proceedings, pp. 797-805, 1991.
[113] L. Ni, C. Xu and T. Gendreau, "A Distributed Drafting Algorithm for Load Balancing", IEEE Transactions on Software Engineering, Vol. SE-11, pp. 1153-1161, October 1985.
[114] Y. Wang and R. Morris, "Load sharing in Distributed Systems", IEEE Transactions on Computers, Vol. C-34, pp. 204-217, March 1985.
[115] A. Agrawala, S. Tripathi and G. Ricart, "Adaptive routing using a virtual waiting time technique", IEEE Trans. Software Eng., Vol. SE-8, pp. 76-81, 1982.
[116] C. Brown and M. Schwartz, "Adaptive routing in central computer communication networks", Proc. IEEE Int. Conf. on Computer Communications, pp. 12-16, June 1979.
[117] Y. Chow and W. Kohler, "Models for dynamic load balancing in a heterogeneous multiple processor system", IEEE Transactions on Computers, Vol. 28, pp. 354-361, 1979.
[118] L. Ni, "A Distributed Load Balancing Algorithm for Point to Point Computer Networks", Proc. of IEEE COMPCON, pp. 116-123, 1982.
[119] T. Chou and J. Abraham, "Load balancing in distributed systems", IEEE Transactions on Software Engineering, Vol. SE-8, July 1982.
[120] G. Rao, H. Stone and T. Hu, "Assignment of tasks in a distributed processor system with limited memory", IEEE Trans. Comput., Vol. C-28, pp. 291-298, April 1979.
[121] H. Stone, "Multiprocessor scheduling with the aid of network flow algorithms", IEEE Trans. Software Eng., Vol. SE-3, pp. 85-94, January 1977.
[122] A. Leff and P. Yu, "An adaptive strategy for load sharing in distributed database environments with information lags", Journal of Parallel and Distributed Computing, Vol. 13, pp. 91-103, 1991.
[123] C. Gao, J. Liu and M. Railey, "Load balancing algorithms in homogeneous distributed systems", Proc. 1984 International Conference on Parallel Processing, Silver Spring, MD, pp. 302-306, IEEE Computer Society, 1984.
[124] K. Lee and D. Towsley, "A comparison of priority-based decentralized load balancing policies", Proc. Performance '86 and 1986 ACM SIGMETRICS Conf., pp. 70-77, 1986.
[125] T. Yaun and H. Lin, "Adaptive load balancing for parallel queues", Proc. IEEE International Conf. on Communications, Amsterdam, 1984.
[126] L. Ni and K. Hwang, "Optimal load balancing in a multiple processor system with many job classes", IEEE Transactions on Software Engineering, Vol. 11, pp. 491-496, 1985.
[127] R. Gallager, "A minimum delay routing algorithm using distributed computation", IEEE Transactions on Communications, Vol. COM-25, pp. 73-85, 1977.
[128] R. Bryant and R. Finkel, "A stable distributed scheduling algorithm", Proc. 2nd Int. Conf. on Distributed Computing Systems, pp. 314-323, 1981.
[129] P. Kruger and R. Finkel, "An adaptive load balancing algorithm for a multicomputer", Tech. Rep. 539, Dept. of Computer Science, University of Wisconsin, Madison, April 1984.
[130] S. Zhou, "A Trace-Driven Simulation Study of Dynamic Load Balancing", IEEE Transactions on Software Engineering, Vol. 14, No. 9, pp. 1327-1341, September 1988.

[131] … Trans. on Parallel and Distributed Systems, Vol. 3, pp. 747-760, November 1993.
[132] A. Goscinski, Distributed Operating Systems: The Logical Design, Addison-Wesley, 1991.
[133] G.R. Ash, F. Chang and D. Medhi, "Robust Traffic Design for Dynamic Routing Networks", IEEE INFOCOM'91, 5D.1, pp. 508-514, 1991.
[134] K.R. Krishnan, "Dynamic traffic routing and network management", IEEE GLOBECOM'91, 37.7, pp. 1346-1350, 1991.
[135] J. Filipiak and P. Chemouil, "Routing and Bandwidth Management Options in High Speed Integrated Services Networks", IEEE GLOBECOM'91, 48.1, pp. 1685-1689, 1991.
[136] J.M. Jaffe, "Distributed Multi-Destination Routing: The Constraints of Local Information", SIAM J. Computing, Vol. 14, pp. 875-888, 1985.
[137] G.I. Stassinopoulos and M.G. Kazantzakis, "A Computationally Efficient Iterative Solution of the Multidestination Optimal Dynamic Routing Problem", IEEE Trans. on Communications, Vol. 39, pp. 1370-1378, 1991.
[138] K.R. Krishnan, "Performance Benefits of State-Dependent Routing of Telephone Traffic", IEEE ICC'90, 334.3, pp. 1314-1318, 1990.
