Neural Networks for Distributed Overload Control in Telecommunications Networks


S. Wu and K. Y. Michael Wong
Department of Physics, The Hong Kong University of Science and Technology, Clear Water Bay, Kowloon, Hong Kong.
{phwusi, [email protected]}

Abstract: Overload control in telecom networks is used to protect the network of call processing computers from excessive load during traffic peaks, and involves techniques of predictive control with limited local information. Here we propose a neural network algorithm, in which a group of neural controllers are trained using examples generated by a globally optimal control method. Simulations show that the neural controllers have better performance than local control algorithms in both the throughput and the response to traffic upsurges. Compared with the centralized control algorithm, the neural control significantly decreases the computation time for making decisions and can be implemented in real time.

I. Introduction

In recent years the use of neural networks for intelligent management and control in telecom networks has been widely studied. Traffic problems are often very difficult, for there are too many degrees of freedom and traffic processes are stochastic. The optimal solutions may be difficult to find or too complex. With their learning and generalization abilities, neural networks are good candidates for solving traffic problems, e.g. traffic routing [1], bandwidth allocation [2], call admission control in ATM networks [3] and so on. In these applications neural networks are used to approximate complex functions. Here we investigate the application of neural networks to overload control in telecom networks. In modern telecommunications networks overload control is critical to guarantee good system performance of the call setup and disconnection processes. It protects call processing computers from excessive load during traffic upsurges, based on a throttling mechanism for newly arriving requests. It is increasingly important with the emergence of Integrated Services Digital Networks (ISDN), in which numerous customer services are provided [4].

The overload control strategy has been well developed for traditional hierarchical networks [5]. For networks of distributed architecture, in which the role of each processor is equivalent, the situation is much more complex and difficult. Recent advances in the technology of the signalling network enable the transfer of a large amount of information instantly among system elements. This provides the possibility of network-wide control in distributed call processors. A natural approach is a centralized control method, which uses the information of the whole system to make globally optimal decisions. However, centralized control is often complex and time-consuming. It is also sensitive to system breakdowns and leads to excessive load in the signalling network. Some local control methods have been suggested, where each processor makes decisions depending only on locally available information [6], [7], [8]. Local control has the advantages of easy implementation and robustness to system breakdowns. However, it is generally not an optimal control, and the challenge is to coordinate the control steps taken by each processor to achieve globally optimal performance of the system. In this paper we propose a neural network method to combine the strengths of both local control and centralized control methods. The centralized controller serves as the teacher, who generates examples of globally optimal decisions. These examples are used to train the neural controllers, each located on a processor node. After learning, the neural controllers are deployed to infer the control decisions of the teacher based on locally available information.

To evaluate the performance of our method, we perform simulations on a metropolitan network. We compare the behaviours of the local, centralized and neural control methods, referred to as LCM, CCM and NNM respectively. The results show that NNM performs better than LCM in both throughput and the response to traffic upsurges. Compared with CCM, NNM significantly decreases the computation time for decision making and can be implemented in real time.

II. Overload Control in Telecommunication Networks

A. The Formulation of the Problem

Consider a distributed telecom network which consists of N fully connected switch stations. Call requests between two stations are assumed to arrive as Poisson processes. A call setup process is often complex and may generate various tasks. Here we adopt a simplified model, which captures the essential features of real processes (Table I): each call setup request initiates five jobs, corresponding to sending dial tones, receiving digits, routing, connecting the path and so on. Time delays between successive jobs are assumed to be stochastic and uniformly distributed. Jobs 1-3 are processed on the originating node, and jobs 4 and 5 on the terminating node. A processor is overloaded if its load status exceeds a predefined threshold. Overload control is implemented by gating new calls. The gate values, i.e. the fractions of admitted calls, are updated periodically. Taking into account hardware limitations, control speed and statistical fluctuations, we choose the control period to be 5 seconds. An effective control finds the optimal gate values and satisfies the following requirements: 1) maximum throughput, thereby avoiding unnecessary throttling; 2) balance between stations; 3) fairness; 4) robustness against changing traffic profiles and partial network breakdown; and 5) easy implementation.
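As a minimal illustration of this gating mechanism (a sketch only; the rates and gate values below are made up, not figures from the paper), the admitted traffic in one control period can be generated by drawing a Poisson number of call attempts per origin-destination pair and admitting each attempt independently with probability equal to the current gate value:

import numpy as np

rng = np.random.default_rng(0)

T = 5.0                                    # control period (seconds)
lam = np.array([[0.0, 2.0],                # assumed call request rates (calls/s)
                [1.5, 0.0]])               # between two stations
g = np.array([[1.0, 0.7],                  # current gate values g_ij: fraction
              [1.0, 0.9]])                 # of new calls admitted

attempts = rng.poisson(lam * T)            # Poisson call attempts in one period
admitted = rng.binomial(attempts, g)       # each attempt admitted with prob. g_ij

print(attempts)
print(admitted)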

B. The Local Control Method (LCM)

Local control methods are the currently adopted overload control strategies in distributed telecom networks. In these strategies, each node monitors its own load and makes decisions independently of all others. As shown in Fig. 1(a), we use two kinds of gate representing where throttling takes place. The gate values $g_i^o$ and $g_i^i$ denote respectively the acceptance rates of calls outgoing from and incoming to node $i$.

TABLE I

The Simplified Call Processing Model

Call processing on the originating node:
job 1 (50 ms) -> 1-3 s delay -> job 2 (150 ms) -> 2-8 s delay -> job 3 (50 ms)

Call processing on the terminating node:
job 4 (100 ms) -> job 5 (50 ms)

When a node is overloaded, the local controller first rejects outgoing call requests. If this is still not effective, the controller further adjusts the incoming gate. Priority is given to the terminating calls to maximize the throughput, since they have already consumed processing resources in their originating nodes. LCM is certainly not an optimal control, for there is no cooperation between nodes.
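The paper does not spell out the numerical update rule used by the local controller, so the following is only an illustrative sketch of an LCM-style step for a single node; the step size, the relaxation policy and the threshold value are our own assumptions.

def local_control_step(load, g_out, g_in, rho_max=0.85, step=0.1):
    # One illustrative LCM-style gate update for a single node.
    if load > rho_max:
        if g_out > 0.0:
            g_out = max(0.0, g_out - step)   # throttle outgoing calls first
        else:
            g_in = max(0.0, g_in - step)     # only then throttle incoming calls
    else:
        # relax gates when the node is no longer overloaded, restoring the
        # incoming gate (terminating calls) first
        if g_in < 1.0:
            g_in = min(1.0, g_in + step)
        else:
            g_out = min(1.0, g_out + step)
    return g_out, g_in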

C. The Optimal Centralized Control Method (CCM)

In the centralized control algorithm, network-wide information is available to the controller. Therefore, through cooperative control on each node, only outgoing calls need to be throttled (Fig. 1(b)). CCM is able to take into account the multiple objectives prescribed in Section A, in which case the order of priority of the objectives determines the optimization procedure. We consider the maximization of throughput to be the most important, since it is a measure of system performance. Load balancing comes next, since it is a measure of system performance under fluctuations. Fairness comes third. The technique can be generalized to other choices of priorities. Hence CCM can be implemented as a sequence of linear programming problems. Suppose $\lambda_{ij}(t)$ is the call rate from node $i$ to $j$ in the time period $t$. The gate value $g_{ij}(t)$ is the acceptance rate for outgoing calls from node $i$ to $j$ in the time period $t$. $\rho_{\max}$ is the predefined capacity threshold. Here it is set to 0.85, slightly below the nominal value of 1 to accommodate traffic fluctuations.

Step one: Maximize the throughput $\sum_{(i,j)} \lambda_{ij}(t) g_{ij}(t)$ subject to

$$0 \le g_{ij}(t) \le 1, \qquad (1)$$

$$\tau_0 \sum_j \lambda_{ij}(t) g_{ij}(t) + \tau_0' \sum_j \lambda_{ji}(t) g_{ji}(t) + \rho_i^{\mathrm{left}}(t) \le \rho_{\max}, \quad 1 \le i \le N, \qquad (2)$$

where $\tau_0$ is the averaged service time for a call on its originating node, which arrives in the current period, and $\tau_0'$ is the corresponding service time on the terminating node. $\rho_i^{\mathrm{left}}$ is the leftover load carried over from the previous periods on node $i$. It is given by

$$\rho_i^{\mathrm{left}}(t) = \tau_1 \sum_j \lambda_{ij}(t-1) g_{ij}(t-1) + \tau_2 \sum_j \lambda_{ij}(t-2) g_{ij}(t-2) + \tau_1' \sum_j \lambda_{ji}(t-1) g_{ji}(t-1) + \tau_2' \sum_j \lambda_{ji}(t-2) g_{ji}(t-2), \qquad (3)$$

where $\tau_1$ and $\tau_2$ are the averaged service times for a call on its originating node, which has arrived in the previous one and two periods respectively, and $\tau_1'$ and $\tau_2'$ are the corresponding service times on the terminating node. They are estimated by assuming the model in Table I. $\lambda_{ij}(t)$ is estimated by averaging over a few periods. The above problem can be solved using the active set searching method in linear programming [9]. It turns out that the optimal solution space is often degenerate: any point in the solution space has the same value of maximum throughput. Removing the degeneracy enables us to optimize the secondary objectives of load balancing and fairness, which is done within the subspace of maximum throughput. Mathematically, it requires that all active constraints (equalities) are preserved.

Step two: Optimize load balance by maximizing $\Delta$ in the subspace of maximum throughput, where

$$\tau_0 \sum_j \lambda_{ij}(t) g_{ij}(t) + \tau_0' \sum_j \lambda_{ji}(t) g_{ji}(t) + \rho_i^{\mathrm{left}}(t) + \Delta \le \rho_{\max}, \qquad (4)$$

and each $i$ denotes a non-full node in the subspace. Maximizing $\Delta$ decreases the load of the most congested nodes. As a result, the traffic load is more evenly distributed among the stations. If there is still degeneracy, which is generally the case in our numerical simulations, the third optimization step is needed.

Step three: Optimize fairness by maximizing $\epsilon$ in the subspace of maximum throughput and optimal load balance, where

$$\epsilon \le g_{ij}(t) \le 1, \qquad (5)$$

and each $g_{ij}(t)$ denotes a gate value left undetermined by the previous optimizations. Maximizing the lower bound $\epsilon$ avoids unfair rejection at some nodes. This step is repeated until all remaining degeneracies, if any, are lifted.

The method is very time-consuming. On HP 9000 workstations, one round of decision making for a network of 7 fully connected nodes needs 0.4 seconds, and the time grows as $N^6$ with the size $N$ of the network [9]. It is also susceptible to network breakdowns and brings heavy load to the signalling network, since network-wide information is necessary.
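As an illustration of step one, the throughput maximization can be cast directly as a linear program over the gate values. The sketch below uses an off-the-shelf solver (scipy.optimize.linprog) rather than the active-set code of [9]; the traffic matrix, service times and leftover loads are toy values of our own, roughly inspired by Table I. Steps two and three would be further linear programs restricted to the active constraints of the previous step.

import numpy as np
from scipy.optimize import linprog

def ccm_step_one(lam, tau0, tau0p, rho_left, rho_max=0.85):
    # Step one of CCM: maximize sum_ij lam_ij * g_ij subject to the per-node
    # capacity constraints (2) and 0 <= g_ij <= 1.
    N = lam.shape[0]
    pairs = [(i, j) for i in range(N) for j in range(N) if i != j]
    c = -np.array([lam[i, j] for i, j in pairs])     # linprog minimizes, so negate

    A = np.zeros((N, len(pairs)))
    for k, (i, j) in enumerate(pairs):
        A[i, k] += tau0 * lam[i, j]    # load placed on the originating node i
        A[j, k] += tau0p * lam[i, j]   # load placed on the terminating node j
    b = rho_max - np.asarray(rho_left)  # remaining capacity of each node

    res = linprog(c, A_ub=A, b_ub=b, bounds=[(0.0, 1.0)] * len(pairs), method="highs")
    g = np.zeros((N, N))
    for k, (i, j) in enumerate(pairs):
        g[i, j] = res.x[k]
    return g

# toy example with assumed numbers (not taken from the paper)
lam = np.array([[0.0, 3.0, 1.0],
                [2.0, 0.0, 2.5],
                [1.0, 1.5, 0.0]])
print(np.round(ccm_step_one(lam, tau0=0.25, tau0p=0.15, rho_left=np.zeros(3)), 2))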

Fig. 1. (a) Local control method; (b) Centralized control method; (c) The part of the neural network for calculating $g_{ij}$; (d) A 7-node fully connected network.

III. The Neural Network Control Method (NNM)

A neural network on a processor node receives input about the conditions of the connected call processors, and outputs corresponding control decisions about the gate values. It acquires this input-output mapping by a learning process using examples generated by CCM. It is difficult to train the neural networks properly using examples generated over a large range of traffic intensities, but on the other hand, training them at a fixed traffic intensity makes them inflexible to changes. Hence for each processor node, we build a group of neural networks, each member being a single-layer perceptron trained by CCM using examples generated at a particular background traffic intensity. The final output is an interpolation of the outputs of all members

using radial basis functions, which weight the outputs according to the similarity between the background and real-time traffic intensities. This enables the neural controller to make a smooth fit to the desired control function, which is especially important during traffic upsurges.

A. Training a Member of the Group of Neural Networks

For a neural controller associated with a node, the available information includes the measurements, within an updating period, of all the outgoing and incoming call attempts, and the processing loads of all nodes. Note that the processing load is the only global information fed into the neural controller. These measurements are used to estimate the background load and leftover jobs on the node itself and on the other nodes. To increase the learning efficiency of the neural networks, it is important to preprocess the inputs, so that they are most informative about the teacher control function. From the viewpoint of the neural controller at node $i$, the constraint of capacity is

$$\tau_0 \sum_j \lambda^o_{ij}(t) g_{ij}(t) + \hat\tau_0 \sum_j \hat\lambda^i_{ji}(t) + \rho_i^{\mathrm{left}}(t) \le \rho_{\max}, \qquad (6)$$

where $\hat\tau_0$ is the averaged service time for an incoming call arriving in the current period, and $\hat\lambda^i_{ji}(t)$ is the measured rate of incoming call attempts at node $i$ from node $j$. At the same time, the controller at node $i$ should consider the constraints of capacity at the other nodes $j \ne i$, that is

$$\tau_0' \lambda^o_{ij}(t) g_{ij}(t) + \tau_0 \lambda^o_{ji}(t) g_{ji}(t) + \rho_{ij}^{\mathrm{left}}(t) + \rho_j^{\mathrm{back}}(t) \le \rho_{\max}, \quad j \ne i, \qquad (7)$$

where the first two terms are the processing load on node $j$ generated by the traffic flow between nodes $i$ and $j$, and $\rho_{ij}^{\mathrm{left}}$ is the corresponding leftover load. $\rho_j^{\mathrm{back}}$ is the background processing load between node $j$ and the other nodes excluding node $i$. To the neural controller, the information of $\lambda^o_{ji}$ and $g_{ji}$ for $j \ne i$ is not available. We estimate $\lambda^o_{ji}(t) g_{ji}(t)$ by $\hat\lambda^i_{ji}(t)$. $\rho_j^{\mathrm{back}}(t)$ is estimated by averaging over a few periods. For simplicity, we rewrite equations (6) and (7) as

$$\tau_0 \sum_j \lambda^o_{ij}(t) g_{ij}(t) \le \tilde\rho_i, \qquad (8)$$

$$\tau_0' \lambda^o_{ij}(t) g_{ij}(t) \le \tilde\rho_j, \quad j \ne i, \qquad (9)$$

where $\tilde\rho_i = \rho_{\max} - \hat\tau_0 \sum_j \hat\lambda^i_{ji}(t) - \rho_i^{\mathrm{left}}(t)$ and $\tilde\rho_j = \rho_{\max} - \tau_0 \lambda^o_{ji}(t) g_{ji}(t) - \rho_{ij}^{\mathrm{left}}(t) - \rho_j^{\mathrm{back}}(t)$.

To find the most informative inputs to the neural networks, we consider for illustration a simple network of 3 fully connected nodes. The feasible solution space satisfying the above constraints is shaded in Fig. 2. The following variables are important in reflecting the geometry of the shaded region: (a) the range along the direction of $g_{ij}(t)$, given by $\tilde\rho_j / (\tau_0' \lambda^o_{ij})$ for $j \ne i$. Since $g_{ij}(t)$ lies between 0 and 1, we let $\min(\tilde\rho_j / (\tau_0' \lambda^o_{ij}), 1)$ be $N-1$ inputs to the neural controller at node $i$. (b) the distance of the plane corresponding to constraint (8) from the origin, given by $\tilde\rho_i / (\tau_0 [\sum_j (\lambda^o_{ij}(t))^2]^{1/2})$. Since this is bounded above by $\sqrt{N}$, we let $\min(\tilde\rho_i / (\tau_0 [\sum_j (\lambda^o_{ij}(t))^2]^{1/2}), \sqrt{N})$ be the $N$th input to the neural network at node $i$.

Fig. 2. A simple diagram to illustrate the solution space of inequalities (8) and (9) for node 1. The axes are $g_{12}(t)$ and $g_{13}(t)$; the sloping boundary is the line $\tau_0 \lambda^o_{12}(t) g_{12}(t) + \tau_0 \lambda^o_{13}(t) g_{13}(t) = \tilde\rho_1$, and the axis intercepts are $\tilde\rho_2 / (\tau_0' \lambda^o_{12})$ and $\tilde\rho_3 / (\tau_0' \lambda^o_{13})$.

The other $N-1$ inputs consist of the outgoing call attempts $\lambda^o_{ij}$ for node $i$. We normalize $\lambda^o_{ij}$ by the factor $[\sum_l (\lambda^o_{il})^2]^{1/2}$, since according to the constraints in CCM, they represent the optimization direction in the space of gate values. The above inputs form a $2N-1$ dimensional vector $\xi_1$ fed to each neural network in the group, each trained by a distinct training set of examples.
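A sketch of this input preprocessing is given below. It is an illustration only: the variable names are ours, and the quantities $\tilde\rho_i$ and $\tilde\rho_j$ are assumed to have already been estimated from the measured loads and call attempts as described above.

import numpy as np

def build_xi1(lam_out, rho_tilde_i, rho_tilde_j, tau0, tau0p):
    # lam_out[j]     : measured outgoing call attempts lambda^o_ij(t), j != i
    # rho_tilde_i    : remaining capacity of node i itself, rho~_i
    # rho_tilde_j[j] : remaining capacities rho~_j of the other nodes
    lam_out = np.asarray(lam_out, dtype=float)
    rho_tilde_j = np.asarray(rho_tilde_j, dtype=float)
    N = lam_out.size + 1
    eps = 1e-12                              # guard against zero measured rates

    # (a) ranges of the feasible region along each g_ij direction, clipped to 1
    ranges = np.minimum(rho_tilde_j / (tau0p * lam_out + eps), 1.0)

    # (b) distance of the node-i capacity plane (8) from the origin, clipped to sqrt(N)
    norm = np.linalg.norm(lam_out) + eps
    dist = min(rho_tilde_i / (tau0 * norm), np.sqrt(N))

    # remaining N-1 inputs: outgoing attempts normalized to a unit vector
    direction = lam_out / norm

    return np.concatenate([ranges, [dist], direction])   # length 2N-1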

The $k$th member outputs the gate values $g^k_{ij}$ according to

$$g^k_{ij} = f\!\left( \sum_{n=1}^{2N-1} J^k_{ijn} \xi_{1n} + J^k_{ij0} \right), \qquad (10)$$

where $f(h) = (1 + e^{-h})^{-1}$ is the sigmoid function. The couplings $J^k_{ijn}$ and the biases $J^k_{ij0}$ are obtained during the learning process by gradient descent minimization of an energy function

$$E = \frac{1}{2} \sum_{\mu,k} \left( O^{\mu,k}_{ij} - g^{\mu,k}_{ij} \right)^2, \qquad (11)$$

where $O^{\mu,k}_{ij}$ is the optimal decision of $g_{ij}$ prescribed by the teacher for example $\mu$ in the $k$th training set, and $g^{\mu,k}_{ij}$ is the output of the $k$th member of the group of neural networks.
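Training one member of the group thus amounts to gradient descent on the energy (11) through the sigmoid of (10). A minimal sketch for a single output gate of one member (the learning rate and number of iterations are assumptions, not values from the paper):

import numpy as np

def sigmoid(h):
    return 1.0 / (1.0 + np.exp(-h))

def train_member(X, O, lr=0.1, epochs=2000, seed=0):
    # X: (P, 2N-1) array of input vectors xi_1 for one background traffic intensity
    # O: (P,) teacher gate values for this gate, generated by CCM
    rng = np.random.default_rng(seed)
    P, D = X.shape
    J = rng.normal(scale=0.1, size=D)        # couplings J_ijn
    J0 = 0.0                                 # bias J_ij0
    for _ in range(epochs):
        g = sigmoid(X @ J + J0)              # member outputs, eq. (10)
        err = g - O                          # dE/dg per example
        grad_h = err * g * (1.0 - g)         # chain rule through the sigmoid
        J -= lr * X.T @ grad_h / P           # gradient descent on eq. (11)
        J0 -= lr * grad_h.mean()
    return J, J0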

B. Implementation of the Group of Neural Networks

Consider the part of the neural controller for calculating the gate value $g_{ij}$, as shown in Fig. 1(c) (the other parts have the same structure). The $k$th hidden unit is trained at a particular traffic intensity, and outputs the decision $g^k_{ij}(\xi_1)$ for the $2N-1$ dimensional input vector $\xi_1$ described in Section A. To weight the contribution of the $k$th output, we consider an $N-1$ dimensional input vector $\xi_2$ which consists of the call rates $\lambda^o_{ij}(t)$, $j \ne i$. The weight $f^k(\xi_2)$ is the radial basis function (RBF) [10] given by

$$f^k(\xi_2) = \frac{\exp[-(\xi_2 - c_k)^2 / 2\sigma_k^2]}{\sum_l \exp[-(\xi_2 - c_l)^2 / 2\sigma_l^2]}, \qquad (12)$$

where $c_k$ is the RBF center and $\sigma_k$ is the size of the RBF cluster. In our case, $c_k$ is the input vector $\xi_2$ averaged over the $k$th training set of examples, and describes the background traffic intensity. $\sigma_k^2$ is the variance of the input vector $\xi_2$ in the $k$th set; for Poisson traffic it is equal to $\sum_{j \ne i} \langle \lambda^o_{ij} \rangle / T$. The final output of the neural network is a combination of the weighted outputs of all hidden units, that is,

$$g_{ij}(\xi_1, \xi_2) = \sum_k f^k(\xi_2)\, g^k_{ij}(\xi_1). \qquad (13)$$

Since the numerator of (12) is a decreasing function of the distance between the vector $\xi_2$ and $c_k$, the RBF center nearest to $\xi_2$ has the largest weight. If $\xi_2$ moves between the RBF centers, their relative weights change continuously, hence providing a smooth interpolation of the control function.
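At run time the trained members are combined according to (12) and (13). The sketch below assumes the centers $c_k$ and widths $\sigma_k$ have already been estimated from the training sets as described above.

import numpy as np

def combine_members(xi2, member_outputs, centers, sigmas):
    # xi2            : current background-traffic vector (call rates lambda^o_ij, j != i)
    # member_outputs : g^k_ij(xi_1) produced by each trained member, shape (K,)
    # centers        : RBF centers c_k (mean xi_2 of each training set), shape (K, N-1)
    # sigmas         : RBF widths sigma_k per member, shape (K,)
    xi2 = np.asarray(xi2, dtype=float)
    d2 = np.sum((centers - xi2) ** 2, axis=1)           # squared distances to centers
    w = np.exp(-d2 / (2.0 * np.asarray(sigmas) ** 2))   # unnormalized RBF weights
    w /= w.sum()                                        # eq. (12): normalized weights f^k
    return float(np.dot(w, member_outputs))             # eq. (13): blended gate value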

IV. Simulation results

To compare the above three methods, we perform simulations on part of the Hong Kong metropolitan network (for convenience we call it the Jumbo Network), which consists of 7 fully connected switch stations (Fig. 1(d)). The call arrival rates between different nodes under normal traffic conditions are shown in Table II. The RBF centers of the neural networks are chosen at 1, 2, 3, 4, 6 and 8 multiples of the normal traffic intensity. The performance of the three control methods is compared using the following criteria:

(1) Steady Throughput. Fig. 3 compares the steady throughputs of the system under different traffic intensities. The simulation for each case is executed for 4000 seconds and the throughput is measured every 10 seconds. We see that the neural control performs comparably with the centralized control, and has a large improvement in throughput over the local control for a large range of traffic intensities.

Fig. 3. Network throughput under constant traffic. The traffic intensities are measured in multiples of the normal rates.

(2) Control Error. In reality control errors are unavoidable due to statistical fluctuations. We define the control error (CE) as the fraction of the processing load which exceeds the nominal value 1, given by

$$\mathrm{CE} = \frac{\sum_{i,t} (\rho_i(t) - 1)\,\Theta(\rho_i(t) - 1)}{\sum_{i,t} \rho_i(t)}, \qquad (14)$$

where $\rho_i(t)$ is the actual load on node $i$ in the control period $t$, and $\Theta(x)$ is the step function, which equals 1 when $x \ge 0$ and 0 otherwise. CE reflects the stability of the control against fluctuations. Fig. 4 compares the control errors of the three methods under constant traffic. It shows that CCM has the lowest error at all traffic intensities. For light traffic, NNM and LCM have comparable control errors, whereas for heavy traffic, NNM performs better than LCM.

(3) Response to Traffic Upsurges. Of particular interest to network management is the response of the system to traffic upsurges. In reality this occurs in such cases as phone-in programs, telebetting and the hoisting of typhoon signals, when the amount of call attempts abruptly increases. It is expected that the control schemes should respond as fast as possible to accommodate the changing traffic condition. Fig. 5 shows the system responses during a traffic upsurge. We also measured the averaged control errors of the three methods for the

subsequent 50 seconds, referred to as CE_L, CE_C and CE_N respectively. We see that NNM has higher throughput than CCM, but with a slight, tolerable compromise in control error. Both are much better than LCM.
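Since $(\rho_i(t) - 1)\,\Theta(\rho_i(t) - 1) = \max(\rho_i(t) - 1, 0)$, CE is easily computed from the recorded loads; a minimal sketch:

import numpy as np

def control_error(rho):
    # rho: array of loads rho_i(t) over all nodes and control periods, eq. (14)
    rho = np.asarray(rho, dtype=float)
    excess = np.where(rho > 1.0, rho - 1.0, 0.0)   # load exceeding the nominal value 1
    return excess.sum() / rho.sum()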

Fig. 4. Control errors under different traffic intensities.

Fig. 5. Network throughput during a traffic upsurge. The traffic intensity increases to 6 times the normal rates at $t = 40$ s. The averaged control errors over the subsequent 50 seconds are CE_N = 0.095, CE_C = 0.071 and CE_L = 0.090.

The neural controller also significantly decreases the time for making decisions. For the network we simulated, it is about 10% of the CPU time of CCM.

V. Conclusion and Discussion

In summary, we have presented a neural network control algorithm for overload control in telecom systems. The neural controllers are implemented in each station and learn the control functions prescribed by an optimal centralized teacher. The approach combines the advantages of both local and centralized control methods, and achieves a simple, adaptive, robust and near-optimal control. In our problem, the teacher function is a complex optimization task with multiple objectives. Instead of learning the task by a single sophisticated student network, the task is divided among a group of local student networks with simple architecture, which cooperate in the control function. This methodology successfully avoids the shortcomings of traditional centralized and local control methods. The control technique can be generalized to the distributed control of many large systems such as the ATM network and the wireless cellular network.

TABLE II
Call arrival rates (per hour) of the Jumbo Network in the condition of normal traffic.

        S0    S1    S2    S3    S4    S5    S6
S0       0   480  1070  1040  1640   280   670
S1     360     0   220   320   390   240   300
S2     900   400     0  2100  1550   450   520
S3     700   410  2090     0  1020   270   410
S4    1080   280  1300   970     0   380   400
S5     250   220   290   170   230     0   210
S6     500   260   490   430   450   230     0

References
[1] Lor, W. K. F. & Wong, K. Y. M. (1995) Decentralized neural dynamic routing in circuit-switched networks, Proceedings of the International Workshop on Applications of Neural Networks to Telecommunications 2 (IWANNT-95), Ed. J. Alspector et al., Lawrence Erlbaum Associates, New Jersey, pp. 137-144.
[2] Campbell, P. K., Dale, M., Ferra, H. L. & Kowalczyk, A. (1995) Experiments with neural networks for real time implementation of control, Advances in Neural Information Processing Systems 8, Cambridge, MA: MIT Press.
[3] Hiramatsu, A. (1990) ATM communications network control by neural networks, IEEE Transactions on Neural Networks, Vol. 1, pp. 122-130.
[4] Eckberg, A. E. & Wirth, P. E. (1988) Switch overload and flow control strategies in an ISDN environment, Traffic Engineering for ISDN Design and Planning, pp. 425-434.
[5] Hanselka, P., Oehlerich, J. & Wegmann, G. (1989) Adaptation of the Overload Regulation Method STATOR to Multiprocessor Controls and Simulation Results, ITC-12, pp. 395-401.
[6] Manfield, D., Denis, B., Basu, K. & Rouleau, G. (1985) Overload Control in a Hierarchical Switching System, ITC-11, pp. 894-900.
[7] Villen-Altamirano, M., Morales-Andres, G. & Bermejo-Saez, L. (1985) An Overload Control Strategy for Distributed Control Systems, ITC-11, pp. 835-841.
[8] Kaufman, J. S. & Kumar, A. (1989) Traffic overload control in a fully distributed switching environment, ITC-12, pp. 386-394.
[9] Best, M. J. & Ritter, K. (1985) Linear Programming: Active Set Analysis and Computer Programs, Englewood Cliffs, NJ: Prentice-Hall.
[10] Hertz, J., Krogh, A. & Palmer, R. G. (1991) Introduction to the Theory of Neural Computation, London: Addison-Wesley.

