The Most General Framework of Continuous Hopfield Neural Networks

Jan van den Berg

Erasmus University Rotterdam, room H4-23, P.O. Box 1738, 3000 DR Rotterdam, The Netherlands Email: [email protected]

Abstract. A generalization of the energy function of the classical continuous Hopfield neural network is presented, the stationary points of which coincide with the complete set of equilibrium conditions of the network. Instead of applying statistical mechanical arguments, a direct proof is given here. Thereupon, an energy expression is presented for a Hopfield network having built-in constraints, namely of the so-called Potts-glass type. By then performing a far-reaching generalization, the most general framework of continuous Hopfield networks is created, where almost arbitrary energy functions can be chosen and where constraints of all kinds can be incorporated in the neural net. The analysis includes several stability theorems concerning various sets of differential equations. Furthermore, the computational results of a few simulations are presented, some of which appear to have startling outcomes. The paper finishes with a discussion of the possibilities to apply the presented theoretical results, as well as an outlook on future topics of research.

1: The classical continuous Hopfield network

In 1984, Hopfield introduced a deterministic model [5] having continuous-valued neurons and an energy E_c(V) conform

    E_c(V) = -1/2 Σ_{i,j} w_ij V_i V_j - Σ_i I_i V_i + Σ_i ∫_0^{V_i} g⁻¹(v) dv        (1)
           = E(V) + E_h(V),                                                           (2)

where E(V) comprises the first two terms and E_h(V) the last one. E(V) is the cost function to be minimized and V = (V_1, V_2, ..., V_n) ∈ [0,1]^n is often called the state vector, where V_i represents the output value of neuron i. The term E_h(V), which we call the 'Hopfield term', has a statistical mechanical interpretation [1, 3]. Its general effect is a displacement of the minima of E(V) towards the interior of the state space [5], whose magnitude depends on the current 'temperature' in the system: the higher the temperature, the larger the displacement towards the interior¹. The motion equations corresponding to (1) are

    U̇_i = -∂E_c(V)/∂V_i = Σ_j w_ij V_j + I_i - U_i,        (3)

¹ In case of choosing the sigmoid V_i = 1/(1 + exp(-βU_i)) as the transfer function g(U_i), the role of the temperature is played by 1/β and E_h(V) can be written as E_h(V) = (1/β) Σ_i (V_i ln V_i + (1 - V_i) ln(1 - V_i)).

where continuously Vi = g(Ui ) should hold. Ui represents the input of neuron i. After an

Figure 1. The original continuous Hopfield network.

initialization, the network is generally not in an equilibrium state. Then, while keeping V_i = g(U_i) valid, the input values U_i are adapted in agreement with (3). The following theorem [5] gives conditions under which an equilibrium state will eventually be reached:

Theorem 1 (Hopfield). If (w_ij) is a symmetric matrix and if ∀i: V_i = g(U_i) is a monotone increasing, differentiable function, then E_c is a Lyapunov function for the motion equations (3).

Under the given conditions, the theorem guarantees convergence to an equilibrium state of the neural net where

    ∀i: V_i = g(U_i)  ∧  U_i = Σ_j w_ij V_j + I_i.        (4)
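As a concrete illustration of Theorem 1 and the equilibrium conditions (4), the following minimal numerical sketch (not part of the original paper; the network size, weights, and step size are arbitrary choices) integrates the motion equations (3) with the sigmoid of footnote 1 and checks that E_c never increases along the trajectory.

```python
# Minimal sketch of Theorem 1: Euler-integrate the motion equations (3) with a
# sigmoid transfer function and verify that E_c of equation (1) never increases.
# All concrete values (n, beta, dt, random data) are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)
n, beta, dt = 8, 2.0, 0.01

A = rng.normal(size=(n, n))
W = (A + A.T) / 2                 # symmetric weight matrix, as Theorem 1 requires
np.fill_diagonal(W, 0.0)
I = rng.normal(size=n)

def g(U):                         # sigmoid transfer function from footnote 1
    return 1.0 / (1.0 + np.exp(-beta * U))

def E_c(V):                       # equation (1): cost term E(V) plus Hopfield term E_h(V)
    E = -0.5 * V @ W @ V - I @ V
    E_h = (V * np.log(V) + (1 - V) * np.log(1 - V)).sum() / beta
    return E + E_h

U = 0.1 * rng.normal(size=n)
energies = []
for _ in range(5000):
    V = g(U)
    energies.append(E_c(V))
    U += dt * (W @ V + I - U)     # motion equations (3)

print("E_c non-increasing:", all(a >= b - 1e-9 for a, b in zip(energies, energies[1:])))
print("residual of (4):", np.max(np.abs(W @ g(U) + I - U)))
```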

2: A first generalization

Theorem 2. The energy expression (1) can be generalized to an energy expression F_g1 (in both the input and the output values of all neurons), defined by

    F_g1(U, V) = -1/2 Σ_{i,j} w_ij V_i V_j - Σ_i I_i V_i + Σ_i U_i V_i - Σ_i ∫_0^{U_i} g(u) du.        (5)

If (w_ij) is a symmetric matrix, then any stationary point of the energy F_g1 coincides with an equilibrium state of the continuous Hopfield neural network.

Proof. F_g1 can simply be derived² from Hopfield's original expression (1) using partial integration. Having V_i = g(U_i), we can write

    Σ_i ∫_0^{V_i} g⁻¹(v) dv = Σ_i ( [g⁻¹(v) v]_0^{V_i} - ∫_{g⁻¹(0)}^{U_i} v du ) = Σ_i U_i V_i - Σ_i ∫_0^{U_i} g(u) du + c,        (6)

where v = g(u) and c = -Σ_i ∫_{g⁻¹(0)}^{0} g(u) du is an unimportant constant which may be neglected³. Substitution of (6) in (1) yields (5). Furthermore, by resolving

    ∀i: ∂F_g1/∂U_i = 0  ∧  ∂F_g1/∂V_i = 0,        (7)

the set of equilibrium conditions (4) is immediately found. □

² In fact, F_g1 has a statistical mechanical background and was originally derived by means of a 'mean field' analysis of the binary stochastic Hopfield model [1].
³ It is not difficult to see that g(0) = 0 ⇒ c = 0.
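Theorem 2 can also be checked numerically. The following sketch (an illustration of ours; the damped fixed-point iteration and the tolerances are not taken from the paper) locates a point satisfying the equilibrium conditions (4) and verifies that the finite-difference gradient of F_g1 of (5) vanishes there, in both U and V.

```python
# Numerical check of Theorem 2: at a point satisfying (4), the gradient of
# F_g1 from equation (5) should vanish with respect to both U and V.
# Setup (sizes, beta, iteration scheme) is an illustrative assumption.
import numpy as np

rng = np.random.default_rng(1)
n, beta = 6, 1.5
A = rng.normal(size=(n, n))
W = (A + A.T) / 2
np.fill_diagonal(W, 0.0)
I = rng.normal(size=n)

g = lambda U: 1.0 / (1.0 + np.exp(-beta * U))          # sigmoid transfer function
int_g = lambda U: np.log1p(np.exp(beta * U)) / beta    # integral of g from 0 to U (up to a constant)

def F_g1(U, V):                                        # equation (5)
    return -0.5 * V @ W @ V - I @ V + U @ V - int_g(U).sum()

# Find a point satisfying (4) by a damped fixed-point iteration on U = W g(U) + I.
U = np.zeros(n)
for _ in range(2000):
    U = 0.9 * U + 0.1 * (W @ g(U) + I)
V = g(U)

eps, E = 1e-6, np.eye(n)
grad_U = np.array([(F_g1(U + eps * E[i], V) - F_g1(U - eps * E[i], V)) / (2 * eps) for i in range(n)])
grad_V = np.array([(F_g1(U, V + eps * E[i]) - F_g1(U, V - eps * E[i])) / (2 * eps) for i in range(n)])
print("max |dF_g1/dU|:", np.abs(grad_U).max(), "  max |dF_g1/dV|:", np.abs(grad_V).max())
```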

Theorem 3. If (w_ij) is a symmetric matrix, if F_g1 is bounded below, and if ∀i: V_i = g(U_i) is a differentiable and monotone increasing function, then F_g1 is a Lyapunov function for the motion equations (3).

Proof. Taking the time derivative of F_g1, we find

    Ḟ_g1 = Σ_i (∂F_g1/∂V_i) V̇_i + Σ_i (∂F_g1/∂U_i) U̇_i
          = Σ_i (-Σ_j w_ij V_j - I_i + U_i) V̇_i + Σ_i (V_i - g(U_i)) U̇_i
          = -Σ_i U̇_i V̇_i = -Σ_i (U̇_i)² dV_i/dU_i ≤ 0.        (8)

Since F_g1 is supposed to be bounded below (which generally is the case [1]), execution of the motion equations (3) constantly decreases F_g1 until ∀i: U̇_i = 0. □
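The chain of equalities in (8) can likewise be verified numerically. The sketch below (an illustration of ours, not from the paper) compares a finite-difference estimate of Ḟ_g1 along the motion equations (3) with the closed form -Σ_i (U̇_i)² dV_i/dU_i; both should be negative and approximately equal.

```python
# Rough numerical illustration of identity (8); all names and values are
# illustrative assumptions. V is kept equal to g(U) throughout.
import numpy as np

rng = np.random.default_rng(2)
n, beta, dt = 5, 1.0, 1e-3
A = rng.normal(size=(n, n))
W = (A + A.T) / 2
np.fill_diagonal(W, 0.0)
I = rng.normal(size=n)

g = lambda U: 1.0 / (1.0 + np.exp(-beta * U))
dg = lambda U: beta * g(U) * (1.0 - g(U))                    # dV_i/dU_i for the sigmoid
F = lambda U: (-0.5 * g(U) @ W @ g(U) - I @ g(U) + U @ g(U)
               - np.log1p(np.exp(beta * U)).sum() / beta)    # F_g1 with V = g(U)

U = rng.normal(size=n)
U_dot = W @ g(U) + I - U                                     # motion equations (3)
lhs = (F(U + dt * U_dot) - F(U)) / dt                        # finite-difference dF_g1/dt
rhs = -np.sum(U_dot ** 2 * dg(U))                            # right-hand side of (8)
print(lhs, rhs)                                              # both negative, approximately equal
```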

Theorem 4. If the matrix (w_ij) is symmetric and positive definite and F_g1 is bounded below, then F_g1, or alternatively, if the matrix (w_ij) is symmetric and negative definite and -F_g1 is bounded below, then -F_g1, is a Lyapunov function for the motion equations

    V̇_i = g(U_i) - V_i,  where  U_i = Σ_j w_ij V_j + I_i.        (9)

Proof. The proof again considers the time derivative of F_g1. Under (9), ∂F_g1/∂V_i = 0 and ∂F_g1/∂U_i = V_i - g(U_i) = -V̇_i. If (w_ij) is positive definite, then

    Ḟ_g1 = -Σ_i V̇_i U̇_i = -Σ_i V̇_i Σ_j (∂U_i/∂V_j) V̇_j = -Σ_i Σ_j V̇_i w_ij V̇_j ≤ 0.        (10)

If (w_ij) is negative definite, then -Ḟ_g1 ≤ 0. In both cases, updating in conformity with (9) decreases the corresponding Lyapunov function until, finally, ∀i: V̇_i = 0. □

We finish this section by observing that, in case the sigmoid of footnote 1 is taken as the transfer function, equation (5) reduces to

    F_g2(U, V) = -1/2 Σ_{i,j} w_ij V_i V_j - Σ_i I_i V_i + Σ_i U_i V_i - (1/β) Σ_i ln(1 + exp(βU_i)).        (11)
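To illustrate Theorem 4, the sketch below (again illustrative code with assumed parameter values, not taken from the paper) uses a symmetric positive definite weight matrix, integrates the output dynamics (9) with a forward Euler step, and checks that the energy, here in its sigmoid form F_g2 of (11) with U as defined in (9), never increases.

```python
# Illustrative check of Theorem 4: with a symmetric positive definite (w_ij),
# the energy decreases along the output dynamics (9). Concrete values are
# assumptions made for this sketch only.
import numpy as np

rng = np.random.default_rng(3)
n, beta, dt = 6, 1.0, 0.01
A = rng.normal(size=(n, n))
W = A @ A.T + 0.1 * np.eye(n)             # symmetric and positive definite
I = rng.normal(size=n)

g = lambda U: 1.0 / (1.0 + np.exp(-beta * U))

def F_g2(V):                              # equation (11), with U_i = sum_j w_ij V_j + I_i as in (9)
    U = W @ V + I
    return (-0.5 * V @ W @ V - I @ V + U @ V
            - np.log1p(np.exp(beta * U)).sum() / beta)

V = rng.uniform(0.1, 0.9, size=n)
values = []
for _ in range(4000):
    values.append(F_g2(V))
    V += dt * (g(W @ V + I) - V)          # motion equations (9)

print("F_g2 non-increasing:", all(a >= b - 1e-9 for a, b in zip(values, values[1:])))
```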

3: Building-in constraints

If the cost function E(V) is subject to certain constraints, one can try to incorporate these constraints into the neural network. A classical example relates to an attempt to solve the travelling salesman problem [8]. The transfer function chosen was

    V_i = g_i(U) = exp(βU_i) / Σ_l exp(βU_l)   (implying the constraint Σ_i V_i = 1).        (12)

By taking this transfer function, the original Hopfield model has been generalized to a model where V_i = g_i(U) is a function of U_1, U_2, ..., U_n and not of U_i alone. As in the previous section, a generalized energy expression analogous to (11) can be found using a statistical mechanical analysis [1, 3], where, again, 1/β plays the part of temperature:

Theorem 5. If (w_ij) is a symmetric matrix, then the energy of the continuous Hopfield network subject to the constraint mentioned in (12) can be stated as

    F_g3(U, V) = -1/2 Σ_{i,j} w_ij V_i V_j - Σ_i I_i V_i + Σ_i V_i U_i - (1/β) ln(Σ_i exp(βU_i)).        (13)

The stationary points of F_g3 are found at points of the state space for which

    ∀i: V_i = exp(βU_i) / Σ_l exp(βU_l)  ∧  U_i = Σ_j w_ij V_j + I_i.        (14)

Proof. The derivation of expression (13) is omitted here, but can be found in [1]. Resolution of the equations ∂F_g3/∂U_i = 0, ∂F_g3/∂V_i = 0 yields the equations (14) as solutions. □

From the character of V_i = g_i(U) it follows that theorem 1 does not hold here. This raises the question under which conditions the constrained Hopfield model converges.

Theorem 6. If (w_ij) is a symmetric matrix, if F_g3 is bounded below, if (12) is used as the transfer function, and if, during updating, the Jacobian matrix J_g = (∂V_i/∂U_j) first becomes and then remains positive definite, then the energy F_g3 is a Lyapunov function for the motion equations (3).

Proof. Assuming that the conditions of the theorem hold, we may say that in the long run

    Ḟ_g3 = Σ_i (∂F_g3/∂V_i) V̇_i + Σ_i (∂F_g3/∂U_i) U̇_i
          = Σ_i (-Σ_j w_ij V_j - I_i + U_i) V̇_i + Σ_i (V_i - exp(βU_i)/Σ_l exp(βU_l)) U̇_i
          = -Σ_i U̇_i V̇_i = -Σ_i Σ_j U̇_i (∂V_i/∂U_j) U̇_j = -U̇ᵀ J_g U̇ ≤ 0.        (15)

Since F_g3 is bounded below, its value decreases constantly until ∀i: U̇_i = 0. □
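To get a feeling for the constrained model of Theorem 6, the following sketch (our own construction; the random problem data carry no special meaning) integrates the motion equations (3) with the transfer function (12) and monitors F_g3 of (13) along the trajectory, reporting the largest single-step change upward that occurs.

```python
# Illustrative run of the constrained network of section 3: motion equations (3)
# combined with the normalizing transfer function (12), monitoring F_g3 of (13).
# All concrete values are assumptions made for this sketch.
import numpy as np

rng = np.random.default_rng(4)
n, beta, dt = 10, 1.0, 0.01
A = rng.normal(size=(n, n))
W = (A + A.T) / 2                              # symmetric weights
np.fill_diagonal(W, 0.0)
I = rng.normal(size=n)

def g(U):                                      # transfer function (12)
    z = np.exp(beta * (U - U.max()))
    return z / z.sum()

def F_g3(U, V):                                # equation (13)
    return (-0.5 * V @ W @ V - I @ V + V @ U
            - np.log(np.exp(beta * U).sum()) / beta)

U = rng.normal(size=n)
values = []
for _ in range(5000):
    V = g(U)
    values.append(F_g3(U, V))
    U += dt * (W @ V + I - U)                  # motion equations (3)

V = g(U)
print("sum of outputs:", V.sum())                              # constraint of (12)
print("residual of (14):", np.max(np.abs(W @ V + I - U)))      # equilibrium conditions
print("largest single-step increase of F_g3:",
      max(b - a for a, b in zip(values, values[1:])))          # ideally <= 0, up to discretization error
```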

Whether the general condition holds that the matrix J_g will become and remain positive definite is not easy to say. Applying lemma 1 from the appendix, the symmetric matrix J_g is given by

    J_g = β ( V_1(1-V_1)   -V_1 V_2     ...   -V_1 V_n
              -V_2 V_1     V_2(1-V_2)   ...   -V_2 V_n
                 ...           ...                ...
              -V_n V_1     -V_n V_2     ...   V_n(1-V_n) ).        (16)

Thus, all diagonal elements of J_g are positive, while all non-diagonal elements are negative. Knowing Σ_i V_i = 1, we argue that for large n, in general, ∀i, ∀j, ∀k: V_i V_j ≈ 0.

Appendix

Lemma 1. If (12) is used as the transfer function, then

    ∂V_i/∂U_i = β V_i (1 - V_i) > 0   and   ∂V_i/∂U_l = -β V_i V_l < 0  (l ≠ i).

Proof. The proof of the lemma is straightforward. It can be found in [1]. □
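Lemma 1 and the sign pattern of (16) are easy to reproduce numerically. The sketch below (illustrative; the test vector and β are arbitrary) builds J_g = β(diag(V) - V Vᵀ) for an output vector of the transfer function (12) and compares it with a finite-difference Jacobian.

```python
# Numerical check of lemma 1 and matrix (16) for the transfer function (12):
# the Jacobian (dV_i/dU_j) equals beta * (diag(V) - V V^T), with positive
# diagonal and negative off-diagonal entries. Test values are arbitrary.
import numpy as np

beta = 1.3
U = np.array([0.5, -0.2, 1.1, 0.0, -0.7])

def g(U):
    z = np.exp(beta * (U - U.max()))
    return z / z.sum()

V = g(U)
J_analytic = beta * (np.diag(V) - np.outer(V, V))          # matrix (16)

eps = 1e-6
J_numeric = np.array([(g(U + eps * e) - g(U - eps * e)) / (2 * eps)
                      for e in np.eye(len(U))]).T          # entry (i, j) = dV_i/dU_j
off_diag = J_analytic[~np.eye(len(U), dtype=bool)]
print("max deviation:", np.abs(J_analytic - J_numeric).max())
print("diagonal > 0:", (np.diag(J_analytic) > 0).all(), "  off-diagonal < 0:", (off_diag < 0).all())
```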

References

[1] J. van den Berg. Neural Relaxation Dynamics: Mathematics and Physics of Recurrent Neural Networks with Applications in the Field of Combinatorial Optimization. PhD thesis, Erasmus University Rotterdam, 1996.
[2] J. van den Berg and J.C. Bioch. Constrained Optimization with the Hopfield-Lagrange Model. Proceedings of the 14th IMACS World Congress, 470-473, Atlanta, GA, USA, 1994.
[3] J. van den Berg and J.C. Bioch. On the (Free) Energy of Hopfield Networks. In Neural Networks: The Statistical Mechanics Perspective, Proceedings of the CTP-PBSRI Joint Workshop on Theoretical Physics, eds. J.H. Oh, C. Kwon, S. Cho, 233-244, World Scientific, 1995.
[4] J. Hertz, A. Krogh, and R.G. Palmer. Introduction to the Theory of Neural Computation. Addison-Wesley, 1991.
[5] J.J. Hopfield. Neurons with Graded Response Have Collective Computational Properties Like Those of Two-State Neurons. Proceedings of the National Academy of Sciences, USA, 81, 3088-3092, 1984.
[6] M. Muneyasu, K. Hotta and T. Hinamoto. Image Restoration by Hopfield Networks Considering the Line Process. Proceedings of the IEEE International Conference on Neural Networks, 1703-1707, Perth, Western Australia, 1995.
[7] Y.-K. Park and G. Lee. Applications of Neural Networks in High-Speed Communication Networks. IEEE Communications Magazine, 33(10), 68-74, 1995.
[8] C. Peterson and B. Söderberg. A New Method for Mapping Optimization Problems onto Neural Networks. International Journal of Neural Systems, 1, 3-22, 1989.
[9] J. Starke, N. Kubota, and T. Fukuda. Combinatorial Optimization with Higher Order Neural Networks: Cost Oriented Competing Processes in Flexible Manufacturing Systems. Proceedings of the IEEE International Conference on Neural Networks, 1855-1860, Perth, Western Australia, 1995.
