ReSuMe - Proof of convergence

Filip Ponulak
Institute of Control and Information Engineering
Poznań University of Technology
Piotrowo 3a, Poznań, Poland
[email protected]
http://d1.cie.put.poznan.pl/~fp

October 12, 2006

Abstract

We consider learning convergence in Spiking Neural Networks trained according to the ReSuMe learning rule. We begin with a short introduction of the ReSuMe learning rule and define the learning scenarios to be considered. Next, we present a formal proof of learning convergence for the Spike Response Model (SRM) trained according to the defined scenarios. Finally, we demonstrate that the proof made for the SRM holds for a wide class of spiking neurons.

Keywords: Spiking Neural Networks, Supervised Learning, Proof of Convergence

1 Introduction

In this report we examine the convergence of the learning process performed according to the ReSuMe learning rule. ReSuMe (Remote Supervised Method) [1, 2] is a novel algorithm for the precise learning of spatio-temporal patterns of spikes in Spiking Neural Networks (SNN) [3, 4, 5]. The method corresponds to the Widrow-Hoff rule, but takes advantage of plasticity mechanisms similar to spike-based Hebbian processes, such as Spike-Timing Dependent Plasticity (STDP) [1, 5]. It has been demonstrated that ReSuMe enables efficient learning of complex spatio-temporal spike patterns with a given accuracy and that the method makes it possible to impose desired input/output properties on the trained networks [2, 6]. In addition, it has been shown that ReSuMe can be successfully applied to various models of spiking neurons, from simple Leaky-Integrate-and-Fire (LIF) neurons to complex, biologically realistic models. In [6] we demonstrated the generalization properties of spiking neurons trained with ReSuMe. It was also shown that SNN can be trained with ReSuMe to perform function approximation tasks. ReSuMe has proved suitable for practical applications such as modelling, movement generation and control [7, 8, 9].

In this paper we focus on the convergence of the ReSuMe learning process. First, we introduce the ReSuMe learning rule and define the learning scenarios to be considered. Then, we present a formal proof of learning convergence for the Spike Response Model (SRM) trained according to the defined scenarios. Finally, we demonstrate that the proof made for the SRM still holds for a wide class of spiking neurons.

2 Learning rule

The ReSuMe learning rule is defined by the following equation:

\frac{d}{dt} w_{ki}(t) = \left[ S^d(t) - S^o(t) \right] \left[ a + \int_0^{\infty} W(s)\, S^{in}(t-s)\, ds \right],    (1)

where S^d(t), S^{in}(t) and S^o(t) are the desired, pre- and postsynaptic spike trains, respectively. According to [5], a spike train of a neuron m is defined as a sequence of impulses triggered at the firing times:

S_m(t) = \sum_f \delta(t - t_m^f),    (2)

where δ(x) is an impulse function; δ(x) = 1 for x = 0 and δ(x) = 0 elsewhere.

The (real-valued) constant a determines the amplitude of the non-Hebbian term. For excitatory synapses a > 0, whereas for inhibitory synapses a < 0. The role of the non-Hebbian factor in ReSuMe is to adjust the average strength of the synaptic inputs so as to impose on a neuron the desired level of activity (desired mean firing rate).

The integral in Eq.(1) represents the Hebbian contribution to the weight change. The factor W(s) is known as a learning window [10, 11]. The learning window W(s) is defined as:

W(s) = \begin{cases} A_+ \exp\left( \dfrac{-s}{\tau_+} \right) & \text{if } s > 0, \\ 0 & \text{if } s \leq 0, \end{cases}    (3)


with A_+ and τ_+ being the amplitude and the time constant of the learning window, respectively. For excitatory synapses A_+ > 0, while for inhibitory synapses A_+ < 0; in both cases τ_+ > 0. The parameter s is the time delay between the correlated spikes. In ReSuMe we apply the learning window concept to correlate the presynaptic spikes occurring at the firing times t^{in,f} with the postsynaptic spikes at t^{o,f}, as well as with the spikes in S^d(t) occurring at t^{d,f}. In these cases s = (t^{o,f} - t^{in,f}) or s = (t^{d,f} - t^{in,f}), respectively.
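
As an illustration of Eqs.(1)-(3), the following Python sketch accumulates the total weight change obtained by integrating Eq.(1) over time for given lists of presynaptic, postsynaptic and desired spike times. This is only a minimal reading of the rule, not code from the original study; the helper names (learning_window, resume_dw) and all parameter values are chosen here for illustration.

```python
import numpy as np

def learning_window(s, A_plus=1.0, tau_plus=5.0):
    """Exponential learning window W(s) of Eq.(3): nonzero only for s > 0."""
    return A_plus * np.exp(-s / tau_plus) if s > 0 else 0.0

def resume_dw(pre_spikes, post_spikes, desired_spikes,
              a=0.05, A_plus=1.0, tau_plus=5.0):
    """Total weight change from integrating Eq.(1) over time.

    Each desired spike contributes +a plus the Hebbian term summed over
    earlier presynaptic spikes; each actual postsynaptic spike contributes
    the same terms with a negative sign.
    """
    dw = 0.0
    for t_d in desired_spikes:                       # S^d(t) term
        dw += a + sum(learning_window(t_d - t_in, A_plus, tau_plus)
                      for t_in in pre_spikes)
    for t_o in post_spikes:                          # -S^o(t) term
        dw -= a + sum(learning_window(t_o - t_in, A_plus, tau_plus)
                      for t_in in pre_spikes)
    return dw

# Example: one presynaptic spike, one (too late) output spike, one desired spike.
print(resume_dw(pre_spikes=[0.0], post_spikes=[12.0], desired_spikes=[7.0]))
```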

3 Convergence Proof

Consider a single neuron n^o with a single synaptic input characterized by a modifiable parameter w. The neuron is excited with a single presynaptic impulse at t^{in}. It generates an action potential at t^o; however, it is expected to fire at t^d (Fig.1.A). We assume that t^{in} < t^d, t^o. The synaptic weight w is modified according to the ReSuMe learning rule defined by Eq.(1). We denote the defined input/output conditions as scenario (S1).

In this simple system we are interested in the total weight change Δw after the occurrence of all three spikes (we assume that Δw is measured at t → ∞). We find Δw by integrating the expression (1) over time. Since we operate on single spikes, we substitute S^d(t), S^o(t) and S^{in}(t) with the impulse functions δ(t - t^d), δ(t - t^o), δ(t - t^{in}), respectively. After performing the necessary transformations we obtain:

\Delta w = A_+ \exp\left( \frac{-t^d + t^{in}}{\tau_+} \right) - A_+ \exp\left( \frac{-t^o + t^{in}}{\tau_+} \right),    (4)

where A_+ > 0 is the amplitude and τ_+ > 0 is the time constant of the learning window. The learning rule defined by Eq.(4) is general enough to deal also with other scenarios:

• Scenario (S2): the neuron n^o fires at some time t^o although it is expected not to fire (in this case t^d may be assumed to approach infinity),

• Scenario (S3): the neuron n^o does not fire (t^o → ∞) although it is expected to generate a spike at t^d.

In the case of (S2) the learning equation (4) takes the form:

\Delta w = -A_+ \exp\left( \frac{-t^o + t^{in}}{\tau_+} \right).    (5)

Figure 1: (A) A neuron n^o is excited with a single presynaptic impulse at t^{in} through a synaptic connection characterized by the efficacy w. The neuron is expected to fire at t^d. Its firing time t^o(w) is determined by the parameter w, which is modified according to the ReSuMe rules. (B) If n^o is represented by the Spike Response Model, then the EPSP has its maximum at t^{in} + τ and the firing of n^o is assumed to occur immediately after the membrane potential passes the threshold ϑ.

Since the right-hand side of Eq.(5) is always negative for bounded values of t^o and t^{in}, by repeating the learning process we steadily decrease the synaptic efficacy w until it passes the threshold value below which the neuron stops firing, as expected.

In the case of scenario (S3) the synaptic efficacy w is modified according to:

\Delta w = +A_+ \exp\left( \frac{-t^d + t^{in}}{\tau_+} \right).    (6)

In this case the right-hand side of Eq.(6) is always positive and the efficacy w will be increased until it passes the threshold value for which the neuron n^o begins to fire; the learning can then proceed further according to Eq.(4).

This discussion demonstrates that all scenarios (S1), (S2) and (S3) can be accounted for in our further considerations simply by applying the same learning rule given by Eq.(4).
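
A quick numerical check of the sign of the weight change in the three scenarios follows directly from Eq.(4); the sketch below is purely illustrative, and the spike times and parameters are arbitrary.

```python
import math

def delta_w(t_in, t_d, t_o, A_plus=1.0, tau_plus=5.0):
    """Weight change from Eq.(4); t_d or t_o may be math.inf to reproduce
    scenarios (S3) and (S2), since the corresponding term then vanishes."""
    term_d = A_plus * math.exp((-t_d + t_in) / tau_plus) if math.isfinite(t_d) else 0.0
    term_o = A_plus * math.exp((-t_o + t_in) / tau_plus) if math.isfinite(t_o) else 0.0
    return term_d - term_o

print(delta_w(0.0, 5.0, 10.0))        # (S1): output too late  -> dw > 0
print(delta_w(0.0, math.inf, 10.0))   # (S2): unwanted spike   -> dw < 0, cf. Eq.(5)
print(delta_w(0.0, 5.0, math.inf))    # (S3): missing spike    -> dw > 0, cf. Eq.(6)
```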

In the following we formally prove that by repeating the learning process with the learning rule given by Eq.(4), the synaptic weight w approaches an optimal value w~ which ensures the firing of the neuron n^o at t^o = t^d.

Theorem 1. Given a neuron n^o with a synaptic input w, let [w_min, w_max] ⊆ R+ be the range of possible values of w. Let the learning rate A_+ and the constant τ_+ be arbitrary positive, real numbers. If the neuron n^o is excited at time t^{in} in each learning epoch, n^o is required to fire at t^d, and w is updated according to Eq.(4), then, for any initial weight from [w_min, w_max], the sequence of weight values converges to w~ = θτ (t^d - t^{in})^{-1} exp(-1 + (t^d - t^{in})τ^{-1}), with positive, real constants θ and τ. The value w~ ensures the firing of the neuron n^o at t^o = t^d.

We begin by proving the learning convergence for the SRM neuron model. Next, we generalize the obtained results to other neural models. Hence, we can demonstrate that the convergence of the learning process may be ensured for diverse neuron models and that the learning method is neuron-model independent.

Proof. Consider the SRM model with the EPSP described by:

\epsilon(t - t^{in}) = w \, \frac{t - t^{in}}{\tau} \exp\left( 1 - \frac{t - t^{in}}{\tau} \right).    (7)

This equation describes the EPSP for t > t^{in}, while for t < t^{in}: ε(t - t^{in}) ≡ 0 [5]. The symbol τ > 0 represents the membrane time constant and w is the synaptic weight. From Eq.(7) we can compute the firing time t^o by assuming that max(ε(t)) = θ = (ϑ - E_m). Here θ (> 0) is the depolarization required to cross the membrane threshold ϑ, starting from the resting potential E_m. For the EPSP defined by Eq.(7) it is clear (cf. Fig.1.B) that the EPSP has its maximum at the time (t^{in} + τ). The threshold is always reached on the rising slope of the EPSP, hence the desired firing time t^d must occur within the interval [t^{in}, t^{in} + τ]. In order to get max(ε(t)) > θ it is necessary that w ∈ [w_min, w_max] with w_min > θ. By substituting t = t^o and ε(t - t^{in}) = θ we can solve Eq.(7) for t^o:

t^o = -\tau L\left( \frac{-\theta e^{-1}}{w} \right) + t^{in},    (8)

where L denotes the Lambert function [12] and e^{-1} = exp(-1). The Lambert function L(-θe^{-1}w^{-1}) maps [w_min, ∞] into [-1, 0] (cf. Fig.2.A).
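
As a numerical sanity check of Eqs.(7) and (8) (an illustrative sketch only, not part of the proof), the closed-form firing time can be compared with the time at which the EPSP of Eq.(7) first crosses θ; the principal branch of the Lambert function is available as scipy.special.lambertw. The parameter values below are taken from Fig.2 where possible, otherwise they are arbitrary.

```python
import numpy as np
from scipy.special import lambertw

theta, tau, t_in = 15.0, 1.0, 0.0   # mV, ms, ms (theta and tau as in Fig. 2)

def epsp(t, w):
    """SRM EPSP of Eq.(7); zero before the presynaptic spike."""
    s = t - t_in
    return np.where(s > 0, w * (s / tau) * np.exp(1.0 - s / tau), 0.0)

def firing_time(w):
    """Closed-form firing time of Eq.(8), principal branch of Lambert W."""
    return t_in - tau * np.real(lambertw(-theta * np.exp(-1.0) / w))

w = 16.0                                        # must exceed theta to reach threshold
t = np.linspace(0.0, 3.0, 300001)
t_num = t[np.argmax(epsp(t, w) >= theta)]       # first threshold crossing on a fine grid
print(firing_time(w), t_num)                    # the two values should agree closely
```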


Figure 2: Graphical illustration of the functions: (A) the Lambert function L(-θe^{-1}/w), (B) the derivative of L(-θe^{-1}/w) with respect to w, (C) the function β(w) (cf. Eq.(13)), (D) the derivative dβ(w)/dw. All graphs are plotted against the synaptic weight w normalized with respect to the depolarization potential θ. Constant parameters: θ = 15 mV, τ_+ = 0.2 ms, τ = 1 ms.

We find the optimal weight w~ required to obtain the firing of the neuron n^o exactly at t^d. The value of w~ is computed from Eq.(7) by substituting t with t^d:

\tilde{w} = \frac{\theta \tau}{(t^d - t^{in})} \exp\left( -1 + \frac{t^d - t^{in}}{\tau} \right).    (9)

With w~ we can express t^d as:

t^d = -\tau L\left( \frac{-\theta e^{-1}}{\tilde{w}} \right) + t^{in}.    (10)

Next, we rewrite Eq.(4) using Eqs.(8) and (10):

\Delta w = A_+ \exp\left( \frac{\tau}{\tau_+} L\left( \frac{-\theta e^{-1}}{\tilde{w}} \right) \right) - A_+ \exp\left( \frac{\tau}{\tau_+} L\left( \frac{-\theta e^{-1}}{w} \right) \right).    (11)

Thus we can describe the progression of w by an iterative map f: \mathbb{R} \to \mathbb{R} defined as:

f(w) = w + \Delta w = w + A_+ \beta(\tilde{w}) - A_+ \beta(w),    (12)

with:

\beta(w) = \exp\left( \frac{\tau}{\tau_+} L\left( \frac{-\theta e^{-1}}{w} \right) \right).    (13)
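
The fixed-point argument developed below can also be observed numerically. The following sketch iterates the map f of Eq.(12) with the θ, τ and τ_+ values of Fig.2; the learning rate, desired firing time and initial weight are arbitrary choices (the learning rate is assumed small enough to respect the bound derived later in Eq.(16)).

```python
import numpy as np
from scipy.special import lambertw

theta, tau, tau_plus = 15.0, 1.0, 0.2     # mV, ms, ms (as in Fig. 2)
t_in, t_d = 0.0, 0.6                      # presynaptic and desired firing times (ms)
A_plus = 5.0                              # learning rate (assumed to satisfy A_+ < A_+^max)

def beta(w):
    """beta(w) of Eq.(13), using the principal branch of the Lambert function."""
    return np.exp((tau / tau_plus) * np.real(lambertw(-theta * np.exp(-1.0) / w)))

# Optimal weight from Eq.(9).
w_tilde = theta * tau / (t_d - t_in) * np.exp(-1.0 + (t_d - t_in) / tau)

w = 30.0                                  # arbitrary initial weight with w > theta
for epoch in range(500):
    w = w + A_plus * beta(w_tilde) - A_plus * beta(w)   # iterative map f of Eq.(12)

print(w, w_tilde)                         # the iterated weight approaches the optimum of Eq.(9)
```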

Remark 1: We observe that the function β(w) is continuous, monotonically increasing and bounded for all w, w~ ∈ [w_min, w_max] (cf. Fig.2.C).

Now we prove Theorem 1 according to fixed-point theory [13]. For this we need to demonstrate that the following conditions are satisfied: f(w) has a single, positive fixed point w~ given by Eq.(9); f(w) maps [w_min, w_max] into itself; and f(w) is contractive in [w_min, w_max].

We prove that Eq.(9) describes the fixed point of f(w) by showing that f(w~) = w~. According to Eq.(12) we have: f(w~) = w~ + A_+β(w~) - A_+β(w~) = w~.

Self-mapping of f can be proved by demonstrating that: (a) f(w_max) ≤ w_max, (b) f(w_min) ≥ w_min, (c) f'(w) ≥ 0 for all w ∈ [w_min, w_max], where f'(w) denotes the derivative of f(w) with respect to w. These conditions also ensure that w~ is a single fixed point. According to Eq.(12), condition (a) is equivalent to:

w_{max} + A_+ \beta(\tilde{w}) - A_+ \beta(w_{max}) \leq w_{max}.    (14)

Inequality (14) is true if and only if β(w~) ≤ β(w_max). According to Remark 1, w~ ≤ w_max implies β(w~) ≤ β(w_max). This proves (a). Analogously, w~ ≥ w_min implies β(w~) ≥ β(w_min). This proves (b).

Next, consider condition (c). According to Eq.(12): f'(w) = 1 - A_+β'(w), where β'(w) (cf. Fig.2.D) is given by:

\beta'(w) = \frac{\tau}{w\,\tau_+} \cdot \frac{-L(-\theta e^{-1}/w)}{1 + L(-\theta e^{-1}/w)} \cdot \exp\left( \frac{\tau}{\tau_+} L\left( \frac{-\theta e^{-1}}{w} \right) \right).    (15)

Since L(-θe^{-1}/w) ∈ [-1, 0) for all w ∈ [w_min, w_max], all factors on the right-hand side of Eq.(15) are positive and the derivative β'(w) > 0 for all w ∈ [w_min, w_max] (this also follows directly from Remark 1).

Now we consider the condition f'(w) ≥ 0, which is equivalent to A_+β'(w) ≤ 1. This condition is always satisfied for A_+ ≤ β'(w*)^{-1}, where w* = arg max_w β'(w). The function β(w) is continuous and bounded in [w_min, w_max], which implies that β'(w) is also bounded and that w* exists. According to these considerations, the convergence is guaranteed for the learning rate A_+ < A_+^{max}, where:

A_+^{max} = \left[ \beta'(w^*) \right]^{-1} = \frac{w^* \tau_+}{\tau} \cdot \frac{1 + L(-\theta e^{-1}/w^*)}{-L(-\theta e^{-1}/w^*)} \cdot \exp\left( -\frac{\tau}{\tau_+} L\left( \frac{-\theta e^{-1}}{w^*} \right) \right).    (16)
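
For concreteness, the bound of Eq.(16) can also be evaluated numerically by maximizing β'(w) over an assumed admissible weight range; the range and parameter values in the following sketch are illustrative only.

```python
import numpy as np
from scipy.special import lambertw

theta, tau, tau_plus = 15.0, 1.0, 0.2           # mV, ms, ms (as in Fig. 2)
w_min, w_max = 1.01 * theta, 2.0 * theta        # assumed admissible weight range

def beta_prime(w):
    """Derivative beta'(w) of Eq.(15)."""
    L = np.real(lambertw(-theta * np.exp(-1.0) / w))
    return (tau / (w * tau_plus)) * (-L / (1.0 + L)) * np.exp((tau / tau_plus) * L)

w_grid = np.linspace(w_min, w_max, 10000)
w_star = w_grid[np.argmax(beta_prime(w_grid))]  # weight maximizing beta'(w)
A_max = 1.0 / beta_prime(w_star)                # learning-rate bound of Eq.(16)
print(w_star, A_max)
```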

Finally, we prove that f is contractive in [w_min, w_max], i.e. |f'(w)| ≤ \mathcal{L} for all w ∈ [w_min, w_max], with a Lipschitz constant \mathcal{L} < 1 (we write \mathcal{L} to distinguish it from the Lambert function L). The condition |f'(w)| ≤ \mathcal{L} < 1 is equivalent to:

A_+ \beta'(w) > 0 \quad \text{and} \quad A_+ \beta'(w) < 2.    (17)

The learning rate A_+ is positive by definition and we have already shown that β'(w) > 0 and A_+β'(w) ≤ 1 for all w ∈ [w_min, w_max], where w_min > θ. These conditions ensure that the inequalities (17) hold for all w ∈ [w_min, w_max]. This ends the proof of Theorem 1.

Next, we generalize the proof made for the SRM model to a wider class of neural models. First, note that we were able to satisfy all conditions in the proof of Theorem 1 simply on the basis of the observation that β(w) (Eq.(13)) is a continuous, monotonically increasing and bounded function for all w ∈ [w_min, w_max] (cf. Remark 1). According to the definition of the learning window W(s) and according to equations (12) and (4), we can rewrite β(w) as:

\beta(w) = \frac{1}{A_+} W(s(w)),    (18)

where s(w) expresses the delay between the input spike at t^{in} and the resulting output firing at t^o as a function of the synaptic weight w. From Eq.(18) we see that Remark 1 holds for all neuron models and all learning windows for which both of the following conditions are satisfied:

a) s(w) is continuous and monotonically decreasing for all w ∈ [w_min, w_max],

b) W(s) is a continuous, monotonically decreasing and bounded function of s(w), with w ∈ [w_min, w_max].


Figure 3: Comparison of the function s = s(w) for different spiking neuron models. Schematic illustration for: (A) Spike Response Model, (B) Leaky-Integrate-and-Fire model, (C) Quadratic LIF model, (D) Hodgkin-Huxley model.

For the SRM model, the function s(w) can easily be evaluated from Eq.(8). This is illustrated in Fig.3.A. For other spiking neuron models, such as the LIF, Quadratic-LIF or HH models, it is difficult to express s = s(w) explicitly. For these models we present the relation s(w) obtained from simulations (Fig.3.B, C, D, respectively). In a set of simple experiments, the particular neuron models were excited via a single synapse with a modified efficacy w, and the delays s = (t^o - t^{in}) were recorded for the particular values of w. A comparison of the obtained results (Fig.3) demonstrates that the function s = s(w) is similar for all these neuron models and in each case satisfies condition (a). According to Remark 1 and Eq.(18), this is sufficient to confirm that the proof of convergence holds for these models. A minimal sketch of such an experiment for the LIF model is given below.
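
The sketch below illustrates, for an assumed set of LIF parameters (not taken from the report), how such a measurement of s(w) can be carried out: a current-based LIF neuron receives a single exponential synaptic current of weight w, and the time of the first threshold crossing is recorded for several weights.

```python
import numpy as np

def lif_delay(w, tau_m=10.0, tau_s=2.0, v_th=1.0, dt=0.01, t_max=50.0):
    """Delay s = t^o - t^in for a current-based LIF neuron driven by a single
    exponential synaptic current of weight w injected at t^in = 0.
    Returns np.inf if the threshold is never reached."""
    v, i_syn = 0.0, w
    for step in range(int(t_max / dt)):
        v += dt * (-v + i_syn) / tau_m          # leaky integration of the input current
        i_syn += dt * (-i_syn) / tau_s          # exponentially decaying synaptic current
        if v >= v_th:
            return step * dt                    # time of the first threshold crossing
    return np.inf

# s(w) decreases monotonically with the synaptic weight, as required by condition (a).
for w in (8.0, 10.0, 14.0, 20.0):
    print(w, lif_delay(w))
```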

References

[1] Filip Ponulak. ReSuMe - new supervised learning method for Spiking Neural Networks. Technical Report, Institute of Control and Information Engineering, Poznań University of Technology, 2005. Available at http://d1.cie.put.poznan.pl/~fp/.

[2] Andrzej Kasiński and Filip Ponulak. Experimental Demonstration of Learning Properties of a New Supervised Learning Method for the Spiking Neural Networks. In Proceedings of the 15th International Conference on Artificial Neural Networks: Biological Inspirations, volume 3696 of Lecture Notes in Computer Science, pages 145–153, 2005.

[3] Andrzej Kasiński and Filip Ponulak. Comparison of Supervised Learning Methods for Spike Time Coding in Spiking Neural Networks. Int. J. of Applied Mathematics and Computer Science, 16(1):101–113, 2006.

[4] Wolfgang Maass. Networks of spiking neurons: The third generation of neural network models. Neural Networks, 10(9):1659–1671, 1997.

[5] Wulfram Gerstner and Werner Kistler. Spiking Neuron Models. Single Neurons, Populations, Plasticity. Cambridge University Press, Cambridge, 2002.

[6] Filip Ponulak and Andrzej Kasiński. Generalization Properties of SNN Trained with ReSuMe. In Proc. of European Symposium on Artificial Neural Networks, ESANN'2006, pages 623–629, 2006.

[7] Filip Ponulak and Andrzej Kasiński. A novel approach towards movement control with Spiking Neural Networks. In Proc. of 3rd International Symposium on Adaptive Motion in Animals and Machines, Ilmenau, 2005. (Abstract).

[8] Filip Ponulak and Andrzej Kasiński. ReSuMe learning method for Spiking Neural Networks dedicated to neuroprostheses control. In Proc. of EPFL LATSIS Symposium 2006, Dynamical principles for neuroscience and intelligent biomimetic devices, pages 119–120, 2006.

[9] Filip Ponulak, Dominik Belter, and Andrzej Kasiński. Adaptive Central Pattern Generator based on Spiking Neural Networks. In Proc. of EPFL LATSIS Symposium 2006, Dynamical principles for neuroscience and intelligent biomimetic devices, pages 121–122, 2006.

[10] Henry Markram, J. Luebke, M. Frotscher, and B. Sakmann. Regulation of synaptic efficacy by coincidence of postsynaptic APs and EPSPs. Science, 275:213–215, 1997.

[11] Guo-Qiang Bi and Mu-Ming Poo. Synaptic Modifications in Cultured Hippocampal Neurons: Dependence on Spike Timing, Synaptic Strength, and Postsynaptic Cell Type. The Journal of Neuroscience, 18(24):10464–10472, 1998.

[12] Eric W. Weisstein. Lambert W-Function. From MathWorld - A Wolfram Web Resource. http://mathworld.wolfram.com/LambertW-Function.html.

[13] E.K. Blum. Numerical Analysis and Computation: Theory and Practice. Addison-Wesley, Reading, Mass., 1972.
