A Nonlinear Autoregressive Volterra Model of the Hodgkin-Huxley Equations

Steffen E. Eikenberry · Vasilis Z. Marmarelis

S. Eikenberry
Department of Biomedical Engineering, University of Southern California, 1042 Downey Way, Los Angeles, CA 90089, USA
E-mail: [email protected]

V. Z. Marmarelis
Department of Biomedical Engineering, University of Southern California, 1042 Downey Way, Los Angeles, CA 90089, USA
E-mail: [email protected]

Abstract We propose a new variant of the Volterra-type model with a nonlinear auto-regressive (NAR) component that is a suitable framework for describing the process of AP generation by the neuron membrane potential, and we apply it to input-output data generated by the Hodgkin-Huxley (H-H) equations. Volterra models use a functional series expansion to describe the input-output relation for most nonlinear dynamic systems, and are applicable to a wide range of physiologic systems. It is difficult, however, to apply the Volterra methodology to the H-H model because it is characterized by distinct subthreshold and suprathreshold dynamics. When threshold is crossed, an autonomous action potential (AP) is generated, implying an effective decoupling of the output from the input, and thus the standard Volterra model fails. Therefore, in our framework, whenever membrane potential exceeds some threshold, it is taken as a second input to a dual-input Volterra model. This model correctly predicts membrane voltage deflection both within the subthreshold region and during APs. Moreover, the model naturally generates a post-AP afterpotential and refractory period. It is known that the H-H model converges to a limit cycle in response to a constant current injection. This behavior is correctly predicted by the proposed model, while the standard Volterra model is incapable of generating such limit cycle behavior. The inclusion of cross-kernels, which describe the nonlinear interactions between the exogenous and autoregressive inputs, is found to be absolutely necessary. The proposed model is general, nonparametric, and data-derived.

Keywords Volterra kernels · Nonlinear modeling · Neuronal modeling · Laguerre expansions · Autoregressive model

1 Introduction

Over the past century, numerous mathematical models for single-neuron firing dynamics have been proposed, ranging from simple parametric integrate-and-fire models to highly detailed biophysical models. Hodgkin and Huxley (1952) proposed a four-equation model describing the axonal membrane potential in response to injected current and leading to the generation of an action potential (AP). This effort was justly rewarded with a Nobel prize, and the Hodgkin-Huxley (H-H) model remains the canonical model for AP generation by a firing neuron. The purpose of this paper is to introduce a new method to model AP generation by a single neuron, based on a Volterra-type model that incorporates a nonlinear auto-regressive (NAR) component. We use data simulated by the H-H model as ground truth for developing and testing our proposed model.

The Volterra series, introduced by Vito Volterra in his 1930 monograph (Volterra, 1930), is a powerful and general method for describing a continuous functional. In general, some functional, F[·], maps an "input" function, x(t′), t′ ≤ t, onto an output scalar, y(t):

y(t) = F[x(t′), t′ ≤ t]   (1)
Volterra proposed that any continuous, finite-memory functional may be represented by a functional power series expansion (Volterra series) containing kernel functions:

y(t) = k_0 + \int_0^\infty k_1(\tau)\, x(t-\tau)\, d\tau + \int_0^\infty \int_0^\infty k_2(\tau_1, \tau_2)\, x(t-\tau_1)\, x(t-\tau_2)\, d\tau_1 d\tau_2 + \ldots   (2)
This is essentially a generalization of the Taylor series expansion: where a Taylor series takes a point as input to a function and gives an output point, the Volterra series takes a function as input to a functional and gives an output point ("the Taylor series of a line"). The system identification task consists of estimating the Volterra kernels, ki. This can be done efficiently by expanding the kernels on a basis of functions (Marmarelis, 1993). In this work, we expand the kernels on a Laguerre basis (the "Laguerre expansion technique"), which allows kernels to be estimated from any input-output record that is sufficiently broadband using the method of ordinary least squares (OLS), as discussed further in Section 2.2.

The Volterra series represents a general method for describing physiologic input-output dynamic relationships, and has been successfully applied to a number of systems (for a partial review see Marmarelis (2004)). The related Wiener series methodology (see Wiener (1958); Lee and Schetzen (1965)) estimates Wiener kernels from Gaussian white noise (GWN) input and the corresponding output. Beginning with the 1974 work of Guttman et al. (1974), a fair number of authors have applied the Wiener methodology, using white-noise input, to both real neurons and the H-H equations, e.g. (Guttman et al., 1974; Guttman and Feldman, 1975; Bryant and Segundo, 1976; Guttman et al., 1977; Buño et al., 1984; Korenberg et al., 1988; Bustamante and Buño, 1992; Lewis et al., 2000; Takahata et al., 2002). Poggio and Torre (1977) also derived Volterra representations of the leaky integrator and integrate-and-fire neuron models.

Marmarelis (1989) proposed modeling input-output data for single neurons by use of "neuronal modes," a set of filters which represent the nonlinear dynamics involved in neuronal signal transformation. The latter includes the cascaded processes of somatodendritic integration and AP generation at the axon hillock. The outputs of these filters feed into a static nonlinearity followed by a threshold, resulting in a multi-input threshold. Marmarelis and Orme (1993) estimated neuronal modes from synthetic input-output data using the Volterra series and the Laguerre expansion technique. This approach has been generalized to multiple inputs and outputs (Marmarelis, 2004). Extensive applications of the multi-input/multi-output Volterra methodology
have been made by Berger, Marmarelis, and colleagues to the study of the functional characteristics of the hippocampal formation (Berger et al., 2010; Hampson et al., 2012a,b; Song et al., 2007, 2009; Zanos et al., 2008).

Gerstner and van Hemmen (1992) proposed the so-called spike response model (SRM) for single-neuron dynamics (see also Gerstner (1995)). A neuron fires an action potential (AP) whenever an abstract "membrane potential," h(t), exceeds some threshold θ. Membrane potential is determined as the sum of two functions: one accounts for the refractory period following an AP, while the other integrates the effects of synaptic inputs to the neuron. Kistler et al. (1997) proposed a Volterra-style SRM for the H-H equations. In this model, the membrane potential output, u(t), is determined from the input current, I(t), according to a modified Volterra expansion:

u(t) = \eta(t - t_f) + \int_0^\infty \varepsilon^{(1)}(t - t_f; \tau)\, I(t - \tau)\, d\tau + \ldots   (3)
where tf is the time of the most recent AP, and an AP is considered to have occurred if u(t) crosses the threshold θ from below. The kernel η(t − tf) gives the AP and afterpotential waveforms, and it is imposed. The "response kernels," ε(i), determine the potential response to injected current and vary with the time since the most recent spike, t − tf. Several more recent works have applied very similar Volterra-style SRMs (the principal difference being the inclusion of a dynamic threshold) to real cortical neuron data (Jolivet et al., 2006, 2008).

The fact that Kistler and colleagues found it necessary to impose the AP waveform as an additional kernel points to the central difficulty in applying the Volterra series approach. Namely, the output of the H-H model is not simply a function of the past input, but is characterized by two distinct dynamical regimes. Within the subthreshold regime, to a very good approximation, the output is a function of the recent input and can be represented well by a Volterra model. However, upon initiation of an AP, the ionic conductances change dramatically, and the form of the AP is practically fixed regardless of the particular values of the current input. Therefore, during an AP the input and the output are effectively decoupled. AP firing also initiates the processes of an afterpotential and refractoriness, which affect the forms of the subthreshold Volterra kernels describing the membrane response to input following the AP. Note that refractoriness is not a simple binary state, but varies continuously with the time since the most recent AP. Given these difficulties, we advocate a new Volterra-type modeling framework to model a single H-H neuron.
We consider the membrane potential, y(t), as a function of both the past current injection, x(t), and the suprathreshold past membrane potential, ŷ(t):

y(t) = F[x(t′), ŷ(t′′); t′ ≤ t, t′′ < t]   (4)

where

\hat{y}(t) = y(t)\, H(y(t) - \theta)   (5)

and H(x) is the Heaviside step function, defined as

H(x) = \begin{cases} 1, & x \ge 0 \\ 0, & x < 0 \end{cases}   (6)

Table 1 Hodgkin-Huxley model parameters.

Parameter   Value
C_M         1 µF cm−2
V_Na        115 mV
V_K         −12 mV
V_l         10.6 mV
ḡ_Na        120 mS cm−2
ḡ_K         36 mS cm−2
ḡ_l         0.3 mS cm−2
As described in Section 2.2.3, we expand F[·] as a Volterra series and estimate the kernels with the Laguerre expansion technique. We find that this model, unlike a standard Volterra model, can accurately predict the membrane potential generated by the H-H equations in response to white-noise current injection, as well as to specialized inputs such as a current pulse, a constant current injection, and a sinusoidal current. This model differs critically from the SRM of Kistler et al. (1997) in that we need not explicitly account for the AP waveform, afterpotential, or refractory period. All of these are generated naturally by the modeling process of the proposed framework. Our model has the further advantage that the Volterra kernels can be estimated using ordinary least-squares estimation. While this study applies the proposed Volterra-type model with NAR component to an H-H neuron, this modeling framework is general and can be applied to any firing neuron or ensemble of neurons (simulated or real).

2 Methods

2.1 Data preparation: H-H model

The H-H model considers the neuron membrane as a circuit consisting of four elements in parallel. Three represent the dynamics of sodium, potassium, and "leak" ion channels, and each consists of a battery in series with a resistor. The fourth is a capacitor. The full model follows:
C_M \frac{dV}{dt} = I(t) - \bar{g}_K n^4 (V - V_K) - \bar{g}_{Na} m^3 h (V - V_{Na}) - \bar{g}_l (V - V_l)   (7)
\frac{dn}{dt} = \alpha_n (1 - n) - \beta_n n   (8)
\frac{dm}{dt} = \alpha_m (1 - m) - \beta_m m   (9)
\frac{dh}{dt} = \alpha_h (1 - h) - \beta_h h   (10)

\alpha_n = \frac{0.1 - 0.01V}{\exp(1 - 0.1V) - 1}, \quad \beta_n = 0.125 \exp\left(\frac{-V}{80}\right)   (11)
\alpha_m = \frac{2.5 - 0.1V}{\exp(2.5 - 0.1V) - 1}, \quad \beta_m = 4 \exp\left(\frac{-V}{18}\right)   (12)
\alpha_h = 0.07 \exp\left(\frac{-V}{20}\right), \quad \beta_h = \frac{1}{\exp(3 - 0.1V) + 1}   (13)

The injected current is given as I(t) and V(t) is the membrane potential. We use the parameter values of Hodgkin and Huxley's original work (Hodgkin and Huxley, 1952), but modify them so that membrane depolarization corresponds to a positive membrane potential (rather than negative, as in Hodgkin and Huxley (1952)); the resting potential is 0 mV. These values are given in Table 1.

We have generated a series of data-sets using broadband white noise as input. Specifically, current values are drawn from a normal distribution with mean (µ) 0 and variance σ², and current is injected at a rate of 1 kHz (i.e. a new, independent value for I(t) is chosen every 1 ms). The sampling interval, T, is fixed at 0.2 ms. We have generated data series of lengths 8,192, 16,384, and 32,768 ms for σ ∈ {0.5, 4, 8, 16, 32} (µA cm−2); we denote these as "σx data-sets," where x² is the variance of the current. We use a 16,384 ms σ32 data-set for model training, and typically use 8,192 ms σ32, σ16, σ8, and σ4 data-sets for model testing (validation). When σ = 0.5, the H-H model remains subthreshold. For σ = 4, the H-H neuron fires at a relatively low frequency of approximately 36 Hz, while σ = 32 gives firing around 78 Hz.
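As a concrete illustration of this data-generation procedure, a minimal simulation sketch might look as follows (the forward-Euler integrator, inner step size, and initial gating values are our own illustrative choices, not specified in the text):

```python
import numpy as np

def hh_rates(V):
    """H-H rate constants (Eqs. 11-13), depolarization-positive convention.
    (The expressions have removable singularities at V = 10 and V = 25 mV;
    production code should guard these points.)"""
    a_n = (0.1 - 0.01 * V) / (np.exp(1.0 - 0.1 * V) - 1.0)
    b_n = 0.125 * np.exp(-V / 80.0)
    a_m = (2.5 - 0.1 * V) / (np.exp(2.5 - 0.1 * V) - 1.0)
    b_m = 4.0 * np.exp(-V / 18.0)
    a_h = 0.07 * np.exp(-V / 20.0)
    b_h = 1.0 / (np.exp(3.0 - 0.1 * V) + 1.0)
    return a_n, b_n, a_m, b_m, a_h, b_h

def simulate_hh(T_record=16384.0, sigma=32.0, T=0.2, dt=0.02, seed=0):
    """Integrate Eqs. 7-10 driven by white-noise current: a new value
    ~ N(0, sigma^2) is drawn every 1 ms and the output is sampled every
    T = 0.2 ms (Section 2.1). Forward Euler with inner step dt."""
    rng = np.random.default_rng(seed)
    C_M, g_Na, g_K, g_l = 1.0, 120.0, 36.0, 0.3
    V_Na, V_K, V_l = 115.0, -12.0, 10.6
    refresh, sample = round(1.0 / dt), round(T / dt)
    V, n, m, h = 0.0, 0.32, 0.05, 0.60          # near-rest initial state
    I_now, xs, ys = 0.0, [], []
    for k in range(int(T_record / dt)):
        if k % refresh == 0:                    # new input value every 1 ms
            I_now = rng.normal(0.0, sigma)
        if k % sample == 0:                     # sample x(n), y(n) every T
            xs.append(I_now)
            ys.append(V)
        a_n, b_n, a_m, b_m, a_h, b_h = hh_rates(V)
        I_ion = (g_K * n**4 * (V - V_K)
                 + g_Na * m**3 * h * (V - V_Na)
                 + g_l * (V - V_l))
        V += dt * (I_now - I_ion) / C_M
        n += dt * (a_n * (1 - n) - b_n * n)
        m += dt * (a_m * (1 - m) - b_m * m)
        h += dt * (a_h * (1 - h) - b_h * h)
    return np.array(xs), np.array(ys)
```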
2.2 Mathematical model

We begin by introducing the basic ideas used in applying the Volterra series to modeling input-output data in Sections 2.2.1 and 2.2.2. In Section 2.2.3 we present the proposed novel form of a Nonlinear Auto-Regressive Volterra-type (NARV) model that is suitable for representing the dynamics of AP generation at the axon hillock of a single neuron. We note that, in our formulation, the input to this model is considered to be the result of somatodendritic integration of all sources of neuronal stimulation received at the dendrite and the soma.
Thus, in our formulation, the somatodendritic integration of neuronal stimulation away from the axon hillock ought to be modeled by a separate functional operator, which precedes the functional operator of AP generation that is described by the proposed NARV model.

2.2.1 Volterra series

In continuous time, the Volterra series expansion gives the general relation between the input, x(t), and output, y(t), for a stationary, nonlinear dynamical system with finite memory as an infinite series of functionals that have hierarchical convolutional form (Marmarelis, 1993):

y(t) = k_0 + \int_0^\infty k_1(\tau)\, x(t-\tau)\, d\tau + \int_0^\infty \int_0^\infty k_2(\tau_1, \tau_2)\, x(t-\tau_1)\, x(t-\tau_2)\, d\tau_1 d\tau_2 + \ldots + \int_0^\infty \cdots \int_0^\infty k_q(\tau_1, \ldots, \tau_q)\, x(t-\tau_1) \cdots x(t-\tau_q)\, d\tau_1 \cdots d\tau_q + \ldots   (14)

The multiple integrals on the RHS, referred to as the Volterra functionals, are multiple convolutions of the input signal with the Volterra kernels, where kq(τ1, ..., τq) is the qth-order Volterra kernel. The zeroth-order kernel, k0, is simply the system output for null input. The first-order kernel is a weighting pattern which is convolved with the past input to give the first-order contribution to the present output. If no higher-order kernels exist, the system is linear and k1(τ) is simply the impulse response function. The second-order kernel is a two-dimensional weighting pattern for the pairwise interaction of the values of the input epoch taken as products. We define the input epoch as the input vector consisting of the present and past discrete input values covering the system memory extent. The second-order kernel represents the lowest order of system nonlinearity. Higher-order kernels represent higher-order nonlinearities in a hierarchical manner by weighing products of multiple values of the input epoch in order to determine the present output value. For example, the third-order kernel is a volume of weights for all triplet combinations of past input values. In practice to date, it is common to estimate kernels only up to second order; third-order kernels have been estimated on a few occasions, but they are rather rare. In practice, the integrals are taken over only a finite interval of the past, from 0 to µ, where µ is the "memory extent" of the system. Also note that, because x(t − τ1)x(t − τ2) = x(t − τ2)x(t − τ1), the kernels are symmetric with respect to their arguments.

2.2.2 Discrete Volterra series and kernel expansion technique

Since data are collected as discrete time series and the order of the practically estimated Volterra model is necessarily finite, we consider the Qth-order discrete-time Volterra series so that we may apply this approach to actual input-output data:

y(n) = k_0 + T \sum_{m=0}^{M} k_1(m)\, x(n-m) + T^2 \sum_{m_1=0}^{M} \sum_{m_2=0}^{M} k_2(m_1, m_2)\, x(n-m_1)\, x(n-m_2) + \ldots + T^Q \sum_{m_1=0}^{M} \cdots \sum_{m_Q=0}^{M} k_Q(m_1, \ldots, m_Q)\, x(n-m_1) \cdots x(n-m_Q)   (15)
where T is the sampling interval, n = t/T is the discrete-time index, and m = τ/T is the discrete-time lag. The number of lags is determined as M = µ/T. Note that the present input value, x(n), is not considered a lag, and therefore the summation index range is m = 0, ..., M.

Given this framework, the system identification task consists of estimating the discrete values of the Volterra kernels. As Wiener originally suggested (Wiener, 1958), the number of parameters that need be estimated is reduced dramatically by expanding the kernels on a properly chosen orthonormal basis, such as the Laguerre basis, which has a built-in exponential term and, therefore, exhibits the relaxation characteristic typical of finite-memory systems. Watanabe and Stark (1975) first implemented this idea, noting that a small set of orthonormal Laguerre functions can act as a basis spanning the subspace of kernels for many physiological systems, including oculomotor control. We follow the implementation proposed later by Marmarelis (1993) (see also Marmarelis (2004)), which expands the kernels on a basis of discrete-time Laguerre functions (DLFs) developed by Ogura (1985), which can be calculated using the recursive relation:

b_j(m) = \sqrt{\alpha}\, b_j(m-1) + \sqrt{\alpha}\, b_{j-1}(m) - b_{j-1}(m-1)   (16)
b_0(m) = \sqrt{\alpha}\, b_0(m-1) + T \sqrt{1-\alpha}\, \delta(m)   (17)

where bj(m) is the jth-order DLF, m = 0, ..., M is the discrete-time index, T is the sampling interval, δ is the Kronecker delta function, and α ∈ (0, 1) is a parameter that determines the rate of exponential relaxation (a smaller α corresponds to more rapid decay). We take bj(−1) to be zero for all j. The DLFs can also be computed via a rather computationally expensive closed-form expression (Ogura, 1985).
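The recursion is simple to implement; the following sketch (our own illustration, not code from the paper) generates such a basis:

```python
import numpy as np

def laguerre_basis(L, M, alpha, T=0.2):
    """Discrete-time Laguerre functions b_j(m), j = 0..L-1, m = 0..M,
    via the Ogura (1985) recursion of Eqs. 16-17, with b_j(-1) = 0."""
    b = np.zeros((L, M + 1))
    sa = np.sqrt(alpha)
    b[0, 0] = T * np.sqrt(1.0 - alpha)          # Eq. 17 at m = 0 (delta term)
    for m in range(1, M + 1):                   # Eq. 17 for m > 0
        b[0, m] = sa * b[0, m - 1]
    for j in range(1, L):                       # Eq. 16
        b[j, 0] = sa * b[j - 1, 0]              # terms with b(-1) vanish
        for m in range(1, M + 1):
            b[j, m] = sa * b[j, m - 1] + sa * b[j - 1, m] - b[j - 1, m - 1]
    return b
```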
We choose a set of L DLFs, {bj(m)}, as our basis to expand each Volterra kernel:

k_q(m_1, \ldots, m_q) = \sum_{j_1=0}^{L-1} \cdots \sum_{j_q=0}^{L-1} a_q(j_1, \ldots, j_q)\, b_{j_1}(m_1) \cdots b_{j_q}(m_q)   (18)
Substituting such kernel expansions into the series in Equation 15 and after some rearrangement of terms, we arrive at
y(n) = k_0 + \sum_{q=1}^{Q} \sum_{j_1=0}^{L-1} \cdots \sum_{j_q=0}^{L-1} a_q(j_1, \ldots, j_q)\, v_{j_1}(n) \cdots v_{j_q}(n)   (19)

where the set of transformed inputs, {vj(n)}, are given by convolution of the input with the basis functions:
v_j(n) = T \sum_{m=0}^{M} b_j(m)\, x(n-m)   (20)

We can further reduce the dimensionality of the problem by exploiting the symmetry of the Volterra kernels to arrive at the modified discrete Volterra (MDV) model:

y(n) = c_0 + \sum_{q=1}^{Q} \sum_{j_1=0}^{L-1} \sum_{j_2=0}^{j_1} \cdots \sum_{j_q=0}^{j_{q-1}} c_q(j_1, \ldots, j_q)\, v_{j_1}(n) \cdots v_{j_q}(n)   (21)

where k0 = c0, and cq(j1, ..., jq) = λq(j1, ..., jq) aq(j1, ..., jq). The scaling factor λq is determined by the multiplicity of the indices (j1, ..., jq). Note that once the expansion coefficients have been determined, it is a simple matter to reconstruct the original Volterra kernels from Equation 18.

The MDV model can be represented in modular form as passing the input through a linear filterbank, where the filters are our basis functions, {bj(m)}, yielding L outputs which are passed through a multi-input static nonlinearity, f[·], that generates the model output. This is depicted schematically in Figure 1.

Fig. 1 Modular representation of the single-input modified discrete Volterra model (MDV).
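To make the construction concrete, the following sketch (an illustration of ours, assuming a fully second-order model, Q = 2) shows how the transformed inputs of Equation 20 and the regressors implied by Equation 21 might be assembled; the coefficients cq can then be estimated by ordinary least squares, as described in Section 2.3:

```python
import numpy as np

def transformed_inputs(x, basis, T=0.2):
    """v_j(n) = T * sum_m b_j(m) x(n-m), Eq. 20, for each basis function."""
    L, _ = basis.shape
    N = len(x)
    v = np.zeros((L, N))
    for j in range(L):
        v[j] = T * np.convolve(x, basis[j])[:N]   # truncate to record length
    return v

def mdv_regressors(v):
    """Regression matrix for a second-order MDV model (Eq. 21): the v_j(n)
    plus the non-redundant products v_j1(n) v_j2(n) with j2 <= j1."""
    L, _ = v.shape
    cols = [v[j] for j in range(L)]                                     # q = 1
    cols += [v[j1] * v[j2] for j1 in range(L) for j2 in range(j1 + 1)]  # q = 2
    return np.column_stack(cols)            # shape (N, L + L*(L+1)/2)
```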
2.2.3 Nonlinear Auto-Regressive Volterra-type model of AP generation

We propose the Nonlinear Auto-Regressive Volterra-type (NARV) model of AP generation that is shown schematically in Figure 2. The exogenous input is the injected current, x(n) (µA cm−2), and the output is the membrane potential, y(n) (mV). The thresholded output, ŷ(n):

\hat{y}(n) = y(n)\, H(y(n) - \theta)   (22)

is fed back as the autoregressive component of the NARV model, where H(x) is the Heaviside step function, and θ is the membrane threshold potential for AP firing. We have also tried using a simple binary threshold, ŷ(n) = H(θ − y(n)), and while this option gives reasonable results, it sometimes leads to numerical instability at the beginning and end of the APs and a poorer representation of the AP waveform.

We formulate the NARV model as a two-input Volterra model, where the autoregressive component is the second input. Such an extension of the Volterra approach to multiple inputs, initially studied by Marmarelis and McCann (1973) and Marmarelis and Naka (1973), defines a set of "self-kernels" for each input along with a set of "cross-kernels" that represent the nonlinear interactions between the inputs in determining the output. The two-input Volterra model in discrete-time that is equivalent to the NARV model is:
y(n) = k_{0,0} + T \sum_{m=0}^{M} k_{1,0}(m)\, x(n-m) + T \sum_{r=1}^{R} k_{0,1}(r)\, \hat{y}(n-r)
+ T^2 \sum_{m_1=0}^{M} \sum_{m_2=0}^{M} k_{2,0}(m_1, m_2)\, x(n-m_1)\, x(n-m_2)
+ T^2 \sum_{r_1=1}^{R} \sum_{r_2=1}^{R} k_{0,2}(r_1, r_2)\, \hat{y}(n-r_1)\, \hat{y}(n-r_2)
+ T^2 \sum_{m=0}^{M} \sum_{r=1}^{R} k_{1,1}(m, r)\, x(n-m)\, \hat{y}(n-r)
+ \ldots + T^{Q_x + Q_y} \sum_{m_1=0}^{M} \cdots \sum_{m_{Q_x}=0}^{M} \sum_{r_1=1}^{R} \cdots \sum_{r_{Q_y}=1}^{R} k_{Q_x, Q_y}(m_1, \ldots, m_{Q_x}, r_1, \ldots, r_{Q_y})\, x(n-m_1) \cdots x(n-m_{Q_x})\, \hat{y}(n-r_1) \cdots \hat{y}(n-r_{Q_y})   (23)
where Qx is the order of the exogenous input component and Qy is the order of the autoregressive component. Any kernel ka,b is a self-kernel if ab = 0 and a cross-kernel otherwise. For a second-order model (Qx = Qy = 2) there is a single cross-kernel. For a third-order model, there are two additional cross-kernels, k2,1 and k1,2. Unlike the self-kernels, the cross-kernels are not symmetric.

Identifying the model in the above form is nearly intractable computationally. Therefore, we expand the kernels on two Laguerre bases of Lx and Ly DLFs, denoted {b_j^{(x)}(m)} and {b_l^{(y)}(r)}, corresponding to the two inputs x(n) and ŷ(n), respectively. Note that Chon et al. (1997) were the first to propose expanding an autoregressive variable, in addition to the exogenous input variable, on a basis of Laguerre functions. Each basis is characterized by a unique α, denoted αx and αy. As previously, we obtain a set of transformed input variables by convolving x(n) and ŷ(n) with their respective bases:
v_j^{(x)}(n) = T \sum_{m=0}^{M} b_j^{(x)}(m)\, x(n-m)   (24)

v_l^{(y)}(n) = T \sum_{r=1}^{R} b_l^{(y)}(r)\, \hat{y}(n-r)   (25)

where j = 0, ..., Lx − 1 and l = 0, ..., Ly − 1. We may consider the sets of transformed variables {v_j^{(x)}(n)} and {v_l^{(y)}(n)} as the outputs of x(n) passed through a linear filterbank defined by {b_j^{(x)}(m)}, and ŷ(n) passed through a linear filterbank defined by {b_l^{(y)}(r)}, respectively.

Fig. 2 Modular representation of the NARV model.

Taking account of the symmetry of the self-kernels, we get the final modified NARV model:
y(n) = c_{0,0}
+ \sum_{q=1}^{Q_x} \sum_{j_1=0}^{L_x - 1} \sum_{j_2=0}^{j_1} \cdots \sum_{j_q=0}^{j_{q-1}} c_{q,0}(j_1, \ldots, j_q)\, v_{j_1}^{(x)}(n) \cdots v_{j_q}^{(x)}(n)
+ \sum_{q=1}^{Q_y} \sum_{l_1=0}^{L_y - 1} \sum_{l_2=0}^{l_1} \cdots \sum_{l_q=0}^{l_{q-1}} c_{0,q}(l_1, \ldots, l_q)\, v_{l_1}^{(y)}(n) \cdots v_{l_q}^{(y)}(n)
+ \sum_{q_x=1}^{Q_x} \sum_{q_y=1}^{Q_y} \sum_{j_1=0}^{L_x - 1} \cdots \sum_{j_{q_x}=0}^{L_x - 1} \sum_{l_1=0}^{L_y - 1} \cdots \sum_{l_{q_y}=0}^{L_y - 1} c_{q_x, q_y}(j_1, \ldots, j_{q_x}, l_1, \ldots, l_{q_y})\, v_{j_1}^{(x)}(n) \cdots v_{j_{q_x}}^{(x)}(n)\, v_{l_1}^{(y)}(n) \cdots v_{l_{q_y}}^{(y)}(n)   (26)
where the cross-term summation is subject to qx + qy ≤ min{Qx, Qy}, and c0,0 = k0,0. Since we expect that no membrane potential should arise in the absence of injected current, we set c0,0 = k0,0 = 0. Note that for higher-order models, cross-terms account for the majority of the discrete expansion coefficients that must be estimated. As with the single-input MDV model, we can represent the modified NARV model in modular form, as shown in Figure 2.

Since the NARV model has an autoregressive component, we must consider the effect of initial conditions for ŷ(n), n = 0, ..., R. When using an input-output record to train the model, we simply use the known output to initialize ŷ(n). When predicting the time-course of membrane potential in response to a novel exogenous input, we assume nothing is known about the output and simply set ŷ(n) = 0 for n = 0, ..., R. Since membrane dynamics within the subthreshold region depend only on the exogenous input, there is no sensitivity to initial conditions, although there may be a transient of finite duration, which may be disregarded in terms of model estimation and computation of the model prediction error.
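To illustrate how prediction proceeds in this closed loop, the following fully second-order sketch (names, data layout, and the brute-force per-step convolutions are our own illustrative choices) computes the NARV output with zero-initialized feedback, per the convention above:

```python
import numpy as np

def narv_predict(x, bx, by, c, theta=4.5, T=0.2):
    """Closed-loop prediction with a fully second-order NARV model (Eq. 26).

    bx: (Lx, M+1) exogenous basis, column m holds b_j^(x)(m).
    by: (Ly, R) autoregressive basis, column k holds b_l^(y)(r = k+1).
    c: dict of estimated coefficients: c['c10'] (Lx,), c['c01'] (Ly,),
       c['c20'] (Lx, Lx), c['c02'] (Ly, Ly), c['c11'] (Lx, Ly); c_{0,0} = 0.
    """
    Lx, M1 = bx.shape
    Ly, R = by.shape
    N = len(x)
    y = np.zeros(N)
    yhat = np.zeros(N)                 # thresholded feedback, zero-initialized
    for n in range(N):
        xm = x[max(0, n - M1 + 1):n + 1][::-1]   # x(n), x(n-1), ..., Eq. 24
        vx = T * bx[:, :len(xm)] @ xm
        yr = yhat[max(0, n - R):n][::-1]         # yhat(n-1), ..., Eq. 25
        vy = T * by[:, :len(yr)] @ yr
        y[n] = (c['c10'] @ vx + c['c01'] @ vy    # first-order self terms
                + vx @ c['c20'] @ vx             # second-order self terms
                + vy @ c['c02'] @ vy
                + vx @ c['c11'] @ vy)            # second-order cross term
        yhat[n] = y[n] if y[n] >= theta else 0.0   # Eq. 22 feedback
    return y
```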
2.3 Model parametrization/training

The basic model has nine structural parameters: Qx, Qy, M, R, Lx, Ly, αx, αy, and θ. We let Qcross represent the order of the cross-terms, with the obvious constraint Qcross ≤ min{Qx, Qy}. We defer our discussion on selecting the structural parameters to Sections 3.1 and 3.2 of the Results, where the process and the chosen parameter values are presented in detail. To estimate the discrete expansion coefficients, we cast the model in matrix form as:

y = Vc + ε   (27)

where y is the vector of all N output samples, the matrix V is constructed from the transformed inputs according to Equation 26, c is the coefficient vector also constructed from Equation 26, and ε is the modeling error. Since all coefficients enter into this equation linearly, we may simply estimate c by the method of ordinary least squares (OLS):

\hat{c} = (V^T V)^{-1} V^T y   (28)

Truncation of the model order (to second or third order) is expected to result in some autocorrelation of the errors, violating the OLS assumption of uncorrelated errors, although the quality of the model predictions suggests that this is not a significant problem in this case (see Results). It is frequently the case that the Gram matrix is ill-conditioned or singular, and in this case we use the pseudoinverse, V^+, for estimation:

\hat{c} = V^+ y   (29)

We generally use the pseudoinverse whenever an autoregressive component is included. The number of discrete expansion coefficient values to be estimated is:

P = \frac{(L_x + Q_x)!}{L_x!\, Q_x!} + \frac{(L_y + Q_y)!}{L_y!\, Q_y!} - 2 + \sum_{j=2}^{Q_{cross}} \sum_{i=1}^{j-1} L_x^{j-i} L_y^{i}   (30)

The first two terms account for the self-kernels, while the last is the contribution of the cross-terms. We subtract 2 because we fix the zero-order kernel at 0. For Lx = Ly = 5 and Qx = Qy = Qcross = 3, a total of 385 coefficients must be estimated, 275 of which are due to cross-terms. For a second-order model, only 65 coefficients, 25 from cross-terms, must be estimated. Thus, parsimony strongly favors a second-order model.

Expansion coefficients are always estimated using the 16,384 ms σ32 training data-set. We have also tried training the model on lower-power inputs, but have found that to get good estimates, the model must "see" a large number of APs. This can be accomplished either through a very long data record or by using a large σ for the injected current input. We take the latter approach. We also take this approach because we desire a model valid over a broad dynamic range, and in general the model must be trained over the dynamic range on which it is to make predictions.
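In code, the estimation step of Equations 27-30 is compact; the sketch below (our illustration) uses an SVD-based least-squares solve, which covers both the OLS and pseudoinverse cases, and verifies the coefficient count of Equation 30:

```python
import numpy as np
from math import comb

def estimate_coefficients(V, y):
    """Solve y = Vc + eps (Eq. 27) in the least-squares sense.

    np.linalg.lstsq uses an SVD-based pseudoinverse, so it covers both the
    OLS solution (Eq. 28) and the pseudoinverse solution (Eq. 29) used when
    the Gram matrix V^T V is ill-conditioned or singular.
    """
    c_hat, *_ = np.linalg.lstsq(V, y, rcond=None)
    return c_hat

def n_coefficients(Lx, Ly, Qx, Qy, Qcross):
    """Coefficient count of Eq. 30; note (L+Q)!/(L!Q!) = C(L+Q, Q)."""
    P = comb(Lx + Qx, Qx) + comb(Ly + Qy, Qy) - 2
    P += sum(Lx ** (j - i) * Ly ** i
             for j in range(2, Qcross + 1) for i in range(1, j))
    return P

assert n_coefficients(5, 5, 3, 3, 3) == 385   # matches the count in the text
assert n_coefficients(5, 5, 2, 2, 2) == 65
```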
2.4 Evaluation of model performance

For given structural parameters, we estimate the discrete model expansion coefficients and assess model performance by three metrics: (1) the normalized mean square error (NMSE); (2) the ratio Nmodel : Ndata, where Ndata is the total number of APs occurring in a data-set and Nmodel is the number of APs predicted; and (3) a coincidence measure, Γ, that estimates the degree of similarity between two spike trains. This estimator was introduced by Kistler et al. (1997) and has been used in several other works to evaluate spiking neuron models (Jolivet et al., 2008); it is defined as:

\Gamma = \left( \frac{N_{coinc} - \langle N_{coinc} \rangle}{0.5 (N_{data} + N_{model})} \right) \left( \frac{1}{\Lambda} \right)   (31)

where

\langle N_{coinc} \rangle = \frac{N_{data} N_{model}}{K}   (32)
\Lambda = 1 - \frac{N_{model}}{K}   (33)

and the data record has been divided into K bins of length 2∆, i.e. K = Trecord/(2∆). We choose ∆ = 2 ms as the default value.

Using this metric requires some post-processing of the output record. First, we find the time of initiation of every AP in both the data and the model output; an AP is considered to have occurred whenever there is an upward deflection of the membrane voltage greater than 10 mV that is sustained for 1 ms. To help avoid false positives, we also impose a brief refractory period: no additional AP may be counted for 4 ms following the initiation of any detected AP. The total AP counts are Ndata and Nmodel. Scanning through every AP in the real data (or model output), we consider the model (or real data) to have fired a coincident AP if there is a model AP within ±∆. This yields Ncoinc. To avoid artificially inflating the number of coincidences, we do not allow any AP in the data to correspond with more than one predicted AP. The parameter ⟨Ncoinc⟩ gives the expected number of coincidences generated by a Poisson process with the same AP frequency as the model, and dividing by Λ normalizes the measure such that Γ = 0 when all coincidences occur by pure chance. The restriction that no AP correspond to more than one other causes this estimator to be equivalent to the coincidence factor "without replacement" used by Gerstner and Naud (2009). This and other measures of spike-train similarity are discussed by Naud et al. (2011). We find the best parameter set using both NMSE and Γ as metrics, but consider Γ as the primary metric of interest.
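The post-processing just described can be summarized in a short sketch (our own illustration; the 10 mV / 1 ms detection rule, the 4 ms dead time, and ∆ = 2 ms follow the text, while the exact sustained-window test is our interpretation):

```python
import numpy as np

def detect_aps(v, T=0.2, height=10.0, sustain_ms=1.0, dead_ms=4.0):
    """AP initiation times from a sampled voltage trace: an AP is counted
    when the deflection exceeds 10 mV sustained for 1 ms, with a 4 ms dead
    time after each detection."""
    sustain = round(sustain_ms / T)
    times, last = [], -np.inf
    for n in range(len(v) - sustain):
        t = n * T
        if t - last < dead_ms:
            continue
        if np.all(v[n:n + sustain + 1] > height):
            times.append(t)
            last = t
    return np.array(times)

def coincidence_factor(t_data, t_model, T_record, delta=2.0):
    """Coincidence measure Gamma (Eqs. 31-33), matching each data AP with
    at most one model AP ("without replacement")."""
    N_data, N_model = len(t_data), len(t_model)
    K = T_record / (2.0 * delta)
    used = np.zeros(N_model, dtype=bool)
    N_coinc = 0
    for t in t_data:
        free = np.where(~used & (np.abs(t_model - t) <= delta))[0]
        if free.size:
            used[free[0]] = True
            N_coinc += 1
    expected = N_data * N_model / K            # Eq. 32
    lam = 1.0 - N_model / K                    # Eq. 33
    return (N_coinc - expected) / (0.5 * (N_data + N_model)) / lam
```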
3 Results

All results are generated using a model trained with the 16,384 ms σ32 training data-set. We present results for the model predictions on the four 8,192 ms testing data-sets. Unless otherwise stated, all results are generated by either a fully second- or third-order model, i.e. Qx = Qy = Qcross = 2 or Qx = Qy = Qcross = 3, and with Lx = Ly = 5, αx = 0.4, αy = 0.7, and θ = 4.5 mV.
3.1 Optimal model order

To evaluate model performance we use the coincidence measure, Γ, and the ratio Nmodel : Ndata, where Ndata is the actual number of APs to occur in the data-set and Nmodel is the number of APs predicted. Here, we present results for models of different orders. The model order is determined by Qx, Qy, and Qcross, where Qcross is the order of the cross-kernels. We constrain Qy ≤ Qx, and obviously Qcross ≤ min{Qx, Qy}. Results for models of all possible combinations of Qx, Qy, and Qcross up to third order, subject to these constraints, are given in Figure 3. Also included for comparison is a fifth-order model with no autoregressive
component (Qx = 5, Qy = 0). From Figure 3, the models cluster into two groups: in the first group all models give comparable, relatively high performance, while performance is quite poor in the second group. The four high-performance models are: (1) Qx = Qy = Qcross = 2; (2) Qx = Qy = Qcross = 3; (3) Qx = Qy = 3, Qcross = 2; and (4) Qx = 3, Qy = Qcross = 2. Thus, we conclude that the model must be at least second-order in both the exogenous and autoregressive input, and that inclusion of cross-terms to at least second order is essential. Model performance is uniformly better for high-power inputs.

The low-performing group may be subdivided into two clusters, one with mid-low performance. Interestingly, the mid-low group includes all those models first-order in the autoregressive term, while the lowest-performing models are second- or third-order in the autoregressive component but lack cross-terms. The fifth-order MDV (exogenous input only) performs in the mid-low group.

We have also examined the NMSE of the estimated models on the testing data-sets to determine if this metric, which is used to estimate the model expansion coefficients, agrees with the AP-related performance criteria. As shown in Figure 4, the same rough clustering into three groups is seen, but the differences in performance are much less dramatic and are not even detectable for the lowest-power input. Interestingly, the weaker models perform worst under the AP-related criteria for the lowest-power input.

Of the four high-performance models, the weakest is that with Qx = 3, Qy = 3, Qcross = 2, while the other three give essentially identical performance. It is surprising that the second-order model (i.e. Qx = Qy = Qcross = 2) is as good as the third-order models. Even more unexpected is the fact that the second-order model out-performs the model third-order in the self-kernels but only second-order in the cross-kernels.

Results for the performance of the fully second-order and third-order models for several different αy values are given in Figure 5. The third-order model gives slightly better results for the σ32 data-set, but the second-order model is superior for lower-power inputs. Furthermore, the second-order model performance is less sensitive to the choice of αy.

Given the clear parity of the second- and third-order models, but the much greater parsimony of the second-order model, we generally favor the second-order model. However, we do find that compared to a fully third-order model with Qx = Qy = Qcross = 3, the second-order model gives markedly inferior performance in response to a constant current injection, as discussed further in Section 3.5.
Fig. 3 Model performance for a variety of model orders as assessed by the coincidence factor, Γ (left panel) and the Nmodel : Ndata ratio (right panel). The four high performing models are those with Qx ≥ 2, Qy ≥ 2, and Qcross ≥ 2. Note that performance is plotted for the four testing data-sets.
Fig. 4 Model performance as assessed by the NMSE for a variety of model orders. Assessment by the NMSE is in general agreement with assessment by the coincidence factor and the Nmodel : Ndata ratio (see Figure 3). Results are for the four testing data-sets. Note that instabilities in the numerical solution for the Qx = Qy = 3, Qcross = 0 model result in the NMSE being undefined for the σ16 and σ32 testing data-sets.
3.2 Other structural parameter selection and parameter sensitivity

In this section we present our method and results for choosing the model structural parameters, M, R, Lx, Ly, αx, αy, and θ, given the model order (Qx, Qy, and Qcross). We have M = µx/T and R = µy/T, where µx and µy are the system memories for the input and autoregressive components. We have T = 0.2 ms, and we assume a maximum memory of 20 ms, giving M = R = 100.

As described in Section 2.3, for given structural parameters, we estimate the expansion coefficients using the 16,384 ms σ32 training data-set (see Section 2.1 for the data preparation), and we use the NMSE and coincidence factor Γ to evaluate performance under the given structural parameters.
Fig. 5 Model performance for the fully second and third-order models using different values of αy .
We perform the following procedures for both the fully second- and third-order models. It is our experience that 3–7 DLFs are adequate for most applications, and model parsimony strongly favors as few DLFs as possible. We have performed a Nelder-Mead simplex search for the best αx, αy, and θ under Lx = Ly = 3, 5, 7 using the σ32 and σ4 testing data-sets, and find that model performance under Lx = Ly = 5 is superior to using only three DLFs, but is comparable to using Lx = Ly = 7. We have also observed that for Lx = Ly ≥ 9 instabilities in the solution sometimes occur, and so we fix Lx = Ly = 5.

We now perform a more exhaustive exploration of model performance in the αx, αy, and θ parameter space. To ensure that the estimated model applies over the entire dynamic range of input, we calculate the NMSE and Γ for all four 8,192 ms testing data-sets, where σ = 32, 16, 8, 4. Moreover, for each testing data-set, we generate 3D volumes of NMSE and Γ as functions of αx, αy, and θ. From inspection of these volumes, we find a range of αx, αy, and θ that give good results over the entire dynamic range of input.

We find that NMSE and Γ for all testing data-sets are essentially independent of αx when αx is in the range [0.4, 0.9]. However, inspection of the AP waveform shows that larger αx values can give "jitters" in
the membrane potential at the AP peak, and so we fix αx = 0.4. Such choppiness in the waveform also sometimes occurs for smaller αy. The optimal value of αy varies with θ, but in general the best fit occurs when αy ∈ [0.6, 0.8].

While θ < 3 mV can give good results, we consider such values to be physically unlikely, and they give excessive firing for small specialized inputs, as discussed below. Therefore, we restrict θ ∈ [2, 8] and examine NMSE and Γ as 2-D functions (surfaces) of θ and αy for the testing data-sets (we do not present these surfaces in the interest of space). In general, a larger θ gives better results for higher-power input, while smaller θ values are best for low-power inputs. We also take the average of the four surfaces; peak values for the average Γ surface occur for αy ∈ [0.675, 0.75] and θ ∈ [3.0, 4.5], with the overall maximum at αy = 0.725 and θ = 3.5 mV for a third-order model.

To refine and constrain our estimate of the threshold, θ, we have also examined the model response to two specialized inputs: (1) a 1 ms pulse of injected current, and (2) a constant current injection. We vary the current magnitude over the subthreshold to suprathreshold transition, and we conclude that θ ≥ 4.5 mV gives the best response to small pulses of current. Unfortunately, the model appears to predict an over-vigorous response to a constant current injection regardless of θ. The values for θ and αy also affect the I-f curves for constant current injection. This is discussed further in Section 3.5.2 (see Figure 11). We use αy = 0.7 and θ = 4.5 mV as default values for all results.

The results of this procedure are similar for the second- and third-order models, but the parameter sensitivity is smaller for the second-order model (i.e. a broader range of αy and θ values give good performance), and the range of αy for optimal performance is shifted toward larger values. We also find that the data record need not be as long for training of a second-order model.
3.3 Model time-series predictions

Using the model estimated from the σ32 training data-set, we predict the membrane potential time-series for the four 8,192 ms testing data-sets. That is, the model is given the input record and predicts the output; we compare model predictions against the observed output. Figure 6 shows a 500 ms window of the membrane potential for the σ32 and σ4 testing data-sets. To avoid any bias in which sample window we present, we arbitrarily choose to begin each window at 1000 ms. Similarly, Figure 7 gives 1500 ms samples of the post-processed spike trains, beginning at t = 1000 ms.
3.4 Predicted versus actual interspike interval distributions

We run the model on the 32,768 ms testing data-sets and determine the interspike interval histogram. Figure 8 shows the histograms predicted by a fully second-order model and the actual histograms for all four data-sets. It is clear that the NARV model accurately predicts the distribution of interspike times for the H-H model given a noisy input across a broad range of input power, even if some individual spike times are in error.
3.5 Model predictions for specialized inputs

We evaluate the ability of the model to predict the response to two special types of exogenous input: (1) a current pulse, and (2) a constant current injection. As discussed in Section 3.2, these results have also been used to help determine the appropriate firing threshold, θ. We have also examined the response to a sinusoidal current and have found that the model generally gives good results, but in the interest of space we omit such results.
3.5.1 Current pulse

We construct an input sequence consisting of 1 ms wide pulses of current. The current amplitude begins sufficiently small that no AP is fired, and the amplitude is slowly incremented to the point that APs are fired. We find that model predictions for this input type are sensitive to the choice of θ, with θ < 4.5 mV giving spurious AP firing. For the fully third-order and second-order models, predictions match the data well, and the model-predicted AP morphology generally matches the actual APs very nicely. If cross-kernels are not included, the model fails to predict AP firing for a current pulse of less than 16 µA cm−2, yet APs are generated for a pulse of only 7 µA cm−2.

3.5.2 Constant current injection

In response to a constant current of sufficient magnitude, an indefinite spike train with a constant interspike time occurs. The NARV model successfully predicts this behavior, and several examples of the predicted membrane potential time-series and actual time-series are shown in Figure 10. Contrary to our other findings, the third-order model is markedly superior to the second-order model in this setting.

We emphasize that a (finite-memory) Volterra model that only considers the exogenous input is unable to produce such limit cycle behavior. Since such a model has finite memory, the input epoch is identical at all times, and clearly the model cannot produce different outputs for the same input epoch. However, the NARV model does converge to a limit cycle for a constant exogenous input. This is because while the exogenous epoch is constant, the autoregressive component is non-constant and regularly switches between the suprathreshold and subthreshold regimes. We find that a model first-order in both the exogenous and autoregressive components is capable of limit-cycle behavior, although the form and timing of the AP train is greatly in error. Second- and third-order models perform better, and we find that for these higher-order models cross-kernels to at least second order must be included for sustained spiking.

While the model yields the correct qualitative behavior, there is typically significant quantitative error. For the H-H generated data, a constant current below 7 µA cm−2 fails to generate a continuous spike train, while the model predicts a continuous spike train even for very small current injections. As the current is increased, the model and data come into better agreement, and we observe a range of current values where model and data agree nearly perfectly: the interspike intervals are equal, and the precise timing of the actual
Fig. 6 Voltage tracings giving the second-order model prediction and actual data for the σ32 and σ4 testing data-sets.
Fig. 7 Comparison of the second-order model predicted spike-trains versus the actual spike trains for all four testing data-sets.
Fig. 8 Predicted and actual interspike time histograms for 32,768 ms σ32 , σ16 , σ8 , and σ4 data records. The model is fully second-order.
Fig. 9 Data is generated by delivering a sequence of 1 ms pulses of current every 35 ms to the H-H equations. The amplitude of the first pulse is 3 µA cm−2 and is increased by 1 µA cm−2 with every subsequent pulse. The same input record is given to our estimated model, which gives comparable results. Here, the model is fully second-order.
and predicted APs coincides. As current is increased further, the interspike times diverge, and most APs are no longer coincidental. Several example time-series are shown in Figure 10. We generate a series of I-f curves, which give the frequency of AP firing for 1000 ms of constant current injection. The third-order model and data agree reasonably well over most of the current range considered (up to 64 µA cm−2 , as this is roughly the dynamic range over which the model is trained). The secondorder model gives qualitatively reasonable results, but
dramatically overestimates the number of APs. I-f curves and the interspike intervals are given in Figure 11.
3.6 The refractory period is correctly predicted

The NARV model succeeds in predicting the existence of a relative and absolute refractory period following an AP. By "absolute refractory period," we mean an interval of time following an AP in which even a very large stimulus is incapable of evoking a second AP.
Fig. 10 Data is generated by delivering a constant injection of current, beginning at t = 20 ms. Comparisons of the model prediction versus H-H data for injections of 10 and 30 µA cm−2 of current are shown for the second-order (left panels) and third-order model (right panels). While both give qualitatively correct results (sustained spike-trains), the third-order model is somewhat superior. For both models, αy = 0.7 and θ = 4.5 mV.
In reality, we have observed that a truly massive current injected arbitrarily soon after the first AP can induce an AP-like waveform. Therefore, all refractoriness is essentially relative, and our use of the term "absolute" is somewhat loose. By "relative refractory period," we mean an interval following the absolute refractory period during which large (but realistic) stimuli can evoke an AP, but smaller stimuli which would otherwise trigger an AP still fail to.

We perform a numerical experiment where we inject a 1 ms pulse of current of 10 µA cm−2. After a small interval of time, we inject a much larger pulse of 50 µA cm−2. As shown in Figure 12, for an interval of less than 9 ms, no AP is fired, demonstrating the existence of a (nearly) absolute refractory period. If we follow the initial current pulse with a pulse of the same magnitude, no AP is fired until an interval of 15 ms has elapsed, demonstrating the relative refractory period. We find that the fully third-order model matches the data best, although the second-order model is nearly as good. While it does not appear that cross-terms are strictly necessary for the model to yield a refractory
period, results are very poor if the cross-kernels are omitted.
3.7 Shape of the estimated Volterra kernels

We graphically present the Volterra kernels estimated for the second-order model with αy = 0.7 and θ = 4.5 mV. First- and second-order self-kernels for the exogenous and autoregressive inputs are shown in Figures 13 and 14, respectively. The second-order cross-kernel is shown in Figure 15.
3.8 Model performance under noisy data

To test the robustness of the model estimation process in the presence of noise-contaminated data, we add Gaussian white noise to both the training and testing data-sets. We have generated results for SNRdB = 10, 3, and 1. Figure 16 demonstrates that the ability to predict APs, as measured by the coincidence factor Γ, is only appreciably degraded at SNRdB = 1. A 300 ms window (starting at 1000 ms) of the noisy membrane potential time-series and the second-order model prediction for the σ32 testing data-set is given in Figure 17.
Fig. 11 The left panels show I-f curves plotting AP firing frequency as a function of injected current (constant current injection), and the right panels give the interspike intervals. Each panel gives results for the model trained with αy = 0.7, 0.75, and 0.8, along with the actual data. The top panels give results for the second-order model, and the bottom panels show third-order model results. Note that the sudden jump in AP frequency for the second-order model with αy = 0.7 at a current injection of about 30 µA cm−2 represents a transition from proper APs to rapid membrane fluctuations that do not really constitute true APs, but still meet our chosen AP criterion (10 mV positive membrane deflection sustained for 1 ms). Such a transition occurs in the H-H model as well, but at larger current injections.
These results give some confidence that the model may be successfully applied to noisy recordings of real neurons.

3.9 Model estimation and predictions under purely subthreshold dynamics

We briefly consider the case when membrane dynamics remain purely subthreshold. The autoregressive component of the NARV model is identically zero, and the model reduces to a single-input MDV model. We find that the NARV model estimated for data that varies between the subthreshold and suprathreshold regimes (i.e. the σ32 data-set) gives satisfactory performance on a purely subthreshold data-set. However, we also estimate a first-order MDV model directly from a subthreshold data-series (8,192 ms, σ = 0.5), and find that such a model outperforms the NARV model operating solely in the subthreshold region. The shape of the first-order kernel also varies slightly between the two models, being a positive integrator for the NARV model, while the MDV model yields a first-order kernel that is mostly positive, but briefly becomes negative for large delays.

Fig. 12 Numerical experiment demonstrating that the model successfully predicts the existence of both (nearly) absolute and relative refractory periods. The bars show the timing of current pulses, with the area of the bars proportional to the amplitude of the pulse (all pulses are 1 ms in width). For this figure, the model is third-order. The second-order model is nearly as good, but predicts a slightly shorter refractory period (e.g. 14 ms instead of 15 ms for the second experiment).

4 Discussion

We have proposed a new methodology for input-output modeling of AP generation at the axon hillock of a neuron that utilizes a nonlinear autoregressive Volterra-type (NARV) framework to represent the causal relationship between the injected/somatic current (input) and the membrane potential (output). The NARV model represents its output, a putative membrane potential at the axon hillock, in terms of the input current and feedback from the generated (suprathreshold) APs. This model takes the general form of a Volterra model with two inputs, namely the actual current input (corresponding to a nonlinear "forward" or "moving-average" component) and the suprathreshold APs (corresponding to a nonlinear "feedback" or "auto-regressive" component). The latter is active only when the neuron is in the "excited state" of generating an AP and accounts for post-firing processes, such as refractoriness and after-potentials. The NARV model, as a two-input Volterra model, has cross-terms representing the dynamic interactions between the two inputs (input current and generated APs) as they affect the membrane potential output. The sequence of generated APs can be predicted by the NARV model for any current input within the dynamic range and bandwidth of the input ensemble used for its estimation, by applying a fixed-threshold operator on the model output (membrane potential). The efficient estimation of the unknown kernels of the NARV model is accomplished with the use of band-limited Gaussian white-noise current input and Laguerre expansion of the kernels (Marmarelis, 2004). A schematic of the NARV model configuration is given in Figure 2.

We have found that 5 Laguerre functions (with appropriate Laguerre parameter α) are adequate for representing each set of kernels for the two inputs. A NARV model of second-order is adequate for predicting the simulated H-H data for random broadband inputs, but the third-order model yields better predictions for constant current inputs.
Our main findings are:

1. The proposed NARV model predicts well the simulation outcome of the H-H equations for all input waveforms within the dynamic range and bandwidth of the input ensemble used for its estimation, both in the subthreshold and suprathreshold regions.

2. The predictive performance of a second-order NARV model is satisfactory for all input waveforms except constant current. A third-order NARV model is found to perform considerably better for constant input current.

3. The performance of the NARV models is better than their Volterra counterparts. In this regard, it is critical to note that, unlike a Volterra model, the NARV model exhibits limit-cycle behavior in response to a constant current input.

4. The estimation of the NARV model kernels has been found to be accurate and efficient via the Laguerre expansion technique over the entire dynamic range of inputs. However, the model predictions are best for high-power random inputs.
5. The NARV model reproduces precisely the afterpotential and refractory characteristics of the H-H model for all inputs.

6. The inclusion of cross-kernels describing the nonlinear interaction between the exogenous current input and the autoregressive input (i.e. the feedback of suprathreshold APs) is essential for satisfactory model performance. For higher-order models (Qx ≥ 2, Qy ≥ 2), the inclusion of cross-terms appears to be essential for generating limit-cycle behavior in response to a constant current input.

7. The forward component of the NARV model (i.e. the terms that depend only on the input current) is almost linear (see Figure 13, where the contribution of k2,0 is relatively minor) and its first-order kernel exhibits an integrative characteristic. To confirm that the second-order forward kernel is indeed unimportant, we estimated a second-order NARV model lacking this kernel, but retaining the second-order cross-kernel, and found that its predictive capabilities are comparable to the full second-order model (results omitted for space).

8. Accurate waveforms of the AP and the refractory period are obtained directly from the estimated NARV model, and no prior assumptions are needed for that purpose, as in previous studies.

9. The only requirement for NARV model estimation is the availability of broadband input-output data that test the system under a variety of input conditions. This makes the data-based NARV model suitable for natural operating conditions. This desirable property and the good performance of the NARV model under noisy simulated data (see Section 3.8) suggest that it may be capable of predicting real neuron firing in response to arbitrary input waveforms, but this conclusion must be confirmed by testing the model against real data.

10. The threshold of the NARV model has been determined through a process of successive trials of model estimation and performance evaluation.

It is a somewhat curious result that model performance is better for high-power inputs, for which it is trained, than low-power inputs, although we offer a heuristic explanation. Suppose the membrane potential is subthreshold and the input is low-power white noise. Then each change in the input current has a small effect on the membrane potential. If such a change pushes the membrane over threshold, it will be only by a very small margin. Thus, the model must track membrane potential very precisely to accurately predict this threshold crossing. This suggests that any model with a threshold for firing may be intrinsically less reliable for low-power versus high-power inputs. The model is also trained much more efficiently by high-power inputs, which is likely due to the fact that the neuron membrane operates in two distinctly different dynamic regimes: subthreshold and suprathreshold. High-power inputs provide adequate data on both dynamic regimes and facilitate accurate estimation of the model for both regimes – something not possible for low-power inputs that do not provide adequate data in the suprathreshold regime. It is possible that the system dynamics are marginally different for low-power compared to high-power inputs. Therefore, we have tried training the model with a mix of input power levels, but this does not improve performance over simply training with only high-power input (results not shown).

Fig. 13 The first and second-order self-kernels (k1,0 and k2,0) for the exogenous input, x(m).

Fig. 14 The first and second-order autoregressive self-kernels (k0,1 and k0,2) for the autoregressive input, ŷ(m).

Fig. 15 The second-order cross-kernel, k1,1.

Fig. 16 Model performance on noisy data-sets as assessed by the coincidence factor. The left and right panels give results for the fully second and third-order models, respectively. Note that in each case the model is estimated from the σ32 training data-set contaminated with the indicated level of noise, and performance is assessed on the indicated data-set also contaminated with the same level of noise.

Fig. 17 Voltage tracings for noisy training data-sets and second-order model (estimated using noisy data) predictions. From top to bottom the panels give results for SNRdB = 10, 3, and 1.
The NARV model predicts firing in response to a current pulse quite well, but for any reasonable value of θ sustained firing in response to a constant current occurs at current injections much too low (see Figure 11). The threshold θ must be increased unreasonably high for this behavior to be avoided. Kistler et al. (1997) found that their model similarly overestimated AP firing in response to small injected current. From this, they suggested that there exists no strict membrane potential threshold for firing. Our results corroborate this conclusion. We have also examined the somewhat similar case of anode break excitation, where the membrane potential is hyperpolarized for an extended period. Upon returning to rest, a spontaneous AP occurs as a result of sodium channel activation. The NARV model can only give this behavior if θ is made unreasonably low (results omitted for the sake of space). Thus, we can conclude that an H-H neuron is not strictly a threshold element, but the threshold approximation applies well for most natural inputs (e.g. white noise). How to account for this in a Volterra framework is unclear, but to do so in a future iteration is a worthwhile goal. It is unclear whether a second-order or third-order NARV model is preferable for equivalent representation of the H-H equations. The two give comparable results for most input waveforms (including random) and the second-order NARV model has fewer free parameters. In that sense, it appears to be preferable because it gives satisfactory performance with greater parsimony. However, the third-order model performs considerably better for constant input currents and sinusoidal input currents (results are not presented in the interest of space). The third-order model is also slightly better at predicting the refractory period. This suggests that while much of the essential behavior of the H-H model is captured by second-order nonlinearities, third-order nonlinearities do exist and have subtle effects on model performance. This fact notwithstanding, the secondorder model has the important practical advantages of greater parsimony and requiring shorter data-records for reliable estimation. This trade-off has to be examined in each case based on the specific prevailing considerations.
Other models of spiking neurons use an autoregressive structure similar to the one proposed here. Pillow and colleagues have proposed the generalized integrate-and-fire (IF) model (or “generalized linear model”) as a successor to the SRM (Pillow et al., 2005). This model considers a leaky integrate-and-fire spike generator driven by a linearly filtered stimulus, a spike-history-dependent feedback current, and a noise current. A spike fires whenever the membrane exceeds some threshold, and the feedback current is given by convolution of the past spike train with a fixed afterpotential current waveform. This model is conceptually similar to a linearized version of the NARV model, with the feedback current analogous to the thresholded feedback of the NARV model. Despite this conceptual similarity, there are several crucial differences between the models. The generalized linear model generates discrete spikes, and the underlying membrane potential is automatically reset to zero whenever a spike occurs; the NARV model continuously tracks the membrane potential, and no reset following a spike can be imposed. The feedback component in the generalized linear model is a convolution with previous discrete spikes, while the feedback component in the NARV model is a nonlinear convolution with the past continuous membrane potential. The generalized linear model has been used in other recent works, such as that of Pillow et al. (2008), who considered stimulation of a population of such neurons with the inclusion of coupling filters that allow the spiking activity of nearby cells to affect each other, and that of Mensi et al. (2012). Volterra models of input-output transformations of spike trains in the hippocampal formation, e.g., by Berger et al. (2010), also employ a threshold for firing along with a linear feedback kernel for the thresholded output. These models are the closest in mathematical construction to the current work, but they likewise differ from the NARV model in that they consider point-process, rather than continuous, data. Considering the membrane potential in a continuous manner appears to affect the necessary autoregressive model order: in the current work, we have found that higher-order autoregressive components with cross-terms are essential to accurately capturing the precise membrane dynamics during and after AP firing, whereas a linear feedback term has proven sufficient in the aforementioned works that consider point-process data.
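To make the contrast concrete, the sketch below computes the two kinds of feedback signal side by side; the afterpotential waveform, kernel, and signal statistics are all illustrative placeholders rather than values taken from either model.

import numpy as np

# Feedback in the generalized linear model: a fixed afterpotential waveform
# convolved with the DISCRETE past spike train. Waveform shape is a placeholder.
rng = np.random.default_rng(2)
T, dt = 1000, 0.1
t = np.arange(0, 20, dt)
eta = -5.0 * np.exp(-t / 2.0)                # placeholder afterpotential current
spikes = (rng.random(T) < 0.01).astype(float)
glm_feedback = np.convolve(spikes, eta)[:T]

# Feedback in the NARV model: the CONTINUOUS past membrane potential, gated by
# a hard threshold and entering through (non)linear kernels; only a first-order
# term is shown, and v is a stand-in trace, not an H-H simulation.
theta = 10.0
v = rng.normal(0.0, 5.0, T)                  # stand-in continuous membrane trace
u = np.where(v > theta, v, 0.0)              # thresholded continuous feedback input
k_u = rng.normal(0.0, 0.1, len(t))           # placeholder first-order feedback kernel
narv_feedback = np.convolve(u, k_u)[:T]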
This work adds to a small existing literature on autoregressive Volterra modeling. For example, Barahona and Poon (1996) proposed a closed-loop version of the discrete Volterra-Wiener-Korenberg series, along with a methodology by which it can be used to detect the presence of nonlinear determinism in experimentally obtained time series. Schiff et al. (1995), following the work of Victor and Canel (1992), applied a nonlinear autoregressive (NLAR) model, equivalent to a second-order autoregressive Volterra model, to the analysis of EEG signals; however, because of the problem of parameter explosion, they estimated NLARs with only a single nonlinear term. As previously mentioned, Chon et al. (1997) proposed applying the Laguerre expansion technique to the estimation of autoregressive Volterra models, avoiding the curse of dimensionality that plagued earlier works.
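For readers unfamiliar with the technique, the sketch below generates discrete Laguerre functions via the standard recursion (cf. Ogura, 1985; Marmarelis, 1993); expanding each kernel over a handful of such basis functions is what reduces the free parameters from thousands of kernel samples to a few expansion coefficients. The function name and example values are our assumptions.

import numpy as np

# Discrete Laguerre basis via the standard recursion (cf. Ogura, 1985;
# Marmarelis, 1993). alpha in (0, 1) sets the decay rate: larger alpha
# gives longer effective memory.
def discrete_laguerre(alpha, n_funcs, memory):
    """Return an (n_funcs, memory) array of discrete Laguerre functions."""
    b = np.zeros((n_funcs, memory))
    b[0, 0] = np.sqrt(1.0 - alpha)
    for m in range(1, memory):
        b[0, m] = np.sqrt(alpha) * b[0, m - 1]
    for j in range(1, n_funcs):
        b[j, 0] = np.sqrt(alpha) * b[j - 1, 0]
        for m in range(1, memory):
            b[j, m] = (np.sqrt(alpha) * b[j, m - 1]
                       + np.sqrt(alpha) * b[j - 1, m]
                       - b[j - 1, m - 1])
    return b

# e.g., expand a 1st-order kernel over 5 basis functions instead of 80 samples:
basis = discrete_laguerre(alpha=0.6, n_funcs=5, memory=80)
# kernel(m) = sum_j c[j] * basis[j, m], so only the 5 coefficients c[j] are estimated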
We note that an autoregressive term can be equivalent to an infinite series of moving-average terms (for example, with |a| < 1, the AR term y(n) = a y(n−1) + x(n) is equivalent to the moving-average form y(n) = x(n) + a x(n−1) + a² x(n−2) + · · ·), and previous studies have examined the relation between classical Volterra series and nonlinear (autoregressive) difference equations (Diaz and Desrochers, 1988; Jones and Billings, 1989). This suggests that the autoregressive Volterra model may admit a classical Volterra (or moving-average) representation, but only at much higher orders.

It is also useful for future studies of this problem to summarize our unsuccessful efforts. We have found that autoregressive modeling without a threshold does not improve results over a traditional Volterra model and frequently results in numerical instability. Likewise, derivative feedback fails to improve results. Given the difficulties of a hard threshold θ, we experimented with a soft (sigmoidal) threshold, but this typically gave somewhat inferior results. Of all our efforts, only those with a thresholded autoregressive component yield the essential limit-cycle behavior of the H-H model for constant current inputs.

The proposed NARV model thus seems to offer a data-based alternative to the H-H model of AP generation. Future efforts will be directed towards improving its compactness by use of the concept of Principal Dynamic Modes (PDM) (Marmarelis, 2004) and towards advancing its biological interpretation by ascribing specific biological mechanisms to the various features (kernels or PDMs) of the model.

Acknowledgements This work was supported in part by the Biomedical Simulations Resource at the University of Southern California under NIH grant P41-EB001978. We also thank the anonymous reviewers for their careful reading and many helpful comments.
References

Barahona, M., and Poon, C. (1996). Detection of nonlinear dynamics in short, noisy time series. Nature, 381, 215–217.
Berger, T. W., Song, D., Chan, R., and Marmarelis, V. Z. (2010). The neurobiological basis of cognition: identification by multi-input, multi-output nonlinear dynamic modeling. Proceedings of the IEEE, 98, 356–374.
Bryant, H. L., and Segundo, J. P. (1976). Spike initiation by transmembrane current: a white-noise analysis. The Journal of Physiology, 260, 279–314.
Buño, W., Bustamante, J., and Fuentes, J. (1984). White noise analysis of pace-maker-response interactions and non-linearities in slowly adapting crayfish stretch receptor. The Journal of Physiology, 350, 55–80.
Bustamante, J., and Buño, W. (1992). Signal transduction and nonlinearities revealed by white noise inputs in the fast adapting crayfish stretch receptor. Experimental Brain Research, 88, 303–312.
Chon, K. H., Richard, C. J., and Holstein-Rathlou, N.-H. (1997). Compact and accurate linear and nonlinear autoregressive moving average model parameter estimation using Laguerre functions. Annals of Biomedical Engineering, 25, 731–738.
Diaz, H., and Desrochers, A. A. (1988). Modeling of nonlinear discrete-time systems from input-output data. Automatica, 24, 629–641.
Gerstner, W. (1995). Time structure of the activity in neural network models. Physical Review E, 51, 738–758.
Gerstner, W., and van Hemmen, J. L. (1992). Associative memory in a network of ‘spiking’ neurons. Network, 3, 139–164.
Gerstner, W., and Naud, R. (2009). How good are neuron models? Science, 326, 379–380.
Guttman, R., and Feldman, L. (1975). White noise measurement of squid axon membrane impedance. Biochemical and Biophysical Research Communications, 67, 427–432.
Guttman, R., Feldman, L., and Lecar, H. (1974). Squid axon membrane response to white noise stimulation. Biophysical Journal, 14, 941–955.
Guttman, R., Grisell, R., and Feldman, L. (1977). Strength-frequency relationship for white noise stimulation of squid axons. Mathematical Biosciences, 33, 335–343.
Hampson, R. E., Song, D., Chan, R. H. M., Sweatt, A. J., Fuqua, J., Gerhardt, G. A., Shin, D., Marmarelis, V. Z., Berger, T. W., and Deadwyler, S. A. (2012b). A nonlinear model for hippocampal cognitive prostheses: Memory facilitation by hippocampal ensemble stimulation. IEEE Transactions on Neural Systems and Rehabilitation Engineering. In press.
Hampson, R. E., Song, D., Chan, R. H. M., Sweatt, A. J., Riley, M. R., Goonawardena, A. V., Marmarelis, V. Z., Gerhardt, G. A., Berger, T. W., and Deadwyler, S. A. (2012a). Closing the loop for memory prosthesis: Detecting the role of hippocampal neural ensembles using nonlinear models. IEEE Transactions on Neural Systems and Rehabilitation Engineering. In press.
Hodgkin, A. L., and Huxley, A. F. (1952). A quantitative description of membrane current and its application to conduction and excitation in nerve. The Journal of Physiology, 117, 500–544.
Jolivet, R., Kobayashi, R., Rauch, A., Naud, R., Shinomoto, S., and Gerstner, W. (2008). A benchmark test for a quantitative assessment of simple neuron models. Journal of Neuroscience Methods, 169, 417–424.
Jolivet, R., Rauch, A., Lüscher, H. R., and Gerstner, W. (2006). Predicting spike timing of neocortical pyramidal neurons by simple threshold models. Journal of Computational Neuroscience, 21, 35–49.
Jones, J. C. P., and Billings, S. A. (1989). Recursive algorithm for computing the frequency response of a class of non-linear difference equation models. International Journal of Control, 50, 1925–1940.
Kistler, W. M., Gerstner, W., and van Hemmen, J. L. (1997). Reduction of the Hodgkin-Huxley equations to a single-variable threshold model. Neural Computation, 9, 1015–1045.
Korenberg, M. J., French, A. S., and Voo, S. K. L. (1988). White-noise analysis of nonlinear behavior in an insect sensory neuron: kernel and cascade approaches. Biological Cybernetics, 58, 313–320.
Lee, Y. W., and Schetzen, M. (1965). Measurement of the Wiener kernels of a nonlinear system by crosscorrelation. International Journal of Control, 2, 237–254.
Lewis, E. R., Henry, K. R., and Yamada, W. M. (2000). Essential roles of noise in neural coding and in studies of neural coding. BioSystems, 58, 109–115.
Marmarelis, V. Z. (1993). Identification of nonlinear biological systems using Laguerre expansions of kernels. Annals of Biomedical Engineering, 21, 573–589.
Marmarelis, V. Z. (1997). Modeling methodology for nonlinear physiological systems. Annals of Biomedical Engineering, 25, 239–251.
Marmarelis, V. Z. (2004). Nonlinear Dynamic Modeling of Physiological Systems. Hoboken: Wiley-IEEE Press.
Marmarelis, V. Z. (1989). Signal transformation and coding in neural systems. IEEE Transactions on Biomedical Engineering, 36, 15–24.
Marmarelis, P. Z., and McCann, G. D. (1973). Development and application of white-noise modeling techniques for studies of insect visual nervous system. Biological Cybernetics, 12, 74–89.
Marmarelis, P. Z., and Naka, K. I. (1973). Nonlinear analysis and synthesis of receptive-field responses in the catfish retina. III. Two-input white-noise analysis. Journal of Neurophysiology, 36, 634–648.
Marmarelis, V. Z., and Orme, M. E. (1993). Modeling of neural systems by use of neuronal modes. IEEE Transactions on Biomedical Engineering, 40, 1149–1158.
Mensi, S., Naud, R., Pozzorini, C., Avermann, M., Petersen, C. C., and Gerstner, W. (2012). Parameter extraction and classification of three cortical neuron types reveals two distinct adaptation mechanisms. Journal of Neurophysiology, 107, 1756–1775.
Naud, R., Gerhard, F., Mensi, S., and Gerstner, W. (2011). Improved similarity measures for small sets of spike trains. Neural Computation, 23, 3016–3069.
Ogura, H. (1985). Estimation of Wiener kernels of a nonlinear system and a fast algorithm using digital Laguerre filters. 15th NIBB Conference, Okazaki, Japan.
Pillow, J. W., Paninski, L., Uzzell, V. J., Simoncelli, E. P., and Chichilnisky, E. J. (2005). Prediction and decoding of retinal ganglion cell responses with a probabilistic spiking model. The Journal of Neuroscience, 25, 11003–11013.
Pillow, J. W., Shlens, J., Paninski, L., Sher, A., Litke, A. M., Chichilnisky, E. J., and Simoncelli, E. P. (2008). Spatio-temporal correlations and visual signalling in a complete neuronal population. Nature, 454, 995–999.
Poggio, T., and Torre, V. (1977). A Volterra representation for some neuron models. Biological Cybernetics, 27, 113–124.
Schiff, N. D., Victor, J. D., Canel, A., and Labar, D. R. (1995). Characteristic nonlinearities of the 3/s ictal electroencephalogram identified by nonlinear autoregressive analysis. Biological Cybernetics, 72, 519–526.
Song, D., Chan, R. H. M., Marmarelis, V. Z., Hampson, R. E., Deadwyler, S. A., and Berger, T. W. (2007). Nonlinear dynamic modeling of spike train transformations for hippocampal-cortical prostheses. IEEE Transactions on Biomedical Engineering, 54, 1053–1066.
Song, D., Chan, R. H. M., Marmarelis, V. Z., Hampson, R. E., Deadwyler, S. A., and Berger, T. W. (2009). Nonlinear modeling of neural population dynamics for hippocampal prostheses. Neural Networks, 22, 1340–1351.
Takahata, T., Tanabe, S., and Pakdaman, K. (2002). White-noise stimulation of the Hodgkin-Huxley model. Biological Cybernetics, 86, 403–417.
Victor, J. D., and Canel, A. (1992). A relation between the Akaike criterion and reliability of parameter estimates, with application to nonlinear autoregressive modelling of ictal EEG. Annals of Biomedical Engineering, 20, 167–180.
Volterra, V. (1930). Theory of Functionals and of Integro-Differential Equations. New York: Dover Publications.
Watanabe, A., and Stark, L. (1975). Kernel method for nonlinear analysis: identification of a biological control system. Mathematical Biosciences, 27, 99–108.
Wiener, N. (1958). Nonlinear Problems in Random Theory. Cambridge, MA: MIT Press.
Zanos, T. P., Courellis, S. H., Berger, T. W., Hampson, R. E., Deadwyler, S. A., and Marmarelis, V. Z. (2008). Nonlinear modeling of causal interrelationships in neuronal ensembles. IEEE Transactions on Neural Systems and Rehabilitation Engineering, 16, 336–352.