Asymptotic and numerical techniques for systems with multiple time-scales

Eric Vanden-Eijnden, Courant Institute

[Figure: time series of x, y over 0 ≤ time ≤ 5.]

The solution of

    Ẋ = −Y^3 + cos(t) + sin(√2 t)
    Ẏ = −ε^{-1} (Y − X)

when ε = 0.1 and we took X(0) = 2, Y(0) = −1. X is shown in blue, and Y in green. Also shown in red is the solution of the limiting equation

    Ẋ = −X^3 + cos(t) + sin(√2 t)
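The example above is easy to reproduce. The sketch below (not from the slides) integrates the stiff system with forward Euler, using a step well below ε, alongside the limiting equation; after an O(ε) transient the two X trajectories stay close:

```python
import numpy as np

def simulate(eps=0.1, dt=1e-4, T=5.0, x0=2.0, y0=-1.0):
    """Integrate the stiff system and its limiting equation side by side."""
    n = int(T / dt)
    x, y = x0, y0        # stiff system
    xbar = x0            # limiting equation
    for i in range(n):
        t = i * dt
        forcing = np.cos(t) + np.sin(np.sqrt(2.0) * t)
        x, y = (x + dt * (-y**3 + forcing),
                y - dt / eps * (y - x))        # Y relaxes quickly toward X
        xbar += dt * (-xbar**3 + forcing)      # limiting equation
    return x, y, xbar

x, y, xbar = simulate()
# Y tracks X up to O(eps), and X tracks the limiting solution up to O(eps).
```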

[Figure: time series of x, y over 0 ≤ time ≤ 5.]

The solution of

    Ẋ = −Y^3 + cos(t) + sin(√2 t)
    dY = −ε^{-1} (Y − X) dt + ε^{-1/2} dW

when ε = 0.01 and X(0) = 2, Y(0) = −1. X is shown in blue, and Y in green. Also shown in red is the solution of the limiting equation

    Ẋ = −X^3 − (3/2) X + cos(t) + sin(√2 t)

Notice how noisy Y is.

Outline

Lecture 1: Theoretical aspects. Averaging principle for singularly perturbed Markov processes; homogenization.

Lecture 2: Numerical aspects. HMM-like multiscale integrators; seamless integrators (boosting method).

Lecture 3: Generalization to Markov jump processes. Gillespie stochastic simulation algorithm (SSA); nested SSA (nSSA).

Lecture 1: Singular perturbation techniques for Markov processes

Consider

    Ẋ = f(X, Y),
    dY = ε^{-1} g(X, Y) dt + ε^{-1/2} σ(X, Y) dW(t).        (?)

Assume that: (i) the evolution of Y at every fixed X = x is ergodic with respect to the probability distribution dµ_x(y), and (ii)

    F(x) = ∫_{R^m} f(x, y) dµ_x(y)

exists. Then, in the limit as ε → 0, the evolution of the X-component of the solution of (?) is governed by

    Ẋ = F(X)

In addition,

    F(x) = ∫_{R^m} f(x, y) dµ_x(y) = lim_{T→∞} (1/T) ∫_0^T f(x, Y_t^x) dt

where

    dY^x = ε^{-1} g(x, Y^x) dt + ε^{-1/2} σ(x, Y^x) dW(t)
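The time-average formula can be checked numerically. The sketch below (my example) uses f(x, y) = −y^3 with the fast OU process dY = −(Y − x) dt + dW (ε set to 1, which only rescales time and leaves the average unchanged); its invariant measure is N(x, 1/2), so the exact answer is F(x) = −x^3 − (3/2)x:

```python
import numpy as np

def F_by_averaging(x, T=2000.0, dt=0.01, seed=0):
    """Estimate F(x) = ∫ f(x, y) dµ_x(y) as a long time average of f(x, Y_t^x).

    Fast process: dY = -(Y - x) dt + dW, invariant measure N(x, 1/2).
    With f(x, y) = -y^3 the exact answer is -x^3 - 1.5*x.
    """
    rng = np.random.default_rng(seed)
    n = int(T / dt)
    y = x          # start at the conditional mean; the transient is negligible
    acc = 0.0
    for _ in range(n):
        y += -(y - x) * dt + np.sqrt(dt) * rng.standard_normal()
        acc += -y**3
    return acc / n

est = F_by_averaging(1.0)
print(est)   # should be close to -1 - 1.5 = -2.5
```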

Derivation 1. Consider the simpler equation

    Ẋ = f(X, g(t/ε))

Assume that for all y:

    |f(x_1, y) − f(x_2, y)| ≤ K |x_1 − x_2|

and for all x:

    lim_{T→∞} (1/T) ∫_0^T f(x, g(s)) ds = F(x)

Then (assuming X_{t=0} = x)

    X_t − x = ∫_0^t f(X_s, g(s/ε)) ds
            = ∫_0^t f(x, g(s/ε)) ds + ∫_0^t ( f(X_s, g(s/ε)) − f(x, g(s/ε)) ) ds
            = t ( (ε/t) ∫_0^{t/ε} f(x, g(u)) du ) + δ(t, ε)

Letting ε/t → 0, the term in parentheses converges to F(x). On the other hand, for all ε, |δ(t, ε)| ≤ K t^2. Thus, as ε/t → 0, this equation converges to

    X_t − x = t F(x) + O(t^2)

which is the forward Euler discretization of Ẋ = F(X).

Derivation 2. Let L be the infinitesimal generator of the Markov process generated by

    Ẋ = f(X, Y),
    dY = ε^{-1} g(X, Y) dt + ε^{-1/2} σ(X, Y) dW(t),        (?)

i.e.

    (Lφ)(x, y) = lim_{t→0+} (1/t) E^{x,y} [ φ(X_t, Y_t) − φ(x, y) ]

Then L = L_0 + ε^{-1} L_1, with

    L_0 = f(x, y) · ∇_x
    L_1 = g(x, y) · ∇_y + ½ (σσ^T)(x, y) : ∇_y∇_y

and u(x, y, t) = E^{x,y} φ(X_t, Y_t) satisfies the backward Kolmogorov equation

    ∂u/∂t = L_0 u + ε^{-1} L_1 u,    u|_{t=0} = φ        (BKE)

Look for a solution in the form u = u_0 + ε u_1 + O(ε^2), so that lim_{ε→0} u = u_0 (formally).

Inserting u = u_0 + ε u_1 + O(ε^2) into (BKE) and equating equal powers of ε leads to the hierarchy of equations

    L_1 u_0 = 0,
    L_1 u_1 = ∂u_0/∂t − L_0 u_0,
    L_1 u_2 = ···        (??)

The first equation says that u_0 belongs to the null-space of L_1. The assumption that the evolution of Y at every fixed X = x (i.e. the process with generator L_1) is ergodic with respect to the probability distribution µ_x(y) implies that, for every x, the null-space of L_1 is spanned by functions constant in y, i.e. u_0 = u_0(x, t)

Since the null-space of L_1 is non-trivial, each of the subsequent equations requires a solvability condition, namely that its right-hand side belong to the range of L_1.

To see what this solvability condition actually is, take the expectation of both sides of the second equation in (??) with respect to dµ_x(y). This gives

    0 = ∫_{R^m} dµ_x(y) ( ∂u_0/∂t − L_0 u_0 )

Explicitly, this equation is

    ∂u_0/∂t = F(x) · ∇_x u_0        (⋄)

where

    F(x) = ∫_{R^m} f(x, y) dµ_x(y)

(⋄) is the backward Kolmogorov equation of the limiting equation Ẋ = F(X)

Remark: computing the expectation with respect to µ_x(y) in practice. We have

    (e^{L_1 t} φ)(x, y) → ∫_{R^m} φ(x, y) dµ_x(y)    as t → ∞

In other words, if v(x, y, t) satisfies

    ∂v/∂t = L_1 v,    v|_{t=0} = φ

so that formally v(x, y, t) = (e^{L_1 t} φ)(x, y), then we have

    lim_{t→∞} v(x, y, t) = ∫_{R^m} φ(x, y) dµ_x(y)

But since v(x, y, t) = E^y φ(Y_t^x), where

    dY^x = ε^{-1} g(x, Y^x) dt + ε^{-1/2} σ(x, Y^x) dW(t)

it follows that

    ∫_{R^m} φ(x, y) dµ_x(y) = lim_{t→∞} E^y φ(Y_t^x) = lim_{T→∞} (1/T) ∫_0^T φ(Y_t^x) dt

Example: the Lorenz 96 (L96) model

L96 consists of K slow variables X_k coupled to J × K fast variables Y_{j,k}, whose evolution is governed by

    Ẋ_k = −X_{k−1}(X_{k−2} − X_{k+1}) − X_k + F_x + (h_x/J) Σ_{j=1}^J Y_{j,k}
    Ẏ_{j,k} = ε^{-1} ( −Y_{j+1,k}(Y_{j+2,k} − Y_{j−1,k}) − Y_{j,k} + h_y X_k )        (L96)

We will study (L96) with F_x = 10, h_x = −0.8, h_y = 1, K = 9, J = 8, and two values of ε: ε = 1/128 and ε = 1/1024.

Empirical way to check whether (L96) converges, as ε → 0, towards

    Ẋ_k = −X_{k−1}(X_{k−2} − X_{k+1}) − X_k + F_x + G_k(X)

where

    G_k(x) = ∫_{R^{KJ}} (h_x/J) Σ_{j=1}^J y_{j,k} dµ_x(y)

and µ_x(y) is the equilibrium measure of

    Ẏ^x_{j,k} = ε^{-1} ( −Y^x_{j+1,k}(Y^x_{j+2,k} − Y^x_{j−1,k}) − Y^x_{j,k} + h_y x_k )

assuming that it exists for every x.
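A compact way to integrate (L96) directly is RK4 with a step that resolves the fast modes. The sketch below is my implementation with the parameter values quoted above; as a simplifying assumption (not in the original model specification) the fast modes are taken periodic in j within each sector k:

```python
import numpy as np

def l96_rhs(X, Y, eps=1/128, Fx=10.0, hx=-0.8, hy=1.0):
    """Right-hand side of (L96); X has shape (K,), Y has shape (J, K).

    NOTE: fast modes are taken periodic in j within each sector k,
    a simplification of the full cyclic indexing of the original model.
    """
    J = Y.shape[0]
    dX = (-np.roll(X, 1) * (np.roll(X, 2) - np.roll(X, -1)) - X + Fx
          + hx / J * Y.sum(axis=0))
    dY = (-np.roll(Y, -1, axis=0) * (np.roll(Y, -2, axis=0) - np.roll(Y, 1, axis=0))
          - Y + hy * X) / eps
    return dX, dY

def rk4_step(X, Y, dt):
    k1x, k1y = l96_rhs(X, Y)
    k2x, k2y = l96_rhs(X + dt/2*k1x, Y + dt/2*k1y)
    k3x, k3y = l96_rhs(X + dt/2*k2x, Y + dt/2*k2y)
    k4x, k4y = l96_rhs(X + dt*k3x, Y + dt*k3y)
    return (X + dt/6*(k1x + 2*k2x + 2*k3x + k4x),
            Y + dt/6*(k1y + 2*k2y + 2*k3y + k4y))

K, J = 9, 8
rng = np.random.default_rng(1)
X, Y = rng.standard_normal(K), 0.1 * rng.standard_normal((J, K))
for _ in range(100):
    X, Y = rk4_step(X, Y, dt=2.0**-11)
```

The step 2^{-11} is the direct-solver step used later in the efficiency comparison; it must stay well below ε for stability.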


Fig. 1. Typical time-series of the slow (black line) and fast (grey line) modes; K = 9, J = 8, ε = 1/128. The subplot displays a typical snapshot of the slow and fast modes at a given time.

3.1 Properties of the system and existence of a limiting dynamics

In the parameter setting that we use, the solutions of (19) are chaotic. Typical time-series of a slow variable Xk and a fast variable Yj,k in the associated subsector are shown in figure 1; the subplot displays a snapshot of the amplitude of the modes at a given time. The chaotic behavior can be inferred from the


Fig. 2. PDF of the slow variable; K = 9, J = 8, black line: ε = 1/128, grey line: ε = 1/1024. The insensitivity in ε of the PDFs indicates that the slow variables have already converged close to their limiting behavior when ε = 1/128.

Even though there is a separation of time-scales between the slow evolution of the X_k's and the fast evolution of the Y_{j,k}'s since ε = 1/128, such a separation is not obviously apparent from the correlation functions of these modes. In particular, figure 3 shows that, after a short transient decay, the correlation functions are well fitted by C_0 cos(ωt) e^{−νt} with appropriate ν, ω, and C_0 ≈ C_{k,k}(0).


Fig. 3. ACFs of the slow (thick line) and fast (thin line) variables; K = 9, J = 8, black line: ε = 1/128, grey line: ε = 1/1024. The insensitivity in ε of the ACFs for the slow modes indicates that the slow variables have already converged close to their limiting behavior when ε = 1/128. The subplot is a zoom-in of the main graph which shows the transient decay of the ACFs of the fast modes becoming faster as ε is decreased: this is the only signature in the ACFs of the fact that the Y_{j,k}'s are faster.

check the ergodicity of the fast modes at fixed X_k = x_k, i.e. the solution of


Fig. 4. Typical PDFs of the coupling term (h_x/J) Σ_{j=1}^J Y^x_{j,k} for various values of x. These PDFs are robust against variations in the initial conditions, indicating that the dynamics of the fast modes conditional on the slow ones being fixed is ergodic. Notice however how different these PDFs look: this indicates that the feedback of the slow variables X_k on the fast ones Y_{j,k} is significant in L96.

3.2 Direct-solvers versus the multiscale scheme

For the direct-solver we use the classical fourth-order Runge-Kutta method with time step 2^{-11}.


Fig. 5. Comparison between the ACF obtained via the multiscale scheme (black line) and via the direct-solver with ε = 1/128 (full grey line); K = 9, J = 8. The curves are so close that it is difficult to distinguish them. The subplot displays the PDF of the slow mode obtained via the multiscale scheme (black line) and via the direct-solver with ε = 1/128 (full grey line). Here ∆t = 2^{-7} = 1/128, and in the multiscale scheme M_T = 1 and R = 1. Thus cost(multiscale) = 2^7 = 128 and cost(direct)/cost(multiscale) = 2^4 = 16: the multiscale scheme is 16 times more efficient than the direct-solver. Also shown in dashed grey are the corresponding ACF and PDF produced by the truncated dynamics where the coupling of the slow modes X_k with the fast ones, Y_{j,k}, is artificially switched off. The discrepancy indicates that the effect of the fast modes on the slow ones is significant.

Fig. 7. Black points: scatterplot of the forcing F(x) produced by the multiscale scheme (R = 4096). Grey points: scatterplot of the bare coupling term (h_x/J) Σ_{j=1}^J Y_{j,k} produced by the direct-solver when ε = 1/128. K = 9, J = 8. The width of the cloud obtained via the multiscale scheme indicates that F_k(x) ≈ F(x_k) is a rather bad approximation. In contrast, the width of the cloud obtained via the direct-solver is more difficult to interpret since it is also due to statistical fluctuations.

without wasting time evaluating F(x) in regions that are not visited by the dynamics.

Diffusive time-scales (homogenization)

Suppose that

    F(x) = ∫_{R^m} f(x, y) dµ_x(y) = 0.        (∆)

Then the limiting equation on the O(1) time-scale is trivial, Ẋ = 0, and the interesting dynamics arises on the diffusive time-scale O(ε^{-1}). Rescale time as t ↦ t/ε and consider

    Ẋ = ε^{-1} f(X, Y),
    dY = ε^{-2} g(X, Y) dt + ε^{-1} σ(X, Y) dW_t.

Proceeding as before we arrive at the following limiting equation when ε → 0:

    dX = f̄(X) dt + σ̄(X) dB_t,

where

    (L̄φ)(x) ≡ f̄(x) · ∇_x φ(x) + ½ (σ̄σ̄^T)(x) : ∇_x∇_x φ(x)
            ≡ ∫_0^∞ dt ∫_{R^m} dµ_x(y) f(x, y) · ∇_x E^y ( f(x, Y_t^x) · ∇_x φ(x) )

and

    dY^x = ε^{-2} g(x, Y^x) dt + ε^{-1} σ(x, Y^x) dW_t.
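A quick numerical check of the diffusive scaling (my illustrative choice, not from the slides): take f(x, y) = y, g(x, y) = −y, σ = √2, so F ≡ 0, the fast process is a standard OU process with stationary measure N(0, 1) and correlation time ε^2, and the limit is dX = √2 dB_t, i.e. Var(X_T) ≈ 2T:

```python
import numpy as np

def diffusive_limit_variance(eps=0.05, T=1.0, n_paths=400, seed=2):
    """X' = Y/eps, dY = -Y/eps^2 dt + (sqrt(2)/eps) dW.

    f(x, y) = y has mean zero under the stationary measure N(0, 1) of the
    fast OU process; the homogenized limit is dX = sqrt(2) dB_t.
    """
    rng = np.random.default_rng(seed)
    dt = 0.1 * eps**2                  # resolve the fast O(eps^2) time-scale
    n = int(T / dt)
    Y = rng.standard_normal(n_paths)   # start in the stationary measure
    X = np.zeros(n_paths)
    sq = np.sqrt(2.0) / eps * np.sqrt(dt)
    for _ in range(n):
        X += dt * Y / eps
        Y += -Y / eps**2 * dt + sq * rng.standard_normal(n_paths)
    return X.var()

v = diffusive_limit_variance()
print(v)   # close to 2*T = 2
```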

Derivation. The backward Kolmogorov equation for u(x, y, t) = E^{x,y} φ(X_t, Y_t) is now

    ∂u/∂t = ε^{-1} L_0 u + ε^{-2} L_1 u.        (BKE)

Inserting the expansion u = u_0 + ε u_1 + ε^2 u_2 + O(ε^3) (we will have to go one order in ε higher than before) in this equation now gives

    L_1 u_0 = 0,
    L_1 u_1 = −L_0 u_0,
    L_1 u_2 = ∂u_0/∂t − L_0 u_1,
    L_1 u_3 = ···

The first equation tells us that u_0(x, y, t) = u_0(x, t). The solvability condition for the second equation is satisfied by assumption because of (∆). Therefore this equation can be formally solved as

    u_1 = −L_1^{-1} L_0 u_0.

Insert this expression in the third equation. The solvability condition for the resulting equation gives the limiting equation for u_0:

    ∂u_0/∂t = L̄ u_0,        (LBKE)

where

    L̄ = −∫_{R^m} dµ_x(y) L_0 L_1^{-1} L_0.

To see what this equation is explicitly, notice that −L_1^{-1} g(y) is the steady-state solution of

    ∂v/∂t = L_1 v + g(y).

The solution of this equation with the initial condition v(y, 0) = 0 can be represented by the Feynman-Kac formula as

    v(y, t) = E^y ∫_0^t g(Y_s^x) ds.

Therefore

    −L_1^{-1} g(y) = E^y ∫_0^∞ g(Y_t^x) dt,

and the operator L̄ in the limiting backward Kolmogorov equation (LBKE) is the generator of the limiting equation

    dX = f̄(X) dt + σ̄(X) dB_t,

Example: the Lorenz 96 (L96) model

Consider the following modification of (L96):

    Ẋ_k = −ε ( X_{k−1}(X_{k−2} − X_{k+1}) + X_k ) + (h_x/J) Σ_{j=1}^J ( Y_{j,k+1} − Y_{j,k−1} )
    Ẏ_{j,k} = ε^{-1} ( −Y_{j+1,k}(Y_{j+2,k} − Y_{j−1,k}) − Y_{j,k} + F_y + h_y X_k )        (L96′)

By symmetry

    0 = ∫_{R^{KJ}} (h_x/J) Σ_{j=1}^J ( y_{j,k+1} − y_{j,k−1} ) dµ_x(y)


Fig. 14. ACFs and PDFs (subplot) of the slow variable X_k evolving under (L96′). Grey line: ε = 1/256; full black line: ε = 1/128; dashed black line: ε = 1/128 with time rescaled as t → 2t. The near perfect match confirms that the evolution of X_k converges to some limiting dynamics on the O(1/ε) time-scale.

is satisfied. Indeed, the conditional measure entering (51) does not depend on x since the equation used in the microsolver is

    Ż_{j,k} = ε^{-1} ( −Z_{j+1,k}(Z_{j+2,k} − Z_{j−1,k}) − Z_{j,k} + F_y ).        (58)

Coupled truncated Burgers-Hopf (TBH). I. Stable periodic orbits

    Ẋ_1 = ε^{-1} b_1 X_2 Y_1 + a X_1 (R^2 − (X_1^2 + X_2^2)) − b X_2 (α + (X_1^2 + X_2^2)),
    Ẋ_2 = ε^{-1} b_2 X_1 Y_1 + a X_2 (R^2 − (X_1^2 + X_2^2)) + b X_1 (α + (X_1^2 + X_2^2)),
    Ẏ_k = −Re (ik/2) Σ_{p+q+k=0} U_p^* U_q^* + ε^{-1} b_3 δ_{1,k} X_1 X_2,
    Ż_k = −Im (ik/2) Σ_{p+q+k=0} U_p^* U_q^*,

where U_k = Y_k + i Z_k.

Truncated system: stable periodic orbit (limit cycle)

    (X_1(t), X_2(t)) = R (cos ωt, sin ωt)

with frequency ω = b(α + R^2).

NB: Truncated Burgers (Majda & Timofeyev, 2000)

Fourier-Galerkin truncation of the inviscid Burgers-Hopf equation, u_t + ½(u^2)_x = 0:

    U̇_k = −(ik/2) Σ_{k+p+q=0, |p|,|q|≤Λ} U_p^* U_q^*,    |k| < Λ

Features common with many complex systems. In particular:
• Displays deterministic chaos.
• Ergodic on E = Σ_k |U_k|^2.
• Scaling law for the correlation functions with t_k ≈ O(k^{-1}).

Here: used as a model for the unresolved modes. Couple truncated Burgers with two resolved variables.
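The truncated convolution and its energy conservation are easy to check numerically. The sketch below is my implementation (Λ = 8 and small random reality-symmetric initial data are my choices); it verifies that E = Σ_k |U_k|^2 is conserved up to RK4 discretization error:

```python
import numpy as np

LAM = 8   # truncation; mode k is stored at index k + LAM

def tbh_rhs(u):
    """dU_k/dt = -(i k/2) * sum_{p+q+k=0, |p|,|q|<=LAM} U_p^* U_q^*.

    Using U_{-p} = conj(U_p) this is the Galerkin convolution
    dU_k/dt = -(i k/2) * sum_{a+b=k} U_a U_b, which conserves sum_k |U_k|^2.
    """
    du = np.zeros_like(u)
    for k in range(-LAM, LAM + 1):
        s = 0.0 + 0.0j
        for a in range(max(-LAM, k - LAM), min(LAM, k + LAM) + 1):
            s += u[a + LAM] * u[k - a + LAM]
        du[k + LAM] = -0.5j * k * s
    return du

def rk4(u, dt):
    k1 = tbh_rhs(u)
    k2 = tbh_rhs(u + dt / 2 * k1)
    k3 = tbh_rhs(u + dt / 2 * k2)
    k4 = tbh_rhs(u + dt * k3)
    return u + dt / 6 * (k1 + 2 * k2 + 2 * k3 + k4)

rng = np.random.default_rng(6)
u = np.zeros(2 * LAM + 1, dtype=complex)
for k in range(1, LAM + 1):             # random reality-symmetric initial data
    u[k + LAM] = 0.1 * (rng.standard_normal() + 1j * rng.standard_normal())
    u[-k + LAM] = np.conj(u[k + LAM])

E0 = float(np.sum(np.abs(u) ** 2))
for _ in range(200):
    u = rk4(u, 1e-3)
drift = abs(float(np.sum(np.abs(u) ** 2)) - E0) / E0
print(drift)   # tiny: only RK4 discretization error
```

Energy is conserved exactly by the truncated flow because each triad (a, b, c) with a + b + c = 0 contributes terms whose wavenumber prefactors sum to a + b + c = 0.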

    Ẋ_1 = ε^{-1} b_1 X_2 Y_1 + a X_1 (R^2 − (X_1^2 + X_2^2)) − b X_2 (α + (X_1^2 + X_2^2)),
    Ẋ_2 = ε^{-1} b_2 X_1 Y_1 + a X_2 (R^2 − (X_1^2 + X_2^2)) + b X_1 (α + (X_1^2 + X_2^2)),
    Ẏ_k = −Re (ik/2) Σ_{p+q+k=0} U_p^* U_q^* + ε^{-1} b_3 δ_{1,k} X_1 X_2,
    Ż_k = −Im (ik/2) Σ_{p+q+k=0} U_p^* U_q^*,

Limiting SDEs:

    dX_1 = b_1 b_2 X_1 dt + N_1 X_1 X_2^2 dt + σ_1 X_2 dW(t)
           + a X_1 (R^2 − (X_1^2 + X_2^2)) dt − b X_2 (α + (X_1^2 + X_2^2)) dt,
    dX_2 = b_1 b_2 X_2 dt + N_2 X_2 X_1^2 dt + σ_2 X_1 dW(t)
           + a X_2 (R^2 − (X_1^2 + X_2^2)) dt + b X_1 (α + (X_1^2 + X_2^2)) dt,


Contour plots of the joint probability density for the climate variables X1 and X2 ; (a) deterministic system with 102 variables; (b) limit SDE. There only remains a ghost of the limit cycle.


Marginal PDFs of X1 and X2 for the simulations of the full deterministic system with 102 variables (DNS) and the limit SDE.


Two-point statistics for X_1 and X_2; solid lines: deterministic system with 102 degrees of freedom; dashed lines: limit SDE; (a), (b) correlation functions of X_1 and X_2, respectively; (c) cross-correlation function of X_1 and X_2; (d) normalized correlation of energy,

    K_2(t) = ⟨X_2^2(t) X_2^2(0)⟩ / ( ⟨X_2^2⟩^2 + 2⟨X_2(t) X_2(0)⟩^2 )

Coupled truncated Burgers-Hopf (TBH). II. Multiple equilibria

    Ẋ = ε^{-1} b_1 Y_1 Z_1 + λ(1 − αX^2) X,
    Ẏ_k = −Re (ik/2) Σ_{p+q+k=0} U_p^* U_q^* + ε^{-1} b_2 δ_{1,k} X Z_1,
    Ż_k = −Im (ik/2) Σ_{p+q+k=0} U_p^* U_q^* + ε^{-1} b_3 δ_{1,k} X Y_1,

Limiting SDE:

    dX = −γX dt + σ dW(t) + λ(1 − αX^2) X dt
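The limiting SDE is easy to simulate with Euler-Maruyama. In the sketch below the coefficient values γ, σ, α are illustrative stand-ins (the original study's values are not given here); for λ > γ the drift is bistable and the noise drives transitions between the two wells:

```python
import numpy as np

def simulate_limit_sde(lam=1.2, gamma=0.5, sigma=0.5, alpha=1.0,
                       T=200.0, dt=1e-3, seed=3):
    """Euler-Maruyama for dX = -gamma*X dt + sigma dW + lam*(1 - alpha*X^2)*X dt.

    For lam > gamma the drift is bistable, with wells near
    ±sqrt((lam - gamma)/(lam*alpha)); the noise causes hopping between them.
    """
    rng = np.random.default_rng(seed)
    n = int(T / dt)
    noise = sigma * np.sqrt(dt) * rng.standard_normal(n)
    xs = np.empty(n)
    x = 0.0
    for i in range(n):
        x += (-gamma * x + lam * (1.0 - alpha * x * x) * x) * dt + noise[i]
        xs[i] = x
    return xs

xs = simulate_limit_sde()
# A histogram of xs reproduces a bimodal PDF like the ones in the figure below.
```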


PDF of X for the simulations of the deterministic model (solid lines) and the limit SDE (dashed lines) in three regimes, λ = 1.2, 0.5, 0.15.


Two-point statistics of X in three regimes, λ = 1.2, 0.5, 0.15 for the deterministic equations (solid lines) and limit SDE (dashed lines) (a), (b), (c) correlation function of x; (d), (e), (f) correlation of energy, K(t), in x.

Generalizations

Let Z_t ∈ S be the sample path of a continuous-time Markov process with generator L = L_0 + ε^{-1} L_1, and assume that L_1 has several ergodic components indexed by x ∈ S′, with equilibrium distributions µ_x(z). Then, as ε → 0, there exists a limiting Markov process on S′ with generator

    L̄ = E_{µ_x} L_0

Similar results available on diffusive time-scales.

Can be applied to SDEs, Markov chains (i.e. discrete state-space), deterministic systems (periodic or chaotic), etc.

Other situations with limiting dynamics? Mori-Zwanzig projection

Consider

    Ẋ = f(X, Y),
    Ẏ = g(X, Y),

and denote by ϕ(X_{[0,t]}) the solution of the equation for Y at time t, assuming that X is known on [0, t]. Observe that ϕ(X_{[0,t]}) is a functional of {X_s, s ∈ [0, t]}. Then X satisfies the closed equation

    Ẋ = f(X, ϕ(X_{[0,t]}))

In general, X is not Markov! It becomes Markov when Y is faster, or ...?

Other possibility: weak coupling

The system

    Ẋ = (1/N) Σ_{n=1}^N f(X, Y^n),
    Ẏ^n = g(X, Y^n),

(all the Y^n coupled only via X) may have a limiting behavior as N → ∞.

Example: Kac-Zwanzig model with Hamiltonian

    H(q, p) = Σ_{j=0}^N p_j^2/(2m_j) + V(q_0) + Σ_{j=1}^N α_j (q_j − q_0)^2

With specific choices of {m_j, α_j}_{j=1,...,N}, there exists a closed SDE for (q_0, q̇_0, q̈_0) as N → ∞.

Lecture 2: Numerical aspects

Implicit schemes: why they work for stiff ODEs, why they don't for rapidly oscillatory and stochastic systems. HMM-like integrators: based on the averaging principle. Boosting method: a new seamless extrapolation method for stochastic systems.

[Figure: time series of x, y over 0 ≤ time ≤ 5.]

The solution of

    Ẋ = −Y^3 + cos(t) + sin(√2 t)
    Ẏ = −ε^{-1} (Y − X)

when ε = 0.1 and we took X(0) = 2, Y(0) = −1. X is shown in blue, and Y in green. Also shown in red is the solution of the limiting equation

    Ẋ = −X^3 + cos(t) + sin(√2 t)

Consider the stiff ODE

    Ẋ = f(X, Y)
    Ẏ = −ε^{-1} (Y − φ(X))        (?)

If ε ≪ 1, Y is very fast and it will adjust rapidly to the current value of X, i.e. after a short O(ε) transient we will have Y = φ(X) + O(ε) at all times. Then the equation for the slow variable X reduces to

    Ẋ = f(X, φ(X))        (??)

This can be integrated efficiently using an implicit scheme, e.g.:

    X^{n+1} = X^n + ∆t f(X^{n+1}, Y^{n+1})
    Y^{n+1} = Y^n − (∆t/ε) (Y^{n+1} − φ(X^{n+1}))

When ε ≪ ∆t ≪ 1:

    Y^{n+1} = φ(X^n) + O(∆t) + O(ε),
    X^{n+1} = X^n + ∆t f(X^n, φ(X^n)) + O(∆t^2) + O(ε)

However: implicit schemes are ill-suited for rapidly oscillatory or stochastic systems!
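To make this concrete, here is the implicit scheme for the linear choice f(x, y) = −y, φ(x) = x (my example, not from the slides), for which the limiting equation is Ẋ = −X. Each backward-Euler step is a 2×2 linear solve, and ∆t = 0.1 ≫ ε = 10^{-4} causes no stability problem:

```python
import numpy as np

def implicit_step(x, y, dt, eps):
    """One backward-Euler step for X' = -Y, Y' = -(Y - X)/eps, i.e. solve

        X1 + dt*Y1                     = X
        -(dt/eps)*X1 + (1 + dt/eps)*Y1 = Y
    """
    A = np.array([[1.0, dt],
                  [-dt / eps, 1.0 + dt / eps]])
    return np.linalg.solve(A, np.array([x, y]))

eps, dt, T = 1e-4, 0.1, 1.0
x, y = 2.0, -1.0
for _ in range(int(T / dt)):
    x, y = implicit_step(x, y, dt, eps)

print(x)   # ≈ 2/(1.1)**10 ≈ 0.771: backward Euler for the limiting eq X' = -X
```

Note that Y snaps to φ(X) = X after a single step, exactly the behavior the estimates above predict.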

[Figure: time series of x, y over 0 ≤ time ≤ 5.]

The solution of

    Ẋ = −Y^3 + cos(t) + sin(√2 t)
    dY = −ε^{-1} (Y − X) dt + ε^{-1/2} dW

when ε = 0.01 and X(0) = 2, Y(0) = −1. X is shown in blue, and Y in green. Also shown in red is the solution of the limiting equation

    Ẋ = −X^3 − (3/2) X + cos(t) + sin(√2 t)

Notice how noisy Y is.

Recall the key limit theorem: Consider

    Ẋ = f(X, Y),
    dY = ε^{-1} g(X, Y) dt + ε^{-1/2} σ(X, Y) dW(t).        (?)

Assume that: (i) the evolution of Y at every fixed X = x is ergodic with respect to the probability distribution dµ_x(y), and (ii)

    F(x) = ∫_{R^m} f(x, y) dµ_x(y)

exists. Then, in the limit as ε → 0, the evolution of the X-component of the solution of (?) is governed by

    Ẋ = F(X)

In addition,

    F(x) = ∫_{R^m} f(x, y) dµ_x(y) = lim_{T→∞} (1/T) ∫_0^T f(x, Y_t^x) dt

where

    dY^x = ε^{-1} g(x, Y^x) dt + ε^{-1/2} σ(x, Y^x) dW(t)

Basic HMM-like integrator

Use

    X^{n+1} = X^n + ∆t F̃^n,    X^0 = X(t = 0)

Here F̃^n is an approximation of F(X^n) obtained as

    F̃^n = (1/M_T) Σ_{m=M}^{M+M_T−1} f(X^n, Y^{n,m})

where

    Y^{n,m+1} = Y^{n,m} + (δt/ε) g(X^n, Y^{n,m}) + √(δt/ε) σ(X^n, Y^{n,m}) ξ^m,
    Y^{0,0} = Y(t = 0),    Y^{n+1,0} = Y^{n,M+M_T−1}

Why is this better? Basically, because M and M_T are O(1) in ε! In other words, one can reach an O(1) time-scale with an O(1) number of steps. In contrast, the direct scheme

    X^{p+1} = X^p + δt f(X^p, Y^p),
    Y^{p+1} = Y^p + (δt/ε) g(X^p, Y^p),

takes O(ε^{-1}) steps to reach an O(1) time-scale.
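A sketch of the basic HMM-like integrator on the toy problem Ẋ = −Y, dY = −ε^{-1}(Y − X) dt + ε^{-1/2} dW, whose limiting equation is Ẋ = −E_µ[Y] = −X. Only the ratio δt/ε enters the micro-solver, which is precisely why the cost is O(1) in ε; the values of M, M_T, ∆t below are illustrative choices of mine:

```python
import numpy as np

def hmm_integrate(x0=2.0, T=1.0, dt=0.05, h=0.1, M=50, MT=200, seed=4):
    """Basic HMM-like integrator for X' = -Y, dY = -(Y-X)/eps dt + eps^{-1/2} dW.

    h = delta_t/eps is the only ratio the micro-solver sees. Per macro step:
    M micro steps relax Y, then the average of f = -Y over MT further micro
    steps gives the estimate F~ of F(X) = -X.
    """
    rng = np.random.default_rng(seed)
    x, y = x0, x0
    for _ in range(int(T / dt)):
        acc = 0.0
        for m in range(M + MT):
            y += -(y - x) * h + np.sqrt(h) * rng.standard_normal()
            if m >= M:
                acc += -y                  # f(X, Y) = -Y
        x += dt * acc / MT                 # macro step with averaged force
    return x

x = hmm_integrate()
print(x)   # close to 2*exp(-1) ≈ 0.736, up to O(dt) bias and averaging noise
```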

Error estimate (pessimistic):

Thm: For any T > 0, there exists a constant C > 0 such that

    E sup_{0≤n≤T/∆t} |X(n∆t) − X^n| ≤ C ( ε + (∆t)^k + (δt/ε)^l + √( ε∆t / (M_T δt + 1) ) )

Efficiency depends on how large M_T and M need to be. Recall that

    F̃^n = (1/M_T) Σ_{m=M}^{M+M_T−1} f(X^n, Y^{n,m})

M controls the relaxation error; M_T controls the size of the time-averaging window.

The nice surprise: the HMM-like multiscale integrator works even if M_T = 1 (no time averaging), provided only that

    ε ≪ ε∆t/(M δt) ≪ 1

The factor λ = ∆t/(M δt) also gives the efficiency boost of the HMM-like multiscale integrators over a direct scheme.

Why is time-averaging unnecessary (since in this case the approximation of F̃^n is very bad at each time step)?

A simple illustrative example:

    Ẋ = −Y
    dY = −(1/ε)(Y − X) dt + (1/√ε) dW(t)

Here

    dµ_x(y) = ( e^{−(y−x)^2} / √π ) dy

and so the limiting equation is Ẋ = −X.

To understand the effect of (not) averaging, use

    X^{n+1} = X^n − ∆t ( X^n + ξ^n/√2 ),    ξ^n = independent N(0, 1)

Then

    X^n = x(1 − ∆t)^n + (∆t/√2) Σ_{j=1}^{n−1} (1 − ∆t)^j ξ^{n−j}

and

    E|X^n − x(1 − ∆t)^n|^2 = (∆t^2/2) Σ_{j=1}^{n−1} (1 − ∆t)^{2j} = O(∆t)    [n = O(∆t^{-1})]
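The geometric sum can be evaluated directly to confirm the O(∆t) estimate:

```python
def variance_ratio(dt, T=5.0):
    """E|X^n - x(1-dt)^n|^2 / dt for X^{n+1} = X^n - dt*(X^n + xi^n/sqrt(2)).

    The exact variance is (dt^2/2) * sum_{j=1}^{n-1} (1-dt)^{2j}; for n*dt >> 1
    the geometric sum makes this ≈ dt/4, confirming the O(dt) estimate above.
    """
    n = int(T / dt)
    var = dt**2 / 2 * sum((1 - dt)**(2 * j) for j in range(1, n))
    return var / dt

for dt in (0.1, 0.01, 0.001):
    print(dt, variance_ratio(dt))   # ratio approaches 1/4 as dt -> 0
```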

Same example, including relaxation errors:

    Ẋ = −Y
    dY = −(1/ε)(Y − X) dt + (1/√ε) dW(t)        (?)

Scheme:

1. Use:

    Y^{n,m+1} = Y^{n,m} − (δt/ε)(Y^{n,m} − X^n) + √(δt/ε) ξ^{n,m},    Y^{n,0} = Y^{n−1,M}

2. Use X^{n+1} = X^n − ∆t Y^{n,M}

3. Repeat.

Denoting λ = ∆t/(M δt), this integrator approximates

    Ẋ = −Y
    dY = −(1/(λε))(Y − X) dt + (1/√(λε)) dW(t)        (??)

Hence, by the limit theorem, it approximates (?) and boosts the efficiency by λ if

    ε ≪ ελ ≪ 1,    or equivalently    M δt ≪ ∆t ≪ ε^{-1} M δt
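The boosted scheme (M_T = 1, no time-averaging) on the same example; the values of δt/ε, M, ∆t below are illustrative choices of mine:

```python
import numpy as np

def boosted_integrate(x0=2.0, T=1.0, dt=0.01, h=0.1, M=50, seed=5):
    """Boosted integrator (no time-averaging) for
        X' = -Y,  dY = -(Y - X)/eps dt + eps^{-1/2} dW.

    h = delta_t/eps. Per macro step: M micro steps relax Y toward N(X, 1/2),
    then X is advanced with the single last Y. This effectively simulates the
    system with eps replaced by lambda*eps, lambda = dt/(M*delta_t).
    """
    rng = np.random.default_rng(seed)
    x, y = x0, x0
    for _ in range(int(T / dt)):
        for _ in range(M):
            y += -(y - x) * h + np.sqrt(h) * rng.standard_normal()
        x += dt * (-y)                    # no averaging: a single noisy sample
    return x

x = boosted_integrate()
print(x)   # fluctuates around 2*exp(-1) ≈ 0.736 with O(sqrt(dt)) noise
```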

Example: the Lorenz 96 (L96) model

    Ẋ_k = −X_{k−1}(X_{k−2} − X_{k+1}) − X_k + F_x + (h_x/J) Σ_{j=1}^J Y_{j,k}
    Ẏ_{j,k} = ε^{-1} ( −Y_{j+1,k}(Y_{j+2,k} − Y_{j−1,k}) − Y_{j,k} + h_y X_k )        (L96)

with F_x = 10, h_x = −0.8, h_y = 1, K = 9, J = 8, and ε = 1/128.

    M_T  R    ∆t       Gain   ν       error(%)   ω      error(%)
    -------------------------------------------------------------
    direct, ε = 2^{-7}  2^{-11}  —    0.135   —      3.81   —
    Truncated           2^{-6}   —    0.287   113    3.57   6.3
    1    4    2^{-4}   32     0.201   49         3.76   1.3
    1    2    2^{-5}   32     0.171   27         3.89   2.1
    1    1    2^{-6}   32     0.162   20         3.88   1.9
    2    1    2^{-5}   32     0.162   20         3.88   1.9
    4    1    2^{-4}   32     0.158   17         3.87   1.6
    1    1    2^{-7}   16     0.137   2          3.84   0.8
    2    1    2^{-7}   8      0.135   0          3.83   0.5

Gain in efficiency of the multiscale scheme over a direct solver for various values of the control parameters in the multiscale algorithm. The error and the gain are calculated relative to the direct simulation with ε = 2^{-7}. The parameters ν and ω are obtained by fitting the ACF produced by the simulations with C_0 cos(ωt) e^{−νt}.


Fig. 5. Comparison between the ACFs and PDFs (subplot); black line: multiscale scheme; full grey line: direct solver with ε = 1/128; K = 9, J = 8. The curves are so close that it is difficult to distinguish them. Efficiency gain: 16. Also shown in dashed grey are the corresponding ACF and PDF produced by the truncated dynamics where the coupling of the slow modes X_k with the fast ones, Y_{j,k}, is artificially switched off.

Example: L96 model with space-time scale separation

    Ẋ_k = −X_{k−1}(X_{k−2} − X_{k+1}) − X_k + F_x + (h_x/J) Σ_{j=1}^J Y_{j,k}
    Ẏ_{j,k} = ε^{-1} ( −Y_{j+1,k}(Y_{j+2,k} − Y_{j−1,k}) − Y_{j,k} + h_y X_k )        (sL96)

with F_x = 10, h_x = −0.8, h_y = 1, K = 9, and J = 1/ε = 128.


Fig. 9. Typical time-series of the slow (black line) and fast (grey line) modes; K = 9, J = 1/ε = 128. The subplot displays a typical snapshot of the slow and fast modes at a given time.

Working assumption: the spatial interactions between the Y_{j,k} are sufficiently weak and short-range on average.

Then, at any given time, the term (h_x/J) Σ_{j=1}^J Y_{j,k} self-averages in the limit as J → ∞ (i.e. ε → 0 since J = ⌊1/ε⌋) to a limit which, by the law of large numbers, satisfies

    lim_{ε→0} ε h_x Σ_{j=1}^{⌊1/ε⌋} Y_{j,k} = h_x E^{X_k} Y_{j,k} ≡ F(X_k),        (31)

Here E^{X_k} Y_{j,k} is the conditional average of any given Y_{j,k} at fixed X_k. Notice that this implies F(X_k) instead of F(X_1, ..., X_K). This assumption is corroborated by numerical experiments.

Fig. 10. Scatterplots of the bare forcing ε h_x Σ_{j=1}^{⌊1/ε⌋} Y_{j,k} (no averaging here); light grey: J = 1/ε = 128; dark grey: J = 1/ε = 4096 (K = 9). The sharpness of these graphs confirms that the bare forcing self-averages consistent with (31). The subplot: J = 1/ε = 1024, h_x = 1.2 (instead of h_x = −0.8 taken otherwise); it can be obtained by rescaling the forcing in the main plot by 1.2/(−0.8). The thin dashed lines show the stability band X ∈ (−0.5, 8/9) and the corresponding forcing F(X) = h_y h_x X.


Fig. 11. Comparison of ACFs and PDFs (subplot) of the slow mode for various simulations in the J = 1/ε regime. Light grey: J = 128; dark grey: J = 256 (practically indistinguishable from the previous one); solid black: result from the multiscale scheme with tabulated forcing (J = 16); dashed black: result from the multiscale scheme with J = 8, ∆t = 2^{-7}, M_T = R = 1 (gain = 256).


Fig. 12. Dark grey: scatterplots of the bare forcing ε h_x Σ_{j=1}^{⌊1/ε⌋} Y_{j,k} with J = 1/ε = 4096 (K = 9). Light grey: effective forcing F_k(X) produced on-the-fly by the multiscale scheme with K = 9, J = 8, M_T = 1, R = 64, ∆t = 2^{-7}. Solid black line: tabulated effective forcing computed via the multiscale scheme with K = 9, J = 16.

Warning: The problem of hidden slow variables

    Ẋ_k = −X_{k−1}(X_{k−2} − X_{k+1}) − (1/√J) Σ_{j=1}^J ( Y^2_{j,k+1} − Y^2_{j,k−1} )
    Ẏ_{j,k} = −(1/ε) Y_{j+1,k}(Y_{j+2,k} − Y_{j−1,k}) − (1/√J) Y_{j,k}(X_{k+1} − X_{k−1})        (hL96)

with J = 1/ε. The following are additional slow variables (hidden in (hL96)):

    B̄_k = (1/√J) Σ_{j=1}^J Y^2_{j,k}.

Due to the absence of forcing and damping in (hL96), this equation conserves the energy

    E = Σ_{k=1}^K ( X_k^2 + Σ_{j=1}^J Y^2_{j,k} ).


Fig. 13. ACFs and PDFs of X_k. Grey line: direct simulation with ε = 1/64, J = 512 (δt = 2^{-10}); thick solid black line: multiscale scheme with ∆t = 2^{-8}, J = 32 (efficiency gain: 64); thin black line: multiscale scheme with ∆t = 2^{-7}, J = 8 (efficiency gain: 512). Dashed black line: multiscale scheme with ∆t = 2^{-8}, J = 32 (efficiency gain: 64) where the hidden slow variables B_k are not accounted for. The discrepancy clearly indicates that accounting for the B_k's is necessary. In all the multiscale computations M_T = R = 1.

in the dynamics arising on the O(ε^{-1}) time-scale have a very weak effect on the long-time dynamics and quantities like the PDFs and ACFs. This is not always the case, though, and even L96 can be tuned in a way that the stochas-

Toward seamless multiscale integrators?

    Ẋ = −Y
    dY = −(1/ε)(Y − X) dt + (1/√ε) dW(t)        (?)

Recall the scheme:

1. Use:

    Y^{n,m+1} = Y^{n,m} − (δt/ε)(Y^{n,m} − X^n) + √(δt/ε) ξ^{n,m},    Y^{n,0} = Y^{n−1,M}

2. Use X^{n+1} = X^n − ∆t Y^{n,M}

3. Repeat.

Denoting λ = ∆t/(M δt), this integrator approximates

    Ẋ = −Y
    dY = −(1/(λε))(Y − X) dt + (1/√(λε)) dW(t)        (??)

Hence, by the limit theorem, it approximates (?) and boosts the efficiency by λ if

    ε ≪ ελ ≪ 1,    or equivalently    M δt ≪ ∆t ≪ ε^{-1} M δt

Seamless multiscale algorithm – Boosting method

Consider

    dZ = (1/ε) h_1(Z) dt + (1/√ε) h_2(Z) dW(t) + h_3(Z) dt + h_4(Z) dW̄(t)

and assume that there exist slow variables X = φ(Z) which satisfy, as ε → 0,

    dX = F(X) dt + G(X) dW(t)

1. Use

    Z^{n,m+1} = Z^{n,m} + (δt/ε) h_1(Z^{n,m}) + √(δt/ε) h_2(Z^{n,m}) ξ^{n,m},

2. Use

    Z^{n+1,0} = Z^{n,M} + ∆t h_3(Z^{n,M}) + √∆t h_4(Z^{n,M}) η^n

3. Repeat.

Seamless in that one does not need to know what the slow variables X are. Works because it approximates

    dZ = (1/(λε)) h_1(Z) dt + (1/√(λε)) h_2(Z) dW(t) + h_3(Z) dt + h_4(Z) dW̄(t)

with λ = ∆t/(M δt).

Similar in spirit to Chorin's artificial compressibility method and the Car-Parrinello method in molecular dynamics.

More sophisticated seamless integrators?

Reformulating the main theorem: Consider the system

  Ż_t = ε^{-1} f(Z_t) + g(Z_t)

for some variable Z_t ∈ R^n. Assume that there exists a vector-valued function ϕ : R^n → R^m (m < n) such that:

1. f(z) · ∇ϕ(z) = 0;
2. The dynamics Ż_t^x = f(Z_t^x) is ergodic on every component indexed by ϕ(z) = x ∈ R^m with respect to the equilibrium distribution dµ_x(z).

Then X_t = ϕ(Z_t) are slow variables satisfying the equation

  Ẋ_t = H(X_t),   where   H(x) = ∫_{R^n} g(z) · ∇ϕ(z) dµ_x(z).

This result is more general, but unfortunately it does not lead to a seamless multiscale algorithm because the mapping ϕ defining the slow variables is usually nonlinear. In particular, it is easy to see that averaging the original equation, i.e. using

  Z̄̇_t = G(Z̄_t),   where   G(z) = ∫_{R^n} g(z') dµ_{ϕ(z)}(z'),

will in general not be correct, in the sense that ϕ(Z̄_t) ≠ X_t, unless ∇ϕ(z) is a function of x = ϕ(z) alone, i.e.

  ∇ϕ(z) = J(ϕ(z))   for some J : R^m → R^n × R^m.        (??)

Only if (??) is satisfied do we have H(ϕ(z)) = G(z) · J(ϕ(z)) and ϕ(Z̄_t) = X_t.

When this is the case, we can build a seamless multiscale algorithm based only on our ability to decompose the velocity field into its fast O(ε^{-1}) component and its slow O(1) component.

Seamless multiscale algorithm – bis

If ∇ϕ(z) = J(ϕ(z)) for some J : R^m → R^n × R^m, then we can do the following to integrate

  Ż_t = ε^{-1} f(Z_t) + g(Z_t)

1. Use

  Z̃^{m+1,n} = Z̃^{m,n} + (δt/ε) f(Z̃^{m,n}),   Z̃^{0,0} = Z_{t=0},

and compute

  G̃^n = (1/M) Σ_{m=M_T}^{M+M_T−1} g(Z̃^{m,n})

2. Use

  Z_{n+1} = Z̃^{M+M_T,n} + ∆t G̃^n,   Z̃^{0,n+1} = Z_{n+1},   Z_0 = Z_{t=0}

3. Repeat. Here ϕ(Z_n) ≈ X(n∆t). Notice that the algorithm is totally seamless, i.e. one does not need to know ϕ nor ∇ϕ.
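A minimal sketch of this algorithm on a toy system with a linear slow variable; the system, the parameter values, and the map ϕ are all illustrative choices, not from the lecture. The fast field exchanges z1 and z2 and conserves ϕ(z) = z1 + z2; on the fast manifold z1 = z2 = X/2, so the limiting equation is Ẋ = −(X/2)³ = −X³/8.

```python
import numpy as np

def f(z):   # fast field: relaxes to z1 = z2, conserves z1 + z2
    return np.array([z[1] - z[0], z[0] - z[1]])

def g(z):   # slow field
    return np.array([-z[0] ** 3, 0.0])

def seamless_step(z, dt, h, M, MT):
    """One macro step: MT transient + M averaging micro steps of the fast
    flow (micro step h = delta_t/eps), then a kick with the averaged g.
    Note that the algorithm never uses phi or its gradient."""
    G = np.zeros_like(z)
    for m in range(MT + M):
        if m >= MT:          # average g only after the transient
            G += g(z) / M
        z = z + h * f(z)
    return z + dt * G

z = np.array([2.0, 0.0])          # phi(z) = z1 + z2 = 2 initially
dt, h, M, MT, n = 0.1, 0.1, 20, 50, 20
for _ in range(n):
    z = seamless_step(z, dt, h, M, MT)

X = z.sum()                        # phi(Z_n) approximates X(n*dt)
exact = 2.0 / np.sqrt(3.0)         # solution of X' = -X^3/8, X(0) = 2, at t = 2
print(X, exact)
```

The macro chain reproduces forward Euler on the limiting equation to within the (tiny) relaxation error, even though ϕ never appears in the code.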

Lecture 3: Application to Markov jump processes (aka Kinetic Monte-Carlo – KMC)

Evolution of an isothermal, spatially homogeneous mixture of chemically reacting molecules contained in a fixed volume V. N_S species of molecules S_i, i = 1, …, N_S, are involved in M_R reactions R_j, j = 1, …, M_R. Each reaction R_j is characterized by a rate function a_j(x) and a state-change (or stoichiometric) vector ν_j:

  R_j = (a_j, ν_j),   R = {R_1, …, R_{M_R}}.

Let xi be the number of molecules of species Si. Given the state x = (x1 , . . . , xNS ), the occurrences of the reactions on an infinitesimal time interval dt are independent of each other and the probability for reaction Rj to happen during this time interval is given by aj (x)dt. The state of the system after reaction Rj is x + νj .

Equivalently: given that the state of the system is X_t = x at time t:

1. The probability that the next reaction happens after time t + s is e^{−a(x)s}, where a(x) = Σ_{j=1}^{M_R} a_j(x).
2. Given that a reaction happens at time t + s, the probability that it is reaction j is a_j(x)/a(x).

Gillespie’s Stochastic Simulation Algorithm (SSA)

Randomly choose when the next reaction occurs according to 1. above; then randomly choose which one occurs according to 2. above.

Gillespie’s Stochastic Simulation Algorithm (SSA)

Let

  a(x) = Σ_{j=1}^{M_R} a_j(x).

Assume that the current time is t_n and the system is at state X_n. We perform the following steps:

1. Generate independent random numbers r1 and r2 uniformly distributed on the unit interval (0, 1]. Let

  δt_{n+1} = −(ln r1)/a(X_n),

and let k_{n+1} be the natural number such that

  (1/a(X_n)) Σ_{j=0}^{k_{n+1}−1} a_j(X_n) < r2 ≤ (1/a(X_n)) Σ_{j=0}^{k_{n+1}} a_j(X_n),

where a_0 = 0 by convention.

2. Update the time and the state of the system by

  t_{n+1} = t_n + δt_{n+1},   X_{n+1} = X_n + ν_{k_{n+1}}.

3. Repeat.
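The two steps above translate directly into code. The birth-death network used to exercise the routine (0 → S at rate 10, S → 0 at rate x) is a made-up illustration, not an example from the lecture.

```python
import numpy as np

def ssa(x0, rates, stoich, t_end, rng):
    """Gillespie SSA. rates: function x -> array of propensities a_j(x);
    stoich: array whose rows are the state-change vectors nu_j."""
    t, x = 0.0, np.array(x0, dtype=int)
    while True:
        a = rates(x)
        a_tot = a.sum()
        if a_tot == 0.0:
            return t_end, x            # no reaction can fire anymore
        # Step 1: exponential waiting time, then pick reaction k with
        # probability a_k / a_tot (1 - rng.random() lies in (0, 1]).
        t += -np.log(1.0 - rng.random()) / a_tot
        if t > t_end:
            return t_end, x
        k = min(np.searchsorted(np.cumsum(a), rng.random() * a_tot),
                len(stoich) - 1)
        # Step 2: update the state.
        x = x + stoich[k]

# Birth-death example: stationary distribution is Poisson(10).
stoich = np.array([[+1], [-1]])
rates = lambda x: np.array([10.0, 1.0 * x[0]])
rng = np.random.default_rng(1)
samples = [ssa([0], rates, stoich, t_end=20.0, rng=rng)[1][0] for _ in range(200)]
print(np.mean(samples))   # should be close to 10
```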

Suppose there are fast and slow reactions:

  R_j^s = (a_j^s(x), ν_j^s),   R_j^f = (ε^{-1} a_j^f(x), ν_j^f),

where ε ≪ 1 represents the ratio of time scales of the system.

Then: the time-step between reactions is O(ε), and with probability 1 − O(ε) a fast reaction happens.

Difficult to simulate the evolution up to the O(1) time-scale of the slow reactions!

Simple example:

  S1 ⇌ S2 (rates a1, a2; fast),
  S2 ⇌ S3 (rates a3, a4; slow),
  S3 ⇌ S4 (rates a5, a6; fast),

with

  a1 = 10^5 x1,   a2 = 10^5 x2,   ν1 = (−1, +1, 0, 0),   ν2 = (+1, −1, 0, 0),
  a3 = x2,        a4 = x3,        ν3 = (0, −1, +1, 0),   ν4 = (0, +1, −1, 0),
  a5 = 10^5 x3,   a6 = 10^5 x4,   ν5 = (0, 0, −1, +1),   ν6 = (0, 0, +1, −1),

i.e. the first and third reactions are faster than the second one.

Every species is involved in at least one fast reaction, so there is no slow species. But the variables y1 = x1 + x2 and y2 = x3 + x4 are conserved during the fast reactions and only evolve during the slow reaction.

Fig. 1. Evolution of one of the slow variables, y1 = x1 + x2 (the other, y2 = x3 + x4, behaves similarly), and one of the fast variables, x3, on an intermediate time scale. Note that x3 is also the "instantaneous" rate of reaction for the slow variable y2 = x3 + x4.

2.2 Effective dynamics on the slow time scale

For the kind of systems discussed above, very often we are interested mostly in the effective dynamics over the slow time scale. In this section we derive the model for this effective dynamics. The analysis is built upon the perturbation theory developed in [17,19,14–16].

Averaging thm:

  S1 ⇌ S2 (a1, a2; fast),   S2 ⇌ S3 (a3, a4; slow),   S3 ⇌ S4 (a5, a6; fast).

The slow variables are

  y1 = x1 + x2,   y2 = x3 + x4.

The equilibrium distribution of the virtual fast process is

  µ_{y1,y2}(x1, x2, x3, x4) = (y1! y2!)/(x1! x2! x3! x4!) (1/2)^{y1} (1/2)^{y2} δ_{x1+x2=y1} δ_{x3+x4=y2}.

Effective dynamics:

  ā_3^s = ⟨x2⟩ = (x1 + x2)/2 = y1/2,   ν̄_3^s = (−1, +1),
  ā_4^s = ⟨x3⟩ = (x3 + x4)/2 = y2/2,   ν̄_4^s = (+1, −1).
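As a quick check of the averaged rates, one can evaluate ⟨x2⟩ under the binomial marginal of µ_{y1,y2} explicitly: x2 ~ Binomial(y1, 1/2), so ⟨x2⟩ = y1/2 (and likewise ⟨x3⟩ = y2/2). This is a small verification script, not part of the lecture.

```python
from math import comb

# Mean of x2 under the Binomial(y1, 1/2) marginal of mu_{y1,y2}.
def mean_x2(y1):
    # sum over x2 of x2 * C(y1, x2) * (1/2)^{y1}
    return sum(x2 * comb(y1, x2) * 0.5 ** y1 for x2 in range(y1 + 1))

print(mean_x2(10))   # -> 5.0, i.e. y1/2
```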

Key observation: v(x) = b · x is a slow variable if b · ν_j^f = 0 for all ν_j^f. The set of such vectors forms a linear subspace of R^{N_S}. Let b1, b2, …, bJ be a set of basis vectors for this subspace, and define

  y_j = b_j · x   for j = 1, …, J;

then y1, y2, …, yJ defines a complete set of slow variables, i.e. all slow observables can be expressed as functions of y1, y2, …, yJ.

In other words, in the present case, the slow variables are linear functions of the original variables! This allows for a seamless formulation of the limiting theorem (see next).

Notice that the slow variables are not slow species (which do not exist in general)!
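This observation can be turned into a small computation: the slow variables span the common null space of the fast state-change vectors. A sketch for the simple example above (the SVD-based recipe is an illustrative choice):

```python
import numpy as np

# Fast state-change vectors nu1, nu2, nu5, nu6 of the S1<->S2 (fast),
# S2<->S3 (slow), S3<->S4 (fast) example. Their common null space is
# spanned by (1,1,0,0) and (0,0,1,1), i.e. y1 = x1 + x2 and y2 = x3 + x4
# (up to normalization).
nu_fast = np.array([
    [-1, +1,  0,  0],   # nu1
    [+1, -1,  0,  0],   # nu2
    [ 0,  0, -1, +1],   # nu5
    [ 0,  0, +1, -1],   # nu6
], dtype=float)

_, s, vt = np.linalg.svd(nu_fast)
s_full = np.zeros(vt.shape[0])
s_full[:len(s)] = s
null_basis = vt[s_full < 1e-10]      # rows of V^T with zero singular value

print(null_basis.shape[0])           # -> 2 slow variables
# Every basis vector b satisfies b . nu_j^f = 0 for all fast reactions:
print(np.abs(nu_fast @ null_basis.T).max())
```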

Seamless limit theorem: Consider a Markov jump process with generator L = L0 + ε^{-1} L1, where

  (L0 f)(x) = Σ_{j=1}^{N_s} a_j^s(x) (f(x + ν_j^s) − f(x)),
  (L1 f)(x) = Σ_{j=1}^{N_f} a_j^f(x) (f(x + ν_j^f) − f(x)).

Assume that the fast process is ergodic with respect to the foliated measure µ_y(x') indexed by y = b · x, i.e.

  (1/T) ∫_0^T f(X_t^f) dt → Σ_{x'} f(x') µ_{b·x}(x'),

where X_t^f is a sample path with X_{t=0}^f = x of the process with generator L1.

Then, for any T > 0, there exists a constant C > 0 such that

  sup_{0≤t≤T} |E f(X_t) − E Σ_{x'} µ_{b·X̄_t}(x') f(x')| ≤ Cε,

where X̄_t is a sample path of the process with generator

  (L̄ f)(x) = Σ_{j=1}^{N_s} ā_j(b · x) (f(x + ν_j^s) − f(x)),   ā_j(y) = Σ_{x'} µ_y(x') a_j^s(x').

Nested Stochastic Simulation Algorithm (nSSA)

1. Inner SSA: Run N independent replicas of SSA with the fast reactions R^f = {(ε^{-1} a^f, ν^f)} only, for a time interval of T0 + Tf. During this calculation, compute, for j = 1, …, M_s, the modified slow rates

  ã_j^s = (1/(N Tf)) Σ_{k=1}^{N} ∫_{T0}^{Tf+T0} a_j^s(X_τ^k) dτ,

where X_τ^k is the k-th replica of this auxiliary virtual fast process at virtual time τ, whose initial value is X_{τ=0}^k = X_n, and T0 is a parameter chosen to minimize the effect of the transient toward equilibrium in the virtual fast process.

2. Outer SSA: Run one step of SSA for the modified slow reactions R̃^s = (ã^s, ν^s) to generate (t_{n+1}, X_{n+1}) from (t_n, X_n).

3. Repeat.

Totally seamless!
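A sketch of the Inner SSA on the simple S1 ⇌ S2 / S2 ⇌ S3 / S3 ⇌ S4 example above; the parameter values (ε, T0, Tf, N = 1) are illustrative choices. The time-averaged slow propensities should approach the analytic values ā3 = y1/2 and ā4 = y2/2.

```python
import numpy as np

rng = np.random.default_rng(2)
eps = 1e-5   # the 10^5 prefactor of the fast reactions is written as 1/eps
nu_fast = np.array([[-1, +1, 0, 0], [+1, -1, 0, 0],
                    [0, 0, -1, +1], [0, 0, +1, -1]])

def a_fast(x):                     # eps^-1 * a^f for the four fast reactions
    return np.array([x[0], x[1], x[2], x[3]]) / eps

def a_slow(x):                     # slow propensities a3 = x2, a4 = x3
    return np.array([x[1], x[2]], dtype=float)

def inner_ssa(x, T0, Tf, rng):
    """Run the fast reactions only for virtual time T0 + Tf; return the
    time-average of the slow propensities over [T0, T0 + Tf]."""
    x = x.copy()
    t, avg = 0.0, np.zeros(2)
    while t < T0 + Tf:
        a = a_fast(x)
        dt = -np.log(1.0 - rng.random()) / a.sum()
        # slow propensities are constant between jumps: accumulate the
        # overlap of [t, t + dt] with the averaging window [T0, T0 + Tf]
        lo, hi = max(t, T0), min(t + dt, T0 + Tf)
        if hi > lo:
            avg += (hi - lo) * a_slow(x)
        t += dt
        k = min(np.searchsorted(np.cumsum(a), rng.random() * a.sum()),
                len(nu_fast) - 1)
        x += nu_fast[k]
    return avg / Tf

x0 = np.array([6, 4, 3, 7])        # y1 = 10, y2 = 10
atilde = inner_ssa(x0, T0=20 * eps, Tf=2000 * eps, rng=rng)
print(atilde)                       # should be close to (y1/2, y2/2) = (5, 5)
```

An Outer SSA step would then fire one of the two slow reactions with the propensities ã_j^s, exactly as in the plain SSA.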

Error estimate: For any T > 0, there exist constants C and α independent of (N, T0, Tf) such that

  sup_{0≤t≤T} E|v(x, t) − u(x, t)| ≤ C ( ε + e^{−αT0/ε}/(1 + Tf/ε) + 1/√(N(1 + Tf/ε)) ).

Here v(x, t) = E f(X_t^ε), where X_t^ε is an exact path, and u(x, t) = E f(X_t), where X_t is a path from the nested SSA.

Efficiency: Given an error tolerance λ:

  cost = O(1/ε)                                    (direct SSA)
  cost = O(N(1 + T0/ε + Tf/ε)) = O(1/λ^2)          (nested SSA)

Example: heat shock response of Escherichia coli

Reaction                                              Rate magnitude
DNA.σ32 → mRNA.σ32                                    1.4 × 10^{-3}
mRNA.σ32 → σ32 + mRNA.σ32                             1.19
mRNA.σ32 → degradation                                2.38 × 10^{-5}
σ32 → RNAP.σ32                                        10.5
RNAP.σ32 → σ32                                        9.88
σ32 + DnaJ → σ32.DnaJ (??)                            25.2
DnaJ → degradation (??)                               2.97 × 10^{-6}
σ32.DnaJ → σ32 + DnaJ                                 1.30
DNA.DnaJ + RNAP.σ32 → DnaJ + DNA.DnaJ + σ32           3.71
DNA.FtsH + RNAP.σ32 → FtsH + DNA.FtsH + σ32           0
FtsH → degradation                                    1.48 × 10^{-8}
GroEL → degradation                                   7.76 × 10^{-5}
σ32.DnaJ + FtsH → DnaJ + FtsH                         8.4
DNA.GroEL + RNAP.σ32 → GroEL + DNA.GroEL + σ32        4.78
Protein → UnfoldedProtein (?)                         10^6
DnaJ + UnfoldedProtein → DnaJ.UnfoldedProtein (?)     10^7
DnaJ.UnfoldedProtein → DnaJ + UnfoldedProtein (?)     10^6

Reaction list for the heat shock model of E. Coli proposed in: R. Srivastava, M. S. Peterson and W. E. Bently, Biotech. Bioeng. 75, 120–129 (2001). The rate magnitude is the value of a_i(x) evaluated at initial time or equilibrium. The last three reactions, marked with a (?) in the table, are fast: they are used in the Inner SSA. All the other reactions are used in the Outer SSA, and the rates of the reactions marked with a (??) are averaged.

Stochastic Petri net (SPN) diagram of the σ32 genetic regulatory circuit for the heat shock model of E. Coli [from R. Srivastava, M. S. Peterson and W. E. Bently, Biotech. Bioeng. 75, 120–129 (2001)].

(N, Tf/10^{-6})    (1,1)   (1,4)   (1,16)   (1,64)   (1,256)   (1,1024)
CPU                0.62    1.32    2.98     9.56     35.81     142.08
σ32                4.60    8.66    13.60    14.52    14.98     15.00
var(σ32)           4.41    8.11    12.22    13.13    13.73     14.66

Efficiency of nested SSA when N = 1. Since we used N0 = 1000 realizations of the Outer SSA to compute σ32 and var(σ32), the statistical errors on these quantities are about 0.2. For comparison, the actual values of these quantities are

  σ32 = 14.8 ± 0.2,   var(σ32) = 14.2 ± 0.2,

and the direct SSA took 19719.2 seconds of CPU time to compute them.

Example: virus infection model

Fig. 2.1. Reaction network of the virus infection model, with genome + struct → virus at rate a6 = 7.5 × 10^{-6} × genome × struct. The catalytic reactions are represented by ⇒ lines. The nucleotides and amino acids are assumed to be available at constant concentrations and are therefore omitted.

General virus infection model proposed in: R. Srivastava, L. You, J. Summers and J. Yin, J. Theor. Biol. 218, 309–321 (2002).

The nucleotides and amino acids are assumed to be available at constant concentrations. The reacting species that need to be modeled are genome, struct, template and virus (Ns = 4). The quantity genome represents the vehicle of the viral genetic information which can take the form of DNA, positive-strand RNA, negative-strand RNA, or some other variants. The structural proteins making up the virus are denoted by struct. Template refers to the form of the nucleic acid that is transcribed and involved in catalytically synthesizing each viral component.

Reaction list in the virus model:

  nucleotides --(template)--> genome
  nucleotides + genome --> template
  nucleotides + amino acids --(template)--> struct   (fast)
  template --> degraded
  struct --> degraded/secreted   (fast)
  genome + struct --> virus

Here:
  template: nucleic acids catalyzing the synthesis of the viral components;
  genome: DNA (RNA) transporting the viral genetic information;
  struct: structural protein.

Reactions                                 Rates
nucleotides --(template)--> genome        a1 = 1 × template
nucleotides + genome --> template         a2 = 0.025 × genome
template --> degraded                     a4 = 0.25 × template
genome + struct --> virus                 a6 = 3.75 × 10^{-3} × genome² × struct

Tf/ε               1        4        16       64
CPU                154.8    461.3    2068.2   9190.9
template           4.027    3.947    3.796    3.757
var(template)      5.401    5.254    5.007    4.882

Efficiency of the nested SSA for the virus infection model: the direct SSA cost was 34806.39 seconds of CPU time.

The exact values are: template = 3.7170 ± 0.005, var(template) = 4.9777 ± 0.005.

Left: growth of the virus; right: evolution of template. In each panel the direct simulation is compared with the HMM-based multiscale scheme.

Asymptotic techniques for singularly perturbed Markov processes:
R. Z. Khasminsky, On Stochastic Processes Defined by Differential Equations with a Small Parameter, Theory Prob. Applications, 11:211–228, 1966.
R. Z. Khasminsky, A Limit Theorem for the Solutions of Differential Equations with Random Right-Hand Sides, Theory Prob. Applications, 11:390–406, 1966.
T. G. Kurtz, Semigroups of conditioned shifts and approximations of Markov processes, Annals of Probability, 3:618–642, 1975.
G. Papanicolaou, Some probabilistic problems and methods in singular perturbations, Rocky Mountain J. Math, 6:653–673, 1976.
M. I. Freidlin and A. D. Wentzell, Random Perturbations of Dynamical Systems, 2nd edition, Springer-Verlag, 1998.
G. Pavliotis and A. Stuart, Multiscale Methods: Averaging and Homogenization (to appear).

Explicit integrators for stiff ODEs:
C. W. Gear and I. G. Kevrekidis, "Projective methods for stiff differential equations: problems with gaps in their eigenvalue spectrum," SIAM J. Sci. Comp. 24(4):1091–1106 (2003).
K. Eriksson, C. Johnson, and A. Logg, "Explicit Time-stepping for stiff ODEs," SIAM J. Sci. Comp. 25(4):1142–1157 (2003).
V. I. Lebedev and S. I. Finogenov, "Explicit methods of second order for the solution of stiff systems of ordinary differential equations," Zh. Vychisl. Mat. Mat. Fiziki, 16:895–910 (1976).
A. Abdulle, "Fourth order Chebyshev methods with recurrence relations," SIAM J. Sci. Comput. 23:2041–2054 (2002).

HMM-like multiscale integrators for stochastic systems:
E. Vanden-Eijnden, "Numerical techniques for multiscale dynamical systems with stochastic effects," Comm. Math. Sci., 1:385–391 (2003).
W. E, D. Liu and E. Vanden-Eijnden, "Analysis of Multiscale Methods for Stochastic Differential Equations," Comm. Pure App. Math. 58:1544–1585 (2005).
E. Vanden-Eijnden, "On HMM-like integrators and projective integration methods for systems with multiple time scales," Comm. Math. Sci. (2007).

HMM and Equation-Free:
W. E and B. Engquist, "The heterogeneous multi-scale methods," Comm. Math. Sci. 1:87–133 (2003).
I. G. Kevrekidis, C. W. Gear, J. M. Hyman, P. G. Kevrekidis, O. Runborg, and C. Theodoropoulos, "Equation-Free, Coarse-Grained Multiscale Computation: Enabling Microscopic Simulators to Perform System-Level Analysis," Comm. Math. Sci. 1(4):715–762 (2003).

Effective dynamics in L96:
E. N. Lorenz, "Predictability – A problem partly solved," pp. 1–18 in: ECMWF Seminar Proceedings on Predictability, Reading, United Kingdom, ECMWF, 1995.
C. Rödenbeck, C. Beck, and H. Kantz, "Dynamical systems with time scale separation: averaging, stochastic modeling, and central limit theorems," pp. 187–209 in: Stochastic Climate Models, P. Imkeller and J.-S. von Storch eds., Progress in Probability 49, Birkhäuser Verlag, Basel, 2001.
I. Fatkullin and E. Vanden-Eijnden, "A computational strategy for multiscale systems with applications to Lorenz 96 model," J. Comp. Phys. 200:605–638, 2004.

(Coupled) TBH:
A. J. Majda and I. Timofeyev, "Remarkable statistical behavior for truncated Burgers–Hopf dynamics," Proc. Nat. Acad. Sci. USA, 97:12413–12417, 2000.
A. J. Majda and I. Timofeyev, "Statistical mechanics for truncations of the Burgers–Hopf equation: a model for intrinsic stochastic behavior with scaling," Milan Journal of Mathematics, 70(1):39–96, 2002.
A. J. Majda, I. Timofeyev, and E. Vanden-Eijnden, "Systematic strategies for stochastic mode reduction in climate," J. Atmos. Sci., 60(14):1705–1722, 2003.
A. J. Majda, I. Timofeyev, and E. Vanden-Eijnden, "Stochastic models for selected slow variables in large deterministic systems," Nonlinearity, 19:769–794, 2006.

Stochastic Simulation Algorithm:
D. T. Gillespie, J. Comp. Phys. 22, 403 (1976).
A. B. Bortz, M. H. Kalos and J. L. Lebowitz, J. Comp. Phys. 17, 10 (1975).

Seamless scheme for Kinetic Monte-Carlo:
W. E, D. Liu and E. Vanden-Eijnden, "Nested stochastic simulation algorithm for chemical kinetic systems with disparate rates," J. Chem. Phys. 123:194107 (2005).
W. E, D. Liu and E. Vanden-Eijnden, "Nested stochastic simulation algorithm for chemical kinetic systems with multiple time-scales," J. Comp. Phys. 221:158–180 (2007).

Related works:
Y. Cao, D. Gillespie and L. Petzold, "Multiscale Stochastic Simulation Algorithm with Stochastic Partial Equilibrium Assumption for Chemically Reacting Systems," J. Comp. Phys. 206(2):395–411 (2005).
A. Samant and D. G. Vlachos, "Overcoming stiffness in stochastic simulation stemming from partial equilibrium: A multiscale Monte Carlo algorithm," J. Chem. Phys. 123:144114 (2005).
