Adaptive enhanced sampling with a path-variable for ...

Adaptive enhanced sampling with a path-variable for the simulation of protein folding and aggregation - Supplementary information

Emanuel K. Peter a)
Institute of Physical and Theoretical Chemistry, Department of Pharmacy and Chemistry, University of Regensburg, Germany
(Dated: 6 December 2017)

a) Electronic mail: [email protected]

I. METHODS

A. Kinetics

We assume that time-dependent processes can be described in terms of a quantity q(X). The transition of q(X)(t0) to q(X)(t) at time t occurs over a transition state q(X)†:

$$ q(X)(t_0) \rightleftharpoons q(X)^{\dagger} \rightarrow q(X)(t) . \tag{1} $$

We define the equilibrium constant $K^{\dagger}$ for the first partial reaction. We thereby assume that the reaction from q(X)†(t) to q(X)(t) is quasi-irreversible, while the first transition is considered as an equilibrium between q(X)(t0) and q(X)†. That is the case if the observed time-frame for one transition along the trajectory is finite and on the order of ps to ns:

$$ K^{\dagger} = \frac{q(X)^{\dagger}}{q(X)(t_0)} . \tag{2} $$

The reaction follows the differential equation:

$$ \frac{dq(X)(t)}{dt} = k\, q(X)^{\dagger} = k K^{\dagger} q(X)(t_0) . \tag{3} $$

The definition $\Delta G^{\dagger} = -k_B T \ln K^{\dagger}$ leads to the expression for the rate constant:

$$ k_l = \nu \exp\left(-\Delta G^{\dagger}/(k_B T)\right) . \tag{4} $$

The equilibria between q(X)(t0) and q(X)(t) are shifted through the application of enhanced sampling, depending on an acceleration factor $\alpha_b$ and the equilibrium constant $K_b^{\dagger}$. That is expressed by:

$$ \frac{dq(X)(t)}{dt} = \alpha_b k K_b^{\dagger} q(X)(t_0) . \tag{5} $$

If $\Delta G^{\dagger}$ approximately equals the biased transition state $\Delta G_b^{\dagger}$, the application of the bias does not affect the transition state. That is valid in our case, since the transition path $L^{\dagger}$, for which

$$ \frac{dL^{\dagger}}{ds} = 0 , \tag{6} $$

is by definition not affected by the addition of the bias, and the rate constant α(t) of the biased simulation equals:

$$ k_b \approx \alpha_b^{-1} \nu \exp\left(-\Delta G^{\dagger}/(k_B T)\right) , \tag{7} $$

and we can finally state that the scaling factor $\alpha_b^{-1}$ in the biased simulation follows a linear relation:

$$ \alpha_b^{-1} \approx \frac{k_b}{k_l} . \tag{8} $$

B. Program, system preparation and parameters

1. System preparation

Water simulations

For simulations of water, we filled a box with dimensions 3×3×3 nm³ with 895 SPC/E waters (gmx editconf, gmx genbox). In adaptive MD simulations, we varied α00 from 0 to 3E-3 and τ from 1 to 10 ps. For Path-sampling and hybrid algorithm simulations, we used a smaller cubic box with a box-length of 2 nm and filled it with 639 SPC/E waters. We varied W from 0 to 0.01 kJ/mol and τ from 1 to 10 ps. For the hybrid simulation, we used W = 0.01 kJ/mol and τ2 = 1 ps for Path-sampling and τ1 = 10 ps for the adaptive MD part, with an α00-value of 3E-3. Each system has been simulated over 10 ns.

Dialanine simulations

For simulations of Dialanine, we centered Ace-Ala-Nme in a cubic box with a box-length of 2.26 nm and added 371 SPC/E waters. In adaptive MD simulations, we varied α00 from 0 to 5E-4 (β = 0) (0 to 4E-4 with β = 1, and one set with only every 5th atom considered by the algorithm with β = 1) with τ = 10 ps. For Path-sampling, we used W = 0.5 kJ/mol and τ = 1 ps. For the hybrid case, we used W = 0.5 kJ/mol, τ2 = 1 ps for the Path-sampling component, and α00 = 1E-5 with τ1 = 10 ps for the general adaptive bias MD part (β = 1). We propagated each system over 100 ns.

TrpCage simulations

For simulations of TrpCage [1], we centered the peptide (NLYIQWLKDGGPSSGRPPPS) in a cubic box with a box-length of 7.022 nm, filled the box with 11424 SPC/E waters and added 1 chloride ion. In 3 adaptive MD simulations, we varied α00 from 0 to 5E-6 with τ = 10 ps (β = 1). In 2 Path-sampling simulations, we used W = 2.0 kJ/mol (τ = 10 ps) and W = 0.1 kJ/mol (τ = 1 ps). In one hybrid simulation, we used W = 0.1 kJ/mol (τ2 = 1 ps) for Path-sampling and α00 = 5E-6 (τ1 = 10 ps) for the adaptive MD part (β = 1). We propagated each system over 100 ns. (For the determination of the RMSD, we used the NMR structure 1L2Y, model #1, as reference.)

Aβ 25-35 hexamer simulations

For simulations of the Aβ 25-35 hexamer, we distributed (gmx genbox) 6 extended Aβ 25-35 monomers (GSNKGAIIGLM) in a cubic box with 6 nm box-length, filled the box with 6811 SPC/E waters and added 6 chloride ions. In 2 adaptive MD simulations, we varied α00 from 0 to 5E-6 (τ = 10 ps). In 2 Path-sampling simulations, we used W’ = 0.25 kJ/mol (τ = 10 ps, τ = 1 ps). In 1 hybrid simulation, we used W’ = 0.25 kJ/mol (τ1 = 10 ps), α00 = 5E-6 (τ2 = 1 ps) (β = 1). We propagated each system over 100 ns.

Fig. 1S. Results from simulations of SPC/E water using the novel adaptive bias implementations. Each atom has been coupled to the new algorithms in dependency of the coupling parameters α00, β and τ in general adaptive bias MD. In Path-sampling simulations, water has been sampled using different heights of the Gaussians W and τ. (α00 = 0 corresponds to the un-biased simulation result.) We used one single simulation in order to test the hybrid algorithm, which connects general adaptive bias MD with Path-sampling. (a) Diffusion coefficients as function of α00. Path-sampling results are shown at a corresponding value of α00 = 0. (b) Center-to-center radial distribution functions g(r) of OW-OW (water-oxygens), OW-HW (water-oxygen, water-hydrogen) and HW-HW (water-hydrogen, water-hydrogen) averaged over Path-sampling and hybrid trajectories (the unbiased result corresponds to W = 0 with no applied bias with the same algorithm). (c) Adaptive bias MD averaged radial distribution functions g(r) for different coupling parameters α00 and τ. The unbiased result corresponds to α00 = 0.

C. Restriction of box volume

In simulations of hexameric Aβ 25-35 in explicit solvent, we extracted the peptide configurations from the trajectory after 203762 ps (1) (Path-sampling, W’ = 0.25 kJ/mol, τ = 10 ps), 195928 ps (2) (Path-sampling, W’ = 0.25 kJ/mol, τ = 1 ps) and 190365 ps (3) (Hybrid, W’ = 0.25 kJ/mol, τ1 = 10 ps, τ2 = 1 ps), after the collapse of the oligomer had occurred. We then restricted the box around the oligomer with a minimum distance of 0.5 nm between the peptide assembly and the boundaries of the box, and refilled each box with SPC/E water (6520 SPC/E waters, 4 sodium ions - 2407 (1), 2755 (2), 4966 (3) SPC/E waters, 6 chloride ions, Aβ 25-35). We then restarted the simulations. That procedure allows a more efficient sampling of a higher density of states, since each CV is defined proportional to the box-diameter as the maximal length. At the same time, MD sampling is accelerated due to a reduction of the number of particles.

D. Computational efficiency

Adaptive bias MD, Path-sampling and the hybrid algorithm have been implemented as separate programs in the gmx mdrun module of the GROMACS-4.5.5 package [2] (/src/kernel/md.c), with a change in the domain-decomposition routines (/src/mdlib/domdec.c, ./domdec.h). The calculation time in terms of propagated time-steps is slowed down by a factor of approximately 1.4-1.5 compared to the un-modified MD-module. In trajectory space (we compared water, dialanine, TrpCage and Aβ 25-35), the efficiency factor $\alpha_b^{-1}$ is system dependent and can range from 1 to above 1000 (see the sections Kinetics, Dynamics of SPC/E water and force-correlations, Dynamics of Dialanine and the Results section). The computational cost of the method is low compared to REMD simulations. We used 4 cores times 36 hours (Intel Xeon Phi) for each Dialanine and water simulation. We used 16-32 cores times 1.5-2.5 weeks for each of the adaptive bias and Path-sampling simulations. A comparable REMD simulation to study the folding of a peptide system uses approximately 25-30 replicas with 200-240 cores and is currently run over 3 weeks to obtain 300-350 ns of MD sampling [3].
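The resource figures quoted above can be put on a common scale of core-hours. The following minimal sketch only multiplies the core counts and run times stated in the text. The helper name `core_hours` and the conversion of weeks into hours are illustrative assumptions, and the comparison is not normalized by the amount of simulation time sampled in each run.

```python
HOURS_PER_WEEK = 7 * 24  # wall-clock hours in one week


def core_hours(cores, hours):
    """Hypothetical helper: total core-hours consumed by one run."""
    return cores * hours


# Dialanine and water runs: 4 cores for 36 hours each.
small_run = core_hours(4, 36)

# Adaptive bias and Path-sampling runs: 16-32 cores for 1.5-2.5 weeks.
biased_low = core_hours(16, 1.5 * HOURS_PER_WEEK)
biased_high = core_hours(32, 2.5 * HOURS_PER_WEEK)

# Comparable REMD run: 200-240 cores for 3 weeks.
remd_low = core_hours(200, 3 * HOURS_PER_WEEK)
remd_high = core_hours(240, 3 * HOURS_PER_WEEK)

print(small_run, biased_low, biased_high, remd_low, remd_high)
```

With these inputs, the REMD estimate lies between 7.5 and 30 times the core-hours of a single adaptive bias or Path-sampling run.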

II. DYNAMICS AND STRUCTURE OF SPC/E WATER

We applied the general adaptive bias MD, the Path-sampling and the hybrid algorithm to SPC/E water in the bulk. We intended to investigate the properties of the 3 novel implementations. For the general adaptive bias MD implementation, we varied the coupling time τ and α00 and analyzed the dynamical properties as well as the structural properties of SPC/E water (see Figure 1S and 2S). In the analysis of the self-diffusion coefficients D of water, we observe that D depends on the coupling time τ. We find for a low coupling time τ of 1 ps that D only slightly varies from 4.0E-5 cm²/s at α00 = 0 (corresponding to the unbiased MD value) to 9.9E-5 cm²/s at α00 = 3E-3, with a resulting scaling factor $\alpha_b^{-1}$ of 2.5 (β = 0) (see Figure 1S a). We find an increase of the self-diffusion with larger τ-values of 5 and 10 ps as function of α00. With a value for τ of 10 ps, we obtain a scaling of the self-diffusion of water to a value of 39.6E-5 cm²/s, equivalent to a value of $\alpha_b^{-1}$ = 9.9 (β = 0). When we apply β = 1, we observe a scaling of D to a value of 41.4E-5 cm²/s, corresponding to a value of $\alpha_b^{-1}$ of 10.35.

In order to investigate the structural properties of the simulated systems, we analyzed the radial distribution functions (RDF) of water. We find that for any α00 value and any value for τ, the RDFs between water-oxygen water-oxygen, water-hydrogen water-oxygen and water-hydrogen water-hydrogen only change within values in the range of 0.1 in between the main, the second and the third coordination maxima (see Figure 1S c and Figure 2S a, b). The maxima themselves are approximately identical with the unbiased result at α00 = 0. As an alternative dynamical measure, we investigated the hydrogen-bond lifetime as function of α00. Here, we find a decay of the lifetime from approximately 2.2 ps to values in the range from 1.47 to 1.75 ps at α00 = 3E-3 (see Figure 2S c).

We also applied the Path-sampling implementation to the same system. We obtain self-diffusion coefficients D nearly identical with the unbiased result in the range from 5 to 5.2E-5 cm²/s, while we do not observe any variations in the RDFs for different coupling values (see Figure 1S a, b). For the hybrid result, we find that the self-diffusion D is at 5.7E-5 cm²/s at a general adaptive bias MD α00 value of 3E-3. We observe a complete overlap of all RDFs of the Path-sampling and the hybrid result with the unbiased RDFs.

III. PSEUDO-CODES

A. Pseudo-code: Adaptive bias MD

• Loop over MD-timesteps
  – Determine p1 and p2 at timesteps t1 and t2 = t1 + dt.
  – Calculate dp = p2 − p1.
  – Determine q1 and q2 at timesteps t1 and t2 = t1 + dt.
  – Calculate dq = q2 − q1.
  – Calculate dL = (p + dp)dq.
  – If time-interval equals any multiple value of τ, calculate derivative of dL over time-interval τ.
  – Calculate gradient dL/dτ.
  – Add bias dL/dτ over period τ, while generating random-numbers ξ and re-evaluating α00.
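The steps above can be walked through on a toy one-dimensional trajectory. The following Python sketch is an illustration under stated assumptions, not the GROMACS implementation: the function name `adaptive_bias_sketch`, the reading of "p" as p1, the finite-difference form of dL/dτ, the coupling `alpha * xi * dL_dtau`, and keeping α00 fixed instead of re-evaluating it are all hypothetical choices made only to obtain runnable code.

```python
import random


def adaptive_bias_sketch(p_traj, q_traj, tau_steps, alpha, dt=1.0, seed=0):
    """Toy 1D walk through the adaptive bias loop (illustrative only).

    Returns (derivatives, biases): dL/dtau at every update, and the bias
    value that is active at each MD step.
    """
    rng = random.Random(seed)   # source of the random numbers xi
    dL_prev = 0.0               # dL at the previous update (assumed 0 at start)
    bias = 0.0                  # bias is held constant between updates
    derivatives, biases = [], []
    for i in range(len(p_traj) - 1):
        p1, p2 = p_traj[i], p_traj[i + 1]
        q1, q2 = q_traj[i], q_traj[i + 1]
        dp = p2 - p1
        dq = q2 - q1
        dL = (p1 + dp) * dq     # assumes "p" in the pseudo-code means p1
        if (i + 1) % tau_steps == 0:
            # Finite-difference derivative of dL over one interval tau.
            dL_dtau = (dL - dL_prev) / (tau_steps * dt)
            xi = rng.random()   # uniform random number in [0, 1)
            bias = alpha * xi * dL_dtau  # assumed coupling, not from the source
            dL_prev = dL
            derivatives.append(dL_dtau)
        biases.append(bias)
    return derivatives, biases


# Constant momentum, accelerating coordinate; update every 2 steps.
p = [1.0, 1.0, 1.0, 1.0, 1.0]
q = [0.0, 1.0, 3.0, 6.0, 10.0]
derivs, biases = adaptive_bias_sketch(p, q, tau_steps=2, alpha=0.5)
```

Setting `alpha=0` in this sketch switches the bias off at every step, mirroring the statement that α00 = 0 corresponds to the unbiased simulation.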

B. Pseudo-code: A general adaptive bias - path-sampling

• Loop over MD-timesteps
  – Determine p1 and p2 at timesteps t1 and t2 = t1 + dt.
  – Calculate dp = p2 − p1.
  – Determine q1 and q2 at timesteps t1 and t2 = t1 + dt.
  – Calculate dq = q2 − q1.
  – Calculate L(t) = L(t − dt) + (p + dp)dq.
  – If time-interval equals any multiple value of τ, calculate Φ(t) = Φ(t − τ) using equations 21, 31 and 32 shown in the main body of the text.
  – If time-interval equals any multiple value of τ, σ(t0b) = σ(tb).
  – Calculate gradient F using F = dΦ(t)/d(L/|L|).
  – Add gradient F.
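The loop structure above can be illustrated with a simplified stand-in. Equations 21, 31 and 32 of the main text are not reproduced here, so the sketch below replaces Φ by a generic sum of Gaussian hills of height W deposited along the accumulated scalar L every τ steps. The function names, the fixed width `sigma`, and the use of L itself (instead of L/|L|) as the variable are assumptions made for illustration only.

```python
import math


def hill_gradient(s, centers, W, sigma):
    """Derivative with respect to s of a sum of Gaussian hills (stand-in for Phi)."""
    return sum(
        -W * (s - c) / sigma**2 * math.exp(-((s - c) ** 2) / (2 * sigma**2))
        for c in centers
    )


def path_bias_sketch(p_traj, q_traj, tau_steps, W, sigma):
    """Toy 1D loop: accumulate L, deposit a hill every tau_steps, return gradients."""
    L = 0.0
    centers = []   # positions along L where hills were deposited
    grads = []     # gradient F evaluated at every MD step
    for i in range(len(p_traj) - 1):
        dp = p_traj[i + 1] - p_traj[i]
        dq = q_traj[i + 1] - q_traj[i]
        L += (p_traj[i] + dp) * dq   # L(t) = L(t - dt) + (p + dp) dq
        if (i + 1) % tau_steps == 0:
            centers.append(L)        # update the history-dependent bias
        grads.append(hill_gradient(L, centers, W, sigma))
    return L, centers, grads


p = [1.0, 1.0, 1.0, 1.0, 1.0]
q = [0.0, 1.0, 3.0, 6.0, 10.0]
L_final, centers, grads = path_bias_sketch(p, q, tau_steps=2, W=0.5, sigma=2.0)
```

In this stand-in, W = 0 gives a vanishing gradient at every step, which is consistent with the text's use of W = 0 as the unbiased reference.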

C. Pseudo-code: Hybrid algorithm

• Loop over MD-timesteps
  – Determine p1 and p2 at timesteps t1 and t2 = t1 + dt.
  – Calculate dp = p2 − p1.
  – Determine q1 and q2 at timesteps t1 and t2 = t1 + dt.
  – Calculate dq = q2 − q1.
  – Calculate dL = (p + dp)dq.
  – If time-interval equals any multiple value of τ, calculate derivative of dL over time-interval τ.
  – Calculate gradient dL/dτ.
  – Add bias dL/dτ over period τ, while generating random-numbers ξ and re-evaluating α00.
  – Calculate L(t) = L(t − dt) + (p + dp)dq.
  – If time-interval equals any multiple value of τ, calculate Φ(t) = Φ(t − τ) using equations 21, 31 and 32 shown in the main body of the text.
  – If time-interval equals any multiple value of τ, σ(t0b) = σ(tb).
  – Calculate gradient F using F = dΦ(t)/d(L/|L|).
  – Add gradient F.
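Because the hybrid simulations described in the system preparation use two coupling times (τ1 for the adaptive MD part and τ2 for the Path-sampling part), the two updates fire at different steps of the same loop. The sketch below only illustrates that scheduling; the function name `hybrid_schedule` and the expression of both intervals in integer MD steps are assumptions, and no bias is computed.

```python
def hybrid_schedule(n_steps, tau1_steps, tau2_steps):
    """List (step, adaptive_update, path_update) for steps where any update fires."""
    events = []
    for step in range(1, n_steps + 1):
        adaptive_update = step % tau1_steps == 0   # adaptive bias MD part (tau1)
        path_update = step % tau2_steps == 0       # Path-sampling part (tau2)
        if adaptive_update or path_update:
            events.append((step, adaptive_update, path_update))
    return events


# A 10:1 ratio, as for tau1 = 10 ps and tau2 = 1 ps, in arbitrary step units.
events = hybrid_schedule(n_steps=20, tau1_steps=10, tau2_steps=1)
adaptive_steps = [step for step, adaptive, _ in events if adaptive]
```

With a 10:1 ratio, the Path-sampling update fires at every interval, while the adaptive update coincides with it only at every tenth one.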


Fig. 2S. Results from simulations of SPC/E water using the novel general adaptive bias MD method with different coupling strengths. (a) Radial distribution function of water-hydrogen water-hydrogen HW-HW. (b) Radial distribution function of water-oxygen water-hydrogen OW-HW. (c) Hydrogen-bond lifetimes in SPC/E water with different coupling strengths.


Fig. 3S. Free energy landscapes (FEL) of Dialanine in water simulated with general adaptive bias MD at different coupling strengths as function of dihedral angles Φ and Ψ. (a-g) FEL from simulations using β = 0 with different coupling strengths. (h-m) FEL from simulations using β = 1 with different coupling strengths.


Fig. 4S. Results from simulations of TrpCage with the general adaptive bias MD and the Path-sampling algorithm using different parameters in independent simulations with 100 ns length. (a) $\mathrm{RMSD}_{C\alpha-C\alpha}$ to the native structure (PDB: 1L2Y [1]) as function of simulation time from general adaptive bias MD simulations and Path-sampling. (b-d) Free energy landscapes from general adaptive bias MD simulations with different coupling parameters and τ-values. (e,f) Results from Path-sampling trajectories. The coupling parameters are indicated below each graph.


Fig. 5S. Results from simulations of hexameric Aβ 25-35 with the general adaptive bias MD and the Path-sampling algorithm using different parameters in independent simulations with a total simulation time ranging from 200 to 300 ns. (a-d) Radii of gyration as function of simulation time, for (a,b) general adaptive bias MD, (c,d) Path-sampling simulations. (e-h) Free energy landscapes from the same simulations with different coupling parameters as function of radius of gyration and the $\mathrm{RMSD}_{C\alpha-C\alpha}$ to the final structure.

[1] J. W. Neidigh and R. M. Fesinmeyer, Nat. Struct. Biol. 9, 425–430 (2002).
[2] B. Hess, C. Kutzner, D. van der Spoel, and E. Lindahl, J. Chem. Theory Comput. 4, 435–447 (2008).
[3] E. K. Peter, M. Agarwal, B.-K. Kim, I. V. Pivkin, and J.-E. Shea, J. Chem. Phys. 141, 22D511 (2014).