Frequency adaptive metadynamics for the calculation ...

0 downloads 0 Views 1MB Size Report
Frequency adaptive metadynamics for the calculation of rare-event kinetics. 1 ... requires the bias to be added slowly, thus hampering applications to processes ...
1

Frequency adaptive metadynamics for the calculation of rare-event kinetics

2

Yong Wang,1 Omar Valsson,2 Pratyush Tiwary,3 Michele Parrinello,4, 5 and Kresten

3

Lindorff-Larsen1, a)

4

1)

Structural Biology and NMR Laboratory, Linderstrøm-

5

Lang Centre for Protein Science, Department of Biology,

6

University of Copenhagen, Ole Maaløes Vej 5 DK-2200 Copenhagen N,

7

Denmark

8

9

10

2)

Max Planck Institute for Polymer Research, Ackermannweg 10, D-55128 Mainz,

Germany 3)

Department of Chemistry and Biochemistry and Institute for Physical

11

Science and Technology, University of Maryland, College Park 20742,

12

USA.

13

4)

Department of Chemistry and Applied Biosciences, ETH Zurich,

14

c/o USI Campus, Via Giuseppe Buffi 13, CH-6900 Lugano,

15

Switzerland

16

5)

Facolt`a di Informatica, Instituto di Scienze Computationali,

17

Universit`a della Svizzera Italiana, CH-6900 Lugano, Switzerland

18

(Dated: 25 April 2018)

1

19

The ability to predict accurate thermodynamic and kinetic properties in biomolecular

20

systems is of both scientific and practical utility. While both remain very difficult,

21

predictions of kinetics are particularly difficult because rates, in contrast to free

22

energies, depend on the route taken. For this reason, specific enhanced sampling

23

methods are needed to calculating long-time scale kinetics. It has recently been

24

demonstrated that it is possible to recover kinetics through so called ‘infrequent

25

metadynamics’ simulations, where the simulations are biased in a way that minimally

26

corrupts the dynamics of moving between metastable states. This method, however,

27

requires the bias to be added slowly, thus hampering applications to processes with

28

only modest separations of timescales. Here we present a frequency-adaptive strategy

29

which bridges normal and infrequent metadynamics. We show that this strategy can

30

improve the precision and accuracy of rate calculations at fixed computational cost,

31

and should be able to extend rate calculations for much slower kinetic processes.

32

Keywords: Binding and unbinding kinetics — Metadynamics — rate calculation

a)

Electronic mail: [email protected]

2

33

I.

INTRODUCTION

34

Accessing long timescales in molecular dynamics (MD) simulations remains a longstand-

35

ing challenge. Limited by the short time steps of a few femtoseconds, and despite recent

36

progress in methods and hardware1–4 , it remains difficult to access the timescales of millisec-

37

onds and beyond. This leaves many applications outside the realm of MD, including many

38

structural rearrangements in biomolecules and binding and unbinding events of ligands and

39

drug molecules. Many of these reactions are so called ‘rare events’ where the time scale of

40

the reaction is orders of magnitude longer than the time it takes to cross the barrier between

41

the states. In such cases, most of the simulation time ends up being used in simulating the

42

local fluctuations inside free energy basins.

43

To study phenomena that involve basin-to-basin transitions that occur on long timescales,

44

one typically has to rely on some combination of machine parallelism5,6 , advanced strategies

45

for analysing simulations7–9 and enhanced sampling methods10–15 that allow for efficient

46

exploration of phase space. Metadynamics is one such popular and now commonly used

47

enhanced sampling method that involves the periodic application of a history-dependent

48

biasing potential to a few selected degrees of freedom, typically also called collective variables

49

(CVs)16 . Through this bias, the system is discouraged from getting trapped in low energy

50

basins, and one can observe processes that would be far beyond the timescales accessible

51

in normal MD, while still maintaining complete atomic resolution. Originally designed to

52

explore and reconstruct the equilibrium free energy surface17 , it has recently been shown

53

that metadynamics can also be used to calculate kinetic properties3,18–29 . In particular,

54

inspired by the pioneering work of Grubm¨ uller30 and Voter31 , it has recently been shown

55

that the unbiased kinetics can be correctly recovered from metadynamics simulations using

56

a method called infrequent metadynamics (InMetaD)18 .

57

The basic idea in InMetaD is to add a bias to the free energy landscape sufficiently

58

infrequently so that only the free energy basins, but not the free energy barriers, experience

59

the biasing potential, V (s, t), where t is the simulation time and s is one or more CV’s

60

chosen to distinguish the different free energy basins. By adding bias infrequently enough

61

compared to the time spent in barrier regions, the landscape can be filled up to speed up

62

the transitions without perturbing the sequence of state-to-state transitions. The effect of 3

63

the bias on time can then be reweighted through an acceleration factor:

64

α(τ ) = heβ(V (s,t)) i

65

Here β is the inverse temperature and the angular brackets denote an average over a meta-

66

dynamics run up until the simulation time τ .

(1)

67

In the application of InMetaD to recover the correct rates, there are two key assump-

68

tions. First, the state-to-state transitions are of the rare event type: namely, the system

69

is trapped in a basin for a duration long enough that memory of the previous history is

70

lost, and when the system does translocate into another basin, it does so rather rapidly.

71

Second, it is important that no substantial bias is added to the transition state (TS) region

72

during the simulation. This requirement can be achieved by adjusting the bias deposition

73

frequency, determined by the time between two instances of bias deposition. If the deposi-

74

tion frequency is kept low enough, it becomes possible to keep the TS region unbiased. This

75

second requirement, however, also means that it takes longer to fill up the basin, revealing

76

one of the practical limitations of applying InMetaD. It is this second requirement that we

77

address and improve in this work.

78

In particular, we asked ourselves the question, can we design a bias deposition scheme so

79

that the frequency is high near the bottoms of the free energy basins, but decreases gradually

80

so as to lower the risk of biasing the TS regions? Our scheme, which we term Frequency-

81

Adaptive Metadynamics (FaMetaD), is illustrated using a ligand (benzene) unbinding from

82

the T4 lysozyme (T4L) L99A mutant as an example (Fig. 1). In particular, we designed a

83

strategy that uses a high frequency at the beginning of the simulation (to fill up the basin

84

quickly) and then slows progressively down (to minimize the risk of perturbing the TS).

85

In this way, we aim to improve the reliability, accuracy and robustness of the calculations

86

without additional computational cost. We evaluated the performance of FaMetaD using

87

three different types of problems with different complexities. We first test the method on a

88

relatively simply conformational exchange in a short peptide, which nevertheless occurs on

89

a time scale of ten microseconds. We then apply the method to study the unfolding rates

90

of two small proteins which have unfolding times in the multi-microsecond regime. Finally,

91

we apply it to ligand binding and unbinding reactions in T4L, which occurs on a time scale

92

of milliseconds, that is difficult to access with standard MD simulations. We find that the

93

FaMetaD in general provides a robust and efficient method to estimate rare-event kinetics. 4

Frequency-adaptive Scheme

τc

τdep(t) (ps)

100

frequency in InMetaD

10

1

τ0 0

frequency in MetaD 20 40 60 80 100 Metadynamics Simulation Time (ns)

Figure 1:

FIG. 1. A schematic picture to show the application of frequency-adaptive metadynamics (FaMetaD) to calculate the off-rate of protein-ligand systems. The protein-ligand system (left panel) is benzene (light and dark blue spheres) binding to the L99A mutant of T4 lysozyme (green cartoon)26 . Frequency-adaptive metadynamics fills the free energy basin quickly at the beginning of the simulation, but adds the bias slowly at the latter stage when the system moves close to the transition state regions. The right panel shows a typical trajectory of how the time between deposition of the bias is adjusted on-the-fly in a FaMetaD run. The typical frequencies used in normal metadynamics (τ0 ) and InMetaD (τc ) are labeled by black arrows. 94

95

II.

METHODS Our adaptive frequency scheme reads: τdep (t) = min{τ0 · max{

96

α(t) , 1}, τc } θ

(2)

97

where τ0 is the initial deposition time between adding a bias, similar to the relatively short

98

time in normal metadynamics, and τc is a cut-off value for the frequency equal to or larger

99

than the deposition time originally proposed in InMetaD. These two parameters bridge the

100

normal metadynamics and InMetaD, by controlling the minimal and maximal bias depo-

101

sition frequency. α(t) is the instantaneous acceleration factor at simulation time t in a

102

metadynamics simulation.

103

Through the starting deposition time τ0 , we1 can modulate the enhancement in our ability

104

to fill free energy wells, relative to an InMetaD run performed with a constant stride of τc .

105

For practical considerations the choice of τc is dependent on the computational resources. 5

106

The key free parameter here is θ, the threshold value for the acceleration factor to trigger the

107

gradual change from normal (frequent) metadynamics to InMetaD. The choice of θ requires

108

some considerations. On one hand, θ should be as large as possible to delay the switch from

109

normal metadynamics to InMetaD so as to get significant enhancement in basin filling. On

110

the other hand, too large θ could lead to problems that a transition might occur when the

111

frequency τdep (t) is still less than the typical duration τc of a reactive trajectory crossing

112

over from one basin to another.

113

We now estimate a value for θ in terms of the expected transition time τexp available from

114

either experimental measurements or an estimate, the simulation time, τsim (determined also

115

by computational resources), and a ‘safety coefficient’ Cs . We seek the following to hold

116

true: θ≤

117

τexp . τsim Cs

(3)

118

If we use a protein-drug system as an example, and we (i) can estimate the residence time

119

to be roughly on the order of a second (τexp = 1s), (ii) aim to observe the transition

120

within a 100 ns metadynamics run (τsim = 100ns), and (iii) use Cs = 102 to counteract

121

the risk that a transition time falls in the long tail of Poisson distribution, we thus obtain

122

θ =

123

value of θ this leads to perturbed (and erroneous) kinetics; an error that may be detected by

124

testing whether the observed distribution of transition times follows a Poisson distribution32 .

125

As per Eq. (2), the frequency is also changed as a function of α(t). In practice, we

126

observed significant fluctuation of τdep (t), that might cause problems if τdep (t) is small at

127

the transition time point. Therefore we finally modified Eq. (2) to become a monotonously

128

increasing function:

1s 100ns∗102

= 105 . As we demonstrate by an example below, if one chooses too high a

0

129

τdep (t) = max(τdep ((N − 1)∆t), τdep (N ∆t))

130

where τdep (N ∆t) is the instantaneous deposition time at step N in a metadynamics sim-

131

ulation with a MD time step ∆t. The frequency-adaptive scheme was implemented in a

132

development version of the PLUMED2.2 code33 . 6

(4)

TABLE I. Conformational Transition Times of Ace-Ala3-Nme Parameters: τ0 (ps), τc (ps), h (kJ/mol) τslow (µs) P-value Cost (µs) Set Unbiased MDa

T=300K

11±2

InMetaDb

τ0 = 200,h = 0.4

16±5

0.4±0.3

2.8

τ0 = 100,h = 0.4

12±3

0.3±0.2

1.5

A

τ0 = 40,h = 0.4

19±6

0.3±0.2

0.8

B

τ0 = 20,h = 0.4

16±5

0.1±0.2

0.5

C

τ0 = 10,h = 0.4

32±15

0.1±0.1

0.3

D

τ0 = 5,h = 0.4

92±38

0.02±0.05

0.2

E

τ0 = 2,h = 0.4

98±61

0.01±0.01

0.1

F

τ0 = 2,τc = 200,θ = 1,h = 0.4

15±3

0.4±0.3

1.5

A

τ0 = 2,τc = 200,θ = 10,h = 0.4

14±3

0.5±0.3

0.6

B

τ0 = 2,τc = 200,θ = 40,h = 0.4

19±6

0.2±0.2

0.4

C

τ0 = 2,τc = 200,θ = 100,h = 0.4

24±7

0.1±0.1

0.3

D

τ0 = 2,τc = 200,θ = 500,h = 0.4

29±10

0.1±0.2

0.2

E

τ0 = 2,τc = 200,θ = 10000,h = 0.4

24±11

0.02±0.03

0.1

F

FaMetaDb

a

From ref.26 .

b

300

We estimated τslow from 40 independent simulations for each set of parameters,

and used a bootstrap analysis with 50 subsamples to estimate the errors.

133

III.

134

A.

RESULTS AND DISCUSSION Kinetics of conformational change of a four-state model system

135

To benchmark our results, we first consider a five-residue peptide (Nme-Ala3-Ace) as a

136

model system with a non-trivial free energy landscape involving multiple conformational

137

states26 . We consider the slowest state-to-state transition time, τslow , which has been es-

138

timated to be ∼ 11µs from unbiased MD26 , as the benchmark target. We used the same

139

computational setup and CVs as previously described26 .

140

As a baseline, we used InMetaD with a fixed height of Gaussian bias potential (h = 0.4

141

kJ/mol) but with different times of bias deposition ranging from 2 ps to 200 ps (40 runs

142

for each parameter set) (Table I and Fig. S1). The reliability of the calculated transition 7

143

times were verified a posteriori using a Kolmogorov-Smirnov test32 to examine whether

144

their cumulative distribution function is Poissonian. If the p-value is low (e.g. less than

145

0.05) then it is likely that the distribution has been perturbed because of too aggressive

146

application of the biasing potential. As expected, we find that simulations that used a low

147

frequency (τ0 ≥ 20 ps) gave rise to a consistent estimation of τslow , with values close to those

148

observed in unbiased MD, while the simulations with high frequency (τ0 ≤ 10 ps) resulted in

149

substantially longer and less reliable times (Table I and Fig. S1). This correlation between

150

the p-value and the deposition time in the InMetaD simulations can be explained by the fact

151

that the higher bias frequency results in higher risk of biasing the TS regions and more non-

152

Poissonian distribution. We also note that the transition times appear to be over-estimated

153

in these cases.

154

To compare the FaMetaD simulations with the InMetaD baseline, we designed six sets

155

(A to F) of FaMetaD simulations to have comparable computational costs (Table I and

156

Fig. 2). Overall, the results of FaMetaD show the same trends as InMetaD and support

157

the same conclusions. In other words, both InMetaD and FaMetaD reveal the trade-off

158

between reliability, accuracy and computational cost. There are, however, some important

159

differences that highlight the improvement obtained through FaMetaD. First, in each case

160

there is a small but notable improvement in the reliability of the calculations, as judged by

161

the greater p-values that suggest that FaMetaD perturbs the TS less than InMetaD. Most

162

important, however, is the accuracy of the calculations. While both InMetaD and FaMetaD

163

achieve accurate results when using the most conservative parameters (sets A–C), there

164

are dramatic differences when more aggressive parameters are used (D–F). Importantly,

165

we observe a much more ‘gracious’ decline in accuracy with more aggressive parameters in

166

FaMetaD comparing to InMetaD (Fig. S2). Thus together these results suggest that the

167

frequency-adaptive scheme can further improve not only the reliability but also the accuracy

168

of the calculation, without the need of increasing computational burden, and also that the

169

reliability is more robust to the choice of the simulation parameters.

170

We also checked the robustness of our method with respect to the choice of the so-called

171

bias factor, γ, which is defined as the ratio of the effective CV temperature and system

172

temperature34 . During a metadynamics simulation γ controls the rate at which the height

173

of the individual Gaussians in the bias potential is decreased, and is a key parameter in

174

well-tempered metadynamics. We therefore performed a series of FaMetaD simulations with 8

4

Parameters

10 100

More aggressive 500

40

τ0(ps)

20 40

100 10

10

θ

5 2

τslow (µs)

1

FaMetaD InMetaD

100

Increased accuracy 50 0 Increased reliability

p-value

0.6 0.3

Cost (µs)

0 2

Increased efficiency

1 0 A

B

C

D

E

F

FIG. 2. Comparing the accuracy, reliability and efficiency of frequency-adaptive metadynamics and infrequent metadynamics. The upper panel shows the key parameters (τ0 in InMetaD, θ in FaMetaD) of the six sets (A to F) pairs simulations. The middle panels show a comparison of the accuracy and reliability of the results. The grey line shows the τslow obtained from unbiased MD simulations. The bottom panel shows the computational cost of each set of simulations.

175

different values of γ, and determined the effect on the estimated rates in the pentapeptide

176

(Table S1). We find that varying γ 50-fold (from 2–100, including γ = 15 which is the value

177

used in our other simulations) has essentially no effect on the estimated rates, with the results

178

at very small γ (approaching unbiased MD) and very high γ (approaching non-tempered

179

metadynamics) being within error. Whereas the rates and their reliability (including the

180

estimated p-values in the KS test) are not substantially affected by changing γ in the range

181

tested, the value of γ does affect the efficiency of the simulations. Increasing γ corresponds 9

182

to an initially more aggressive bias deposition, and thus decreased computational cost. Due

183

to the switchover nature of the FaMetaD deposition schedule, increasing γ to very high

184

values does, however, not increase performance nor decrease accuracy.

185

B.

Kinetics of protein unfolding

186

We now consider the more complicated problem of protein unfolding using the ther-

187

mostable TC10b variant of Trp-Cage35 and the GTT variant of the Fip35 WW domain

188

(GTT36 ) as examples. The in silico unfolding times of these two proteins have previously

189

been determined from long-timescale simulations of reversible protein folding1 , giving us the

190

opportunity of comparing the rates from InMetaD and FaMetaD directly with those from

191

unbiased MD (τunf olding = 3 ± 1µs and 80 ± 24µs, respectively). The previous simulations

192

also determined the mean transition-path times, τT P , to be ∼ 0.2µs and ∼ 0.5µs, respec-

193

tively. Such relatively long transition path times could, in principle, make it difficult to use

194

the InMetaD and FaMetaD approaches, since even when the folded state basin is filled, it

195

still takes a substantial time to cross the barrier. Thus, these simulations also probe the

196

robustness of the methods in a system where the basic requirements of the approaches are

197

difficult to gauge or control. Simulations were performed using the same force field and

198

similar simulation parameters (to the extent possible) as those previously used1 , and the

199

fraction of native contacts, Q, as the CV. A simulation was deemed to give rise to an un-

200

folding event when Q dropped below 0.35, a value chosen by comparison with the unbiased

201

simulations1 . Our results show that, for both proteins and both InMetaD and FaMetaD, it

202

is possible to estimate the unfolding rates with good accuracy when compared to the results

203

from unbiased simulations (Table II). While it is difficult to converge the rate estimates, we

204

find that all rates are within a factor of two from the unbiased simulations, and generally

205

within error of one another. As above in the peptide system we note a small, but non-trivial

206

gain in performance with FaMetaD compared to InMetaD at comparable precision of the

207

estimated rates. 10

TABLE II. Unfolding Time Constants of Trp-Cage and a WW domain System

Method

Trp-Cagee

InMetaDa

2±1

0.3±0.3

0.5

FaMetaDb

3±1

0.4±0.2

0.1

Unbiased MD

3±1

InMetaDc

150±45

0.6±0.3

10

FaMetaDd

132±56

0.3±0.2

1.5

Unbiased MD

80±24

WWe

a

τunf olding (µs) p-value Computational Cost (µs)

210

1100

τ0 = 200 ps,h = 0.4 kJ/mol.

b

τ0 = 2 ps,τc = 200 ps,θ = 10,h = 0.4 kJ/mol.

c

d

τ0 = 2 ps,τc = 200 ps,θ = 1,h = 0.2 kJ/mol.

τ0 = 30 ps,h = 0.4 kJ/mol. e

We used the same force field (CHARMM22*37 ), ion concentrations and

temperatures (290K and 360K) for Trp-Cage and the GTT WW domain variant as in the earlier unbiased simulations1 . The error bars were the standard deviation obtained by a bootstrap analysis with 50 subsamples. The sample size is 15 for each set of simulations.

208

C.

Kinetics of protein-ligand binding

209

We consider now the case of a ligand binding to or escaping from a buried internal cavity

210

in the L99A mutant of T4 lysozyme, processes which occur on timescales of milliseconds or

211

more38–40 . To benchmark the results, we performed three sets of metadynamics simulations,

212

up to 26µs in total (including 12 µs InMetaD simulations of T4L L99A with benzene from

213

our recent work26 ). In each set, we compared the results of InMetaD with that of FaMetaD.

214

BN Z BN Z In set 1 and 2 we calculated the time constants for binding (τon ) and unbinding (τof f ) of

215

IN D benzene, while we in set 3 calculated the time for indole to escape the pocked (τof f ). We

216

performed 20 independent runs for each set of simulations (using the CHARMM22* force

217

field37 for protein and CGenFF force field41 for the ligands) to collect the transition times,

218

from which we obtained τon and τof f (Table III).

219

BN Z Again, we find consistency between the results of InMetaD and FaMetaD, with e.g. τon

220

BN Z and τof to be ∼ 10 ms and ∼ 170 ms, respectively. As previously described26 , we can use f

11

TABLE III. Binding and Unbinding Times of T4L L99A with Benzene and Indole Methods Parameters: τ0 (ps), τc (ps), h (kJ/mol) Time (ms) p-value Cost (µs) BN Z ) Set 1: Benzene Binding (τon

InMetaDa

τ0 = 40,h = 0.4

9±5

0.1±0.1

4.4

FaMetaD

τ0 = 1,τc = 100,h = 0.2,θ = 103

14±7

0.2±0.1

3.0

BN Z ) Set 2: Benzene Unbinding (τof f

InMetaDa

τ0 = 100,h = 0.2

168±59

0.4±0.3

6.7

FaMetaD

τ0 = 1,τc = 100,h = 0.2,θ = 103

176±68

0.3±0.2

5.5

IN D ) Set 3: Indole Unbinding (τof f

InMetaD

τ0 = 100,h = 0.2

102±87

0.2±0.2

4.5

FaMetaD

τ0 = 1,τc = 100,h = 0.2,θ = 104

168±95

0.2±0.1

2.0 26

a

The results are from our recent work26 .

b

In all simulations, the ligand concentration

is ∼ 5 mM. The error bars were the standard deviation obtained by a bootstrap analysis with 50 subsamples. The sample size is 20 for each set of simulations.

221

these values to estimate the binding free energy to be ∆Gbinding ≈ −21 kJ/mol, a value that

222

agrees well with calorimetric42 and NMR38 measurements.

223

In set 1, the InMetaD simulation was performed with τ0 = 40 ps and h = 0.4 kJ/mol.

224

The frequency-adaptive scheme allows us to perform FaMetaD with weaker bias (h = 0.2

225

kJ/mol) and ending with longer and more conservative deposition time (τc = 100 ps). This

226

parameter set used less simulation time but resulted in a more reliable estimation. In set

227

2, the FaMetaD simulation was performed using the same τc = 100 ps and h = 0.2 kJ/mol

228

as that used in the InMetaD simulation, but with θ = 103 . Given that τexp ∼ 10 − 100 ms,

229

τsim ∼ 100 ns and Cs = 102 , according to Eq. (3), this allows us to judge that θ = 103 is a

230

BN Z fairly conservative parameter. The estimated values of τof in this set are indistinguishable f

231

between InMetaD and FaMetaD within the error bars but at slightly less computational cost.

232

In set 3, we used similar parameters as in set 2 except a bit more aggressive θ = 104 . Again,

233

IN D this parameter set resulted in τof f of 168 ± 95 ms from FaMetaD that is reasonably close

234

to the estimation from InMetaD. Remarkably, FaMetaD allowed us to reduce more than 12

235

half the computational cost, without loss of accuracy. Overall, the application in the case

236

of L99A binding with two ligands suggests that the frequency-adaptive scheme can improve

237

the reliability-accuracy-efficency balance of metadynamics on kinetics calculation.

238

IV.

CONCLUSIONS

239

Many biological processes occur far from equilibrium, and kinetic properties can play

240

an important role in biology and biochemistry. For example, there has been continued

241

interest in determining ligand residence times in the context of drug optimization43 , and

242

millisecond conformational dynamics in an enzyme has been shown to underlie an intriguing

243

phenomenon of kinetic cooperativity of relevance to disease44 .

244

The principle of microscopic reversibility, however, sets limits to how the kinetic prop-

245

erties can be varied independently of thermodynamics. Thus, for example, increasing the

246

residence time of a ligand will also increase its thermodynamic affinity, unless there is also

247

a simultaneous drop in the rate of binding. Thus, in practice it may be in many cases be

248

difficult to disentangle the effects of kinetics and thermodynamics.

249

Taken together, the above considerations suggest the need for improved methods for un-

250

derstanding and ultimately predicting the rate constants of biological processes. Further,

251

the ability to calculate the kinetics of conformational exchange or ligand binding and un-

252

binding provides an alternative approach to determine equilibrium properties26 . For these

253

and other reasons, several methods have been developed to enable the estimation of kinetics

254

from simulations15 .

255

We have here proposed a modification to the already very powerful InMetaD algorithm,

256

which leads to a further improvement in the reliability and accuracy in recovering unbiased

257

transition rates from biased metadynamics simulations. The basic idea in the resulting

258

FaMetaD approach is that, by filling up the basin more rapidly in the beginning and only

259

using an infrequent bias near the barrier, we may spend the computational time where it is

260

needed. We anticipate that our scheme will prove particularly useful in two different aspects.

261

First, by enabling a decreased bias in the transition state region, we obtain more accurate

262

kinetics at fixed computational cost, leading also to the observed robustness of our approach.

263

Second, in the case of large barriers, it would be prohibitively slow to use a very infrequent

264

bias through the entire duration of the simulation. Thus, the FaMetaD provides a practical 13

265

approach to study rare events that involve escaping deep free energy minima, and which

266

occur on timescales well beyond those accessible to current simulation methods. Here we

267

have opted to examine processes that can be studied using both InMetaD and FaMetaD,

268

but in the future we aim to apply this approach to barrier crossing events that occur on

269

even longer timescales. With more examples in hand, we also expect that in the future it

270

may be possible to design improved biasing schemes to increase the computational efficiency

271

further, and at the same time retain the robustness of the FaMetaD approach.

272

V.

See Supplementary Material for supplementary figures and table.

273

274

SUPPLEMENTARY MATERIAL

VI.

ACKNOWLEDGEMENT

The authors thank Tristan Bereau and Claudio Perego for a critical reading of the

275

276

manuscript.

277

Nordisk Foundation and the BRAINSTRUC initiative from the Lundbeck Foundation. M.P.

278

acknowledges funding from the National Centre for Computational Design and Discovery of

279

Novel Materials MARVEL and European Union grant ERC-2014-AdG-670227/VARMET.

280

REFERENCES

281

1

K. Lindorff-Larsen, S. Piana, R. O. Dror, and D. E. Shaw, “How fast-folding proteins fold,” Science 334, 517–520 (2011).

282

283

K.L.-L. acknowledges funding by a Hallas-Møller Stipend from the Novo

2

I. Buch, T. Giorgino, and G. De Fabritiis, “Complete reconstruction of an enzyme-inhibitor

284

binding process by molecular dynamics simulations,” Proc Natl Acad Sci 108, 10184–10189

285

(2011).

286

3

P. Tiwary, V. Limongelli, M. Salvalaglio, and M. Parrinello, “Kinetics of protein–ligand

287

unbinding: Predicting pathways, rates, and rate-limiting steps,” Proc Natl Acad Sci 112,

288

E386–E391 (2015).

289

4

F. Paul, C. Wehmeyer, E. T. Abualrous, H. Wu, M. D. Crabtree, J. Sch¨oneberg, J. Clarke,

290

C. Freund, T. R. Weikl, and F. No´e, “Protein-peptide association kinetics beyond the

291

seconds timescale from atomistic simulations,” Nature Communications 8, 1095 (2017). 14

292

5

a computational microscope for molecular biology,” Annu Rev Biophys 41, 429–452 (2012).

293

294

6

T. J. Lane, D. Shukla, K. A. Beauchamp, and V. S. Pande, “To milliseconds and beyond: challenges in the simulation of protein folding,” Curr Opin Struct Biol 23, 58–65 (2013).

295

296

R. O. Dror, R. M. Dirks, J. Grossman, H. Xu, and D. E. Shaw, “Biomolecular simulation:

7

J.-H. Prinz, H. Wu, M. Sarich, B. Keller, M. Senne, M. Held, J. D. Chodera, C. Sch¨ utte,

297

and F. No´e, “Markov models of molecular kinetics: Generation and validation,” J Chem

298

Phys 134, 174105 (2011).

299

8

Review X 6, 011009 (2016).

300

301

9

10

306

7, 1244–1252 (2011). 11 ˇ S. a Beccara, T. Skrbi´ c, R. Covino, and P. Faccioli, “Dominant folding pathways of a ww domain,” Proceedings of the National Academy of Sciences 109, 2330–2335 (2012).

307

308

D. R. Glowacki, E. Paci, and D. V. Shalashilin, “Boxed molecular dynamics: decorrelation time scales and the kinetic master equation,” Journal of chemical theory and computation

304

305

L. T. Chong and D. M. Zuckerman, “Weighted ensemble simulation methods and applications,” Annu Rev Biophys 46 (2016).

302

303

B. Trendelkamp-Schroer and F. No´e, “Efficient estimation of rare-event kinetics,” Physical

12

R. C. Bernardi, M. C. Melo, and K. Schulten, “Enhanced sampling techniques in molecular

309

dynamics simulations of biological systems,” Biochim Biophys Acta, Gen Subj 1850, 872–

310

877 (2015).

311

13

O. Valsson, P. Tiwary, and M. Parrinello, “Enhancing important fluctuations: Rare events

312

and metadynamics from a conceptual viewpoint,” Annu Rev Phys Chem 67, 159–184

313

(2016).

314

14

related methods in drug discovery,” J Med Chem (2016).

315

316

15

16

17

323

G. Bussi and D. Branduardi, “Free-energy calculations with metadynamics: theory and practice,” Reviews in Computational Chemistry Volume 28 , 1–49 (2015).

321

322

A. Laio and M. Parrinello, “Escaping free-energy minima,” Proc Natl Acad Sci 99, 12562– 12566 (2002).

319

320

N. J. Bruce, G. K. Ganotra, D. B. Kokh, S. K. Sadiq, and R. C. Wade, “New approaches for computing ligand–receptor binding kinetics,” Curr Opin Struct Biol 49, 1–10 (2018).

317

318

M. De Vivo, M. Masetti, G. Bottegoni, and A. Cavalli, “Role of molecular dynamics and

18

P. Tiwary and M. Parrinello, “From metadynamics to dynamics,” Phys Rev Lett 111, 230602 (2013). 15

324

19

flooding for rate calculation,” Phys Rev Lett 115, 070601 (2015).

325

326

20

21

22

K. L. Fleming, P. Tiwary, and J. Pfaendtner, “New approach for investigating reaction dynamics and rates with ab initio calculations,” J Phys Chem A 120, 299–305 (2016).

331

332

J. Mondal, P. Tiwary, and B. Berne, “How a kinase inhibitor withstands gatekeeper residue mutations,” J Am Chem Soc 138, 4608–4615 (2016).

329

330

P. Tiwary, J. Mondal, J. A. Morrone, and B. Berne, “Role of water and steric constraints in the kinetics of cavity–ligand unbinding,” Proc Natl Acad Sci 112, 12015–12019 (2015).

327

328

J. McCarty, O. Valsson, P. Tiwary, and M. Parrinello, “Variationally optimized free-energy

23

H.-J. Tung and J. Pfaendtner, “Kinetics and mechanism of ionic-liquid induced protein un-

333

folding: application to the model protein HP35,” Molecular Systems Design & Engineering

334

1, 382–390 (2016).

335

24

H. Sun, Y. Li, M. Shen, D. Li, Y. Kang, and T. Hou, “Characterizing drug–target resi-

336

dence time with metadynamics: how to achieve dissociation rate efficiently without losing

337

accuracy against time-consuming approaches,” J Chem Inf Model 57, 1895–1906 (2017).

338

25

C. D. Fu, L. F. L. Oliveira, and J. Pfaendtner, “Determining energy barriers and selectivi-

339

ties of a multi-pathway system with infrequent metadynamics,” J Chem Phys 146, 014108

340

(2017).

341

26

Y. Wang, J. M. Martins, and K. Lindorff-Larsen, “Biomolecular conformational changes

342

and ligand binding: from kinetics to thermodynamics,” Chemical Science 8, 6466–6473

343

(2017).

344

27

streptavidin,” J Phys Chem B (2017).

345

346

P. Tiwary, “Molecular determinants and bottlenecks in the dissociation dynamics of biotin-

28

R. Casasnovas, V. Limongelli, P. Tiwary, P. Carloni, and M. Parrinello, “Unbinding

347

kinetics of a p38 map kinase type II inhibitor from metadynamics simulations,” J Am

348

Chem Soc 139, 4780–4788 (2017).

349

29

binding site?” Sci Adv 3, e1700014 (2017).

350

351

30

354

H. Grubm¨ uller, “Predicting slow structural transitions in macromolecular systems: Conformational flooding,” Phys Rev E 52, 2893 (1995).

352

353

P. Tiwary, J. Mondal, and B. Berne, “How and when does an anticancer drug leave its

31

A. F. Voter, “Hyperdynamics: Accelerated molecular dynamics of infrequent events,” Phys Rev Lett 78, 3908 (1997).

16

355

32

reconstructed from metadynamics,” J Chem Theo Comput 10, 1420–1425 (2014).

356

357

33

34

A. Barducci, G. Bussi, and M. Parrinello, “Well-tempered metadynamics: a smoothly converging and tunable free-energy method,” Physical review letters 100, 020603 (2008).

360

361

G. A. Tribello, M. Bonomi, D. Branduardi, C. Camilloni, and G. Bussi, “Plumed 2: New feathers for an old bird.” Comput Phys Commun 185, 604–613 (2014).

358

359

M. Salvalaglio, P. Tiwary, and M. Parrinello, “Assessing the reliability of the dynamics

35

B. Barua, J. C. Lin, V. D. Williams, P. Kummler, J. W. Neidigh, and N. H. Andersen,

362

“The Trp-cage: optimizing the stability of a globular miniprotein,” Protein Engineering,

363

Design & Selection 21, 171–185 (2008).

364

36

S. Piana, K. Sarkar, K. Lindorff-Larsen, M. Guo, M. Gruebele, and D. E. Shaw, “Com-

365

putational design and experimental testing of the fastest-folding β-sheet protein,” Journal

366

of molecular biology 405, 43–48 (2011).

367

37

with respect to force field parameterization?” Biophys J 100, L47–L49 (2011).

368

369

38

V. A. Feher, E. P. Baldwin, and F. W. Dahlquist, “Access of ligands to cavities within the core of a protein is rapid,” Nat Struct Biol 3 (1996).

370

371

S. Piana, K. Lindorff-Larsen, and D. E. Shaw, “How robust are protein folding simulations

39

G. Bouvignies, P. Vallurupalli, D. F. Hansen, B. E. Correia, O. Lange, A. Bah, R. M.

372

Vernon, F. W. Dahlquist, D. Baker, and L. E. Kay, “Solution structure of a minor and

373

transiently formed state of a T4 lysozyme mutant,” Nature 477, 111–114 (2011).

374

40

populated conformations on a complex energy landscape,” eLife 5, e17505 (2016).

375

376

Y. Wang, E. Papaleo, and K. Lindorff-Larsen, “Mapping transiently formed and sparsely

41

K. Vanommeslaeghe, E. Hatcher, C. Acharya, S. Kundu, S. Zhong, J. Shim, E. Darian,

377

O. Guvench, P. Lopes, and I. Vorobyov, “Charmm general force field: A force field for

378

drug-like molecules compatible with the charmm all-atom additive biological force fields,”

379

J Comput Chem 31, 671–690 (2010).

380

42

binding in an interior nonpolar cavity of T4 lysozyme,” Biochemistry 34, 8564–8575 (1995).

381

382

43

R. A. Copeland, D. L. Pompliano, and T. D. Meek, “Drug–target residence time and its implications for lead optimization,” Nature reviews Drug discovery 5, 730 (2006).

383

384

A. Morton, W. A. Baase, and B. W. Matthews, “Energetic origins of specificity of ligand

44

M. Larion, A. L. Hansen, F. Zhang, L. Bruschweiler-Li, V. Tugarinov, B. G. Miller, and

385

R. Br¨ uschweiler, “Kinetic cooperativity in human pancreatic glucokinase originates from

386

millisecond dynamics of the small domain,” Angewandte Chemie 127, 8247–8250 (2015). 17

Frequency-adaptive Scheme

τc

τdep(t) (ps)

100

frequency in InMetaD

10

1

τ0 0

Figure 1:

1

frequency in MetaD 20 40 60 80 100 Metadynamics Simulation Time (ns)

Parameters

104 100

More aggressive 500

40

τ0(ps)

20 40

100 10

10

θ

5 2

τslow (µs)

1

FaMetaD InMetaD

100

Increased accuracy 50 0 Increased reliability

p-value

0.6 0.3

Cost (µs)

0 2

Increased efficiency

1 0 A

B

C

D

E

F

Suggest Documents