1
Frequency adaptive metadynamics for the calculation of rare-event kinetics
2
Yong Wang,1 Omar Valsson,2 Pratyush Tiwary,3 Michele Parrinello,4, 5 and Kresten
3
Lindorff-Larsen1, a)
4
1)
Structural Biology and NMR Laboratory, Linderstrøm-
5
Lang Centre for Protein Science, Department of Biology,
6
University of Copenhagen, Ole Maaløes Vej 5 DK-2200 Copenhagen N,
7
Denmark
8
9
10
2)
Max Planck Institute for Polymer Research, Ackermannweg 10, D-55128 Mainz,
Germany 3)
Department of Chemistry and Biochemistry and Institute for Physical
11
Science and Technology, University of Maryland, College Park 20742,
12
USA.
13
4)
Department of Chemistry and Applied Biosciences, ETH Zurich,
14
c/o USI Campus, Via Giuseppe Buffi 13, CH-6900 Lugano,
15
Switzerland
16
5)
Facolt`a di Informatica, Instituto di Scienze Computationali,
17
Universit`a della Svizzera Italiana, CH-6900 Lugano, Switzerland
18
(Dated: 25 April 2018)
1
19
The ability to predict accurate thermodynamic and kinetic properties in biomolecular
20
systems is of both scientific and practical utility. While both remain very difficult,
21
predictions of kinetics are particularly difficult because rates, in contrast to free
22
energies, depend on the route taken. For this reason, specific enhanced sampling
23
methods are needed to calculating long-time scale kinetics. It has recently been
24
demonstrated that it is possible to recover kinetics through so called ‘infrequent
25
metadynamics’ simulations, where the simulations are biased in a way that minimally
26
corrupts the dynamics of moving between metastable states. This method, however,
27
requires the bias to be added slowly, thus hampering applications to processes with
28
only modest separations of timescales. Here we present a frequency-adaptive strategy
29
which bridges normal and infrequent metadynamics. We show that this strategy can
30
improve the precision and accuracy of rate calculations at fixed computational cost,
31
and should be able to extend rate calculations for much slower kinetic processes.
32
Keywords: Binding and unbinding kinetics — Metadynamics — rate calculation
a)
Electronic mail:
[email protected]
2
33
I.
INTRODUCTION
34
Accessing long timescales in molecular dynamics (MD) simulations remains a longstand-
35
ing challenge. Limited by the short time steps of a few femtoseconds, and despite recent
36
progress in methods and hardware1–4 , it remains difficult to access the timescales of millisec-
37
onds and beyond. This leaves many applications outside the realm of MD, including many
38
structural rearrangements in biomolecules and binding and unbinding events of ligands and
39
drug molecules. Many of these reactions are so called ‘rare events’ where the time scale of
40
the reaction is orders of magnitude longer than the time it takes to cross the barrier between
41
the states. In such cases, most of the simulation time ends up being used in simulating the
42
local fluctuations inside free energy basins.
43
To study phenomena that involve basin-to-basin transitions that occur on long timescales,
44
one typically has to rely on some combination of machine parallelism5,6 , advanced strategies
45
for analysing simulations7–9 and enhanced sampling methods10–15 that allow for efficient
46
exploration of phase space. Metadynamics is one such popular and now commonly used
47
enhanced sampling method that involves the periodic application of a history-dependent
48
biasing potential to a few selected degrees of freedom, typically also called collective variables
49
(CVs)16 . Through this bias, the system is discouraged from getting trapped in low energy
50
basins, and one can observe processes that would be far beyond the timescales accessible
51
in normal MD, while still maintaining complete atomic resolution. Originally designed to
52
explore and reconstruct the equilibrium free energy surface17 , it has recently been shown
53
that metadynamics can also be used to calculate kinetic properties3,18–29 . In particular,
54
inspired by the pioneering work of Grubm¨ uller30 and Voter31 , it has recently been shown
55
that the unbiased kinetics can be correctly recovered from metadynamics simulations using
56
a method called infrequent metadynamics (InMetaD)18 .
57
The basic idea in InMetaD is to add a bias to the free energy landscape sufficiently
58
infrequently so that only the free energy basins, but not the free energy barriers, experience
59
the biasing potential, V (s, t), where t is the simulation time and s is one or more CV’s
60
chosen to distinguish the different free energy basins. By adding bias infrequently enough
61
compared to the time spent in barrier regions, the landscape can be filled up to speed up
62
the transitions without perturbing the sequence of state-to-state transitions. The effect of 3
63
the bias on time can then be reweighted through an acceleration factor:
64
α(τ ) = heβ(V (s,t)) i
65
Here β is the inverse temperature and the angular brackets denote an average over a meta-
66
dynamics run up until the simulation time τ .
(1)
67
In the application of InMetaD to recover the correct rates, there are two key assump-
68
tions. First, the state-to-state transitions are of the rare event type: namely, the system
69
is trapped in a basin for a duration long enough that memory of the previous history is
70
lost, and when the system does translocate into another basin, it does so rather rapidly.
71
Second, it is important that no substantial bias is added to the transition state (TS) region
72
during the simulation. This requirement can be achieved by adjusting the bias deposition
73
frequency, determined by the time between two instances of bias deposition. If the deposi-
74
tion frequency is kept low enough, it becomes possible to keep the TS region unbiased. This
75
second requirement, however, also means that it takes longer to fill up the basin, revealing
76
one of the practical limitations of applying InMetaD. It is this second requirement that we
77
address and improve in this work.
78
In particular, we asked ourselves the question, can we design a bias deposition scheme so
79
that the frequency is high near the bottoms of the free energy basins, but decreases gradually
80
so as to lower the risk of biasing the TS regions? Our scheme, which we term Frequency-
81
Adaptive Metadynamics (FaMetaD), is illustrated using a ligand (benzene) unbinding from
82
the T4 lysozyme (T4L) L99A mutant as an example (Fig. 1). In particular, we designed a
83
strategy that uses a high frequency at the beginning of the simulation (to fill up the basin
84
quickly) and then slows progressively down (to minimize the risk of perturbing the TS).
85
In this way, we aim to improve the reliability, accuracy and robustness of the calculations
86
without additional computational cost. We evaluated the performance of FaMetaD using
87
three different types of problems with different complexities. We first test the method on a
88
relatively simply conformational exchange in a short peptide, which nevertheless occurs on
89
a time scale of ten microseconds. We then apply the method to study the unfolding rates
90
of two small proteins which have unfolding times in the multi-microsecond regime. Finally,
91
we apply it to ligand binding and unbinding reactions in T4L, which occurs on a time scale
92
of milliseconds, that is difficult to access with standard MD simulations. We find that the
93
FaMetaD in general provides a robust and efficient method to estimate rare-event kinetics. 4
Frequency-adaptive Scheme
τc
τdep(t) (ps)
100
frequency in InMetaD
10
1
τ0 0
frequency in MetaD 20 40 60 80 100 Metadynamics Simulation Time (ns)
Figure 1:
FIG. 1. A schematic picture to show the application of frequency-adaptive metadynamics (FaMetaD) to calculate the off-rate of protein-ligand systems. The protein-ligand system (left panel) is benzene (light and dark blue spheres) binding to the L99A mutant of T4 lysozyme (green cartoon)26 . Frequency-adaptive metadynamics fills the free energy basin quickly at the beginning of the simulation, but adds the bias slowly at the latter stage when the system moves close to the transition state regions. The right panel shows a typical trajectory of how the time between deposition of the bias is adjusted on-the-fly in a FaMetaD run. The typical frequencies used in normal metadynamics (τ0 ) and InMetaD (τc ) are labeled by black arrows. 94
95
II.
METHODS Our adaptive frequency scheme reads: τdep (t) = min{τ0 · max{
96
α(t) , 1}, τc } θ
(2)
97
where τ0 is the initial deposition time between adding a bias, similar to the relatively short
98
time in normal metadynamics, and τc is a cut-off value for the frequency equal to or larger
99
than the deposition time originally proposed in InMetaD. These two parameters bridge the
100
normal metadynamics and InMetaD, by controlling the minimal and maximal bias depo-
101
sition frequency. α(t) is the instantaneous acceleration factor at simulation time t in a
102
metadynamics simulation.
103
Through the starting deposition time τ0 , we1 can modulate the enhancement in our ability
104
to fill free energy wells, relative to an InMetaD run performed with a constant stride of τc .
105
For practical considerations the choice of τc is dependent on the computational resources. 5
106
The key free parameter here is θ, the threshold value for the acceleration factor to trigger the
107
gradual change from normal (frequent) metadynamics to InMetaD. The choice of θ requires
108
some considerations. On one hand, θ should be as large as possible to delay the switch from
109
normal metadynamics to InMetaD so as to get significant enhancement in basin filling. On
110
the other hand, too large θ could lead to problems that a transition might occur when the
111
frequency τdep (t) is still less than the typical duration τc of a reactive trajectory crossing
112
over from one basin to another.
113
We now estimate a value for θ in terms of the expected transition time τexp available from
114
either experimental measurements or an estimate, the simulation time, τsim (determined also
115
by computational resources), and a ‘safety coefficient’ Cs . We seek the following to hold
116
true: θ≤
117
τexp . τsim Cs
(3)
118
If we use a protein-drug system as an example, and we (i) can estimate the residence time
119
to be roughly on the order of a second (τexp = 1s), (ii) aim to observe the transition
120
within a 100 ns metadynamics run (τsim = 100ns), and (iii) use Cs = 102 to counteract
121
the risk that a transition time falls in the long tail of Poisson distribution, we thus obtain
122
θ =
123
value of θ this leads to perturbed (and erroneous) kinetics; an error that may be detected by
124
testing whether the observed distribution of transition times follows a Poisson distribution32 .
125
As per Eq. (2), the frequency is also changed as a function of α(t). In practice, we
126
observed significant fluctuation of τdep (t), that might cause problems if τdep (t) is small at
127
the transition time point. Therefore we finally modified Eq. (2) to become a monotonously
128
increasing function:
1s 100ns∗102
= 105 . As we demonstrate by an example below, if one chooses too high a
0
129
τdep (t) = max(τdep ((N − 1)∆t), τdep (N ∆t))
130
where τdep (N ∆t) is the instantaneous deposition time at step N in a metadynamics sim-
131
ulation with a MD time step ∆t. The frequency-adaptive scheme was implemented in a
132
development version of the PLUMED2.2 code33 . 6
(4)
TABLE I. Conformational Transition Times of Ace-Ala3-Nme Parameters: τ0 (ps), τc (ps), h (kJ/mol) τslow (µs) P-value Cost (µs) Set Unbiased MDa
T=300K
11±2
InMetaDb
τ0 = 200,h = 0.4
16±5
0.4±0.3
2.8
τ0 = 100,h = 0.4
12±3
0.3±0.2
1.5
A
τ0 = 40,h = 0.4
19±6
0.3±0.2
0.8
B
τ0 = 20,h = 0.4
16±5
0.1±0.2
0.5
C
τ0 = 10,h = 0.4
32±15
0.1±0.1
0.3
D
τ0 = 5,h = 0.4
92±38
0.02±0.05
0.2
E
τ0 = 2,h = 0.4
98±61
0.01±0.01
0.1
F
τ0 = 2,τc = 200,θ = 1,h = 0.4
15±3
0.4±0.3
1.5
A
τ0 = 2,τc = 200,θ = 10,h = 0.4
14±3
0.5±0.3
0.6
B
τ0 = 2,τc = 200,θ = 40,h = 0.4
19±6
0.2±0.2
0.4
C
τ0 = 2,τc = 200,θ = 100,h = 0.4
24±7
0.1±0.1
0.3
D
τ0 = 2,τc = 200,θ = 500,h = 0.4
29±10
0.1±0.2
0.2
E
τ0 = 2,τc = 200,θ = 10000,h = 0.4
24±11
0.02±0.03
0.1
F
FaMetaDb
a
From ref.26 .
b
300
We estimated τslow from 40 independent simulations for each set of parameters,
and used a bootstrap analysis with 50 subsamples to estimate the errors.
133
III.
134
A.
RESULTS AND DISCUSSION Kinetics of conformational change of a four-state model system
135
To benchmark our results, we first consider a five-residue peptide (Nme-Ala3-Ace) as a
136
model system with a non-trivial free energy landscape involving multiple conformational
137
states26 . We consider the slowest state-to-state transition time, τslow , which has been es-
138
timated to be ∼ 11µs from unbiased MD26 , as the benchmark target. We used the same
139
computational setup and CVs as previously described26 .
140
As a baseline, we used InMetaD with a fixed height of Gaussian bias potential (h = 0.4
141
kJ/mol) but with different times of bias deposition ranging from 2 ps to 200 ps (40 runs
142
for each parameter set) (Table I and Fig. S1). The reliability of the calculated transition 7
143
times were verified a posteriori using a Kolmogorov-Smirnov test32 to examine whether
144
their cumulative distribution function is Poissonian. If the p-value is low (e.g. less than
145
0.05) then it is likely that the distribution has been perturbed because of too aggressive
146
application of the biasing potential. As expected, we find that simulations that used a low
147
frequency (τ0 ≥ 20 ps) gave rise to a consistent estimation of τslow , with values close to those
148
observed in unbiased MD, while the simulations with high frequency (τ0 ≤ 10 ps) resulted in
149
substantially longer and less reliable times (Table I and Fig. S1). This correlation between
150
the p-value and the deposition time in the InMetaD simulations can be explained by the fact
151
that the higher bias frequency results in higher risk of biasing the TS regions and more non-
152
Poissonian distribution. We also note that the transition times appear to be over-estimated
153
in these cases.
154
To compare the FaMetaD simulations with the InMetaD baseline, we designed six sets
155
(A to F) of FaMetaD simulations to have comparable computational costs (Table I and
156
Fig. 2). Overall, the results of FaMetaD show the same trends as InMetaD and support
157
the same conclusions. In other words, both InMetaD and FaMetaD reveal the trade-off
158
between reliability, accuracy and computational cost. There are, however, some important
159
differences that highlight the improvement obtained through FaMetaD. First, in each case
160
there is a small but notable improvement in the reliability of the calculations, as judged by
161
the greater p-values that suggest that FaMetaD perturbs the TS less than InMetaD. Most
162
important, however, is the accuracy of the calculations. While both InMetaD and FaMetaD
163
achieve accurate results when using the most conservative parameters (sets A–C), there
164
are dramatic differences when more aggressive parameters are used (D–F). Importantly,
165
we observe a much more ‘gracious’ decline in accuracy with more aggressive parameters in
166
FaMetaD comparing to InMetaD (Fig. S2). Thus together these results suggest that the
167
frequency-adaptive scheme can further improve not only the reliability but also the accuracy
168
of the calculation, without the need of increasing computational burden, and also that the
169
reliability is more robust to the choice of the simulation parameters.
170
We also checked the robustness of our method with respect to the choice of the so-called
171
bias factor, γ, which is defined as the ratio of the effective CV temperature and system
172
temperature34 . During a metadynamics simulation γ controls the rate at which the height
173
of the individual Gaussians in the bias potential is decreased, and is a key parameter in
174
well-tempered metadynamics. We therefore performed a series of FaMetaD simulations with 8
4
Parameters
10 100
More aggressive 500
40
τ0(ps)
20 40
100 10
10
θ
5 2
τslow (µs)
1
FaMetaD InMetaD
100
Increased accuracy 50 0 Increased reliability
p-value
0.6 0.3
Cost (µs)
0 2
Increased efficiency
1 0 A
B
C
D
E
F
FIG. 2. Comparing the accuracy, reliability and efficiency of frequency-adaptive metadynamics and infrequent metadynamics. The upper panel shows the key parameters (τ0 in InMetaD, θ in FaMetaD) of the six sets (A to F) pairs simulations. The middle panels show a comparison of the accuracy and reliability of the results. The grey line shows the τslow obtained from unbiased MD simulations. The bottom panel shows the computational cost of each set of simulations.
175
different values of γ, and determined the effect on the estimated rates in the pentapeptide
176
(Table S1). We find that varying γ 50-fold (from 2–100, including γ = 15 which is the value
177
used in our other simulations) has essentially no effect on the estimated rates, with the results
178
at very small γ (approaching unbiased MD) and very high γ (approaching non-tempered
179
metadynamics) being within error. Whereas the rates and their reliability (including the
180
estimated p-values in the KS test) are not substantially affected by changing γ in the range
181
tested, the value of γ does affect the efficiency of the simulations. Increasing γ corresponds 9
182
to an initially more aggressive bias deposition, and thus decreased computational cost. Due
183
to the switchover nature of the FaMetaD deposition schedule, increasing γ to very high
184
values does, however, not increase performance nor decrease accuracy.
185
B.
Kinetics of protein unfolding
186
We now consider the more complicated problem of protein unfolding using the ther-
187
mostable TC10b variant of Trp-Cage35 and the GTT variant of the Fip35 WW domain
188
(GTT36 ) as examples. The in silico unfolding times of these two proteins have previously
189
been determined from long-timescale simulations of reversible protein folding1 , giving us the
190
opportunity of comparing the rates from InMetaD and FaMetaD directly with those from
191
unbiased MD (τunf olding = 3 ± 1µs and 80 ± 24µs, respectively). The previous simulations
192
also determined the mean transition-path times, τT P , to be ∼ 0.2µs and ∼ 0.5µs, respec-
193
tively. Such relatively long transition path times could, in principle, make it difficult to use
194
the InMetaD and FaMetaD approaches, since even when the folded state basin is filled, it
195
still takes a substantial time to cross the barrier. Thus, these simulations also probe the
196
robustness of the methods in a system where the basic requirements of the approaches are
197
difficult to gauge or control. Simulations were performed using the same force field and
198
similar simulation parameters (to the extent possible) as those previously used1 , and the
199
fraction of native contacts, Q, as the CV. A simulation was deemed to give rise to an un-
200
folding event when Q dropped below 0.35, a value chosen by comparison with the unbiased
201
simulations1 . Our results show that, for both proteins and both InMetaD and FaMetaD, it
202
is possible to estimate the unfolding rates with good accuracy when compared to the results
203
from unbiased simulations (Table II). While it is difficult to converge the rate estimates, we
204
find that all rates are within a factor of two from the unbiased simulations, and generally
205
within error of one another. As above in the peptide system we note a small, but non-trivial
206
gain in performance with FaMetaD compared to InMetaD at comparable precision of the
207
estimated rates. 10
TABLE II. Unfolding Time Constants of Trp-Cage and a WW domain System
Method
Trp-Cagee
InMetaDa
2±1
0.3±0.3
0.5
FaMetaDb
3±1
0.4±0.2
0.1
Unbiased MD
3±1
InMetaDc
150±45
0.6±0.3
10
FaMetaDd
132±56
0.3±0.2
1.5
Unbiased MD
80±24
WWe
a
τunf olding (µs) p-value Computational Cost (µs)
210
1100
τ0 = 200 ps,h = 0.4 kJ/mol.
b
τ0 = 2 ps,τc = 200 ps,θ = 10,h = 0.4 kJ/mol.
c
d
τ0 = 2 ps,τc = 200 ps,θ = 1,h = 0.2 kJ/mol.
τ0 = 30 ps,h = 0.4 kJ/mol. e
We used the same force field (CHARMM22*37 ), ion concentrations and
temperatures (290K and 360K) for Trp-Cage and the GTT WW domain variant as in the earlier unbiased simulations1 . The error bars were the standard deviation obtained by a bootstrap analysis with 50 subsamples. The sample size is 15 for each set of simulations.
208
C.
Kinetics of protein-ligand binding
209
We consider now the case of a ligand binding to or escaping from a buried internal cavity
210
in the L99A mutant of T4 lysozyme, processes which occur on timescales of milliseconds or
211
more38–40 . To benchmark the results, we performed three sets of metadynamics simulations,
212
up to 26µs in total (including 12 µs InMetaD simulations of T4L L99A with benzene from
213
our recent work26 ). In each set, we compared the results of InMetaD with that of FaMetaD.
214
BN Z BN Z In set 1 and 2 we calculated the time constants for binding (τon ) and unbinding (τof f ) of
215
IN D benzene, while we in set 3 calculated the time for indole to escape the pocked (τof f ). We
216
performed 20 independent runs for each set of simulations (using the CHARMM22* force
217
field37 for protein and CGenFF force field41 for the ligands) to collect the transition times,
218
from which we obtained τon and τof f (Table III).
219
BN Z Again, we find consistency between the results of InMetaD and FaMetaD, with e.g. τon
220
BN Z and τof to be ∼ 10 ms and ∼ 170 ms, respectively. As previously described26 , we can use f
11
TABLE III. Binding and Unbinding Times of T4L L99A with Benzene and Indole Methods Parameters: τ0 (ps), τc (ps), h (kJ/mol) Time (ms) p-value Cost (µs) BN Z ) Set 1: Benzene Binding (τon
InMetaDa
τ0 = 40,h = 0.4
9±5
0.1±0.1
4.4
FaMetaD
τ0 = 1,τc = 100,h = 0.2,θ = 103
14±7
0.2±0.1
3.0
BN Z ) Set 2: Benzene Unbinding (τof f
InMetaDa
τ0 = 100,h = 0.2
168±59
0.4±0.3
6.7
FaMetaD
τ0 = 1,τc = 100,h = 0.2,θ = 103
176±68
0.3±0.2
5.5
IN D ) Set 3: Indole Unbinding (τof f
InMetaD
τ0 = 100,h = 0.2
102±87
0.2±0.2
4.5
FaMetaD
τ0 = 1,τc = 100,h = 0.2,θ = 104
168±95
0.2±0.1
2.0 26
a
The results are from our recent work26 .
b
In all simulations, the ligand concentration
is ∼ 5 mM. The error bars were the standard deviation obtained by a bootstrap analysis with 50 subsamples. The sample size is 20 for each set of simulations.
221
these values to estimate the binding free energy to be ∆Gbinding ≈ −21 kJ/mol, a value that
222
agrees well with calorimetric42 and NMR38 measurements.
223
In set 1, the InMetaD simulation was performed with τ0 = 40 ps and h = 0.4 kJ/mol.
224
The frequency-adaptive scheme allows us to perform FaMetaD with weaker bias (h = 0.2
225
kJ/mol) and ending with longer and more conservative deposition time (τc = 100 ps). This
226
parameter set used less simulation time but resulted in a more reliable estimation. In set
227
2, the FaMetaD simulation was performed using the same τc = 100 ps and h = 0.2 kJ/mol
228
as that used in the InMetaD simulation, but with θ = 103 . Given that τexp ∼ 10 − 100 ms,
229
τsim ∼ 100 ns and Cs = 102 , according to Eq. (3), this allows us to judge that θ = 103 is a
230
BN Z fairly conservative parameter. The estimated values of τof in this set are indistinguishable f
231
between InMetaD and FaMetaD within the error bars but at slightly less computational cost.
232
In set 3, we used similar parameters as in set 2 except a bit more aggressive θ = 104 . Again,
233
IN D this parameter set resulted in τof f of 168 ± 95 ms from FaMetaD that is reasonably close
234
to the estimation from InMetaD. Remarkably, FaMetaD allowed us to reduce more than 12
235
half the computational cost, without loss of accuracy. Overall, the application in the case
236
of L99A binding with two ligands suggests that the frequency-adaptive scheme can improve
237
the reliability-accuracy-efficency balance of metadynamics on kinetics calculation.
238
IV.
CONCLUSIONS
239
Many biological processes occur far from equilibrium, and kinetic properties can play
240
an important role in biology and biochemistry. For example, there has been continued
241
interest in determining ligand residence times in the context of drug optimization43 , and
242
millisecond conformational dynamics in an enzyme has been shown to underlie an intriguing
243
phenomenon of kinetic cooperativity of relevance to disease44 .
244
The principle of microscopic reversibility, however, sets limits to how the kinetic prop-
245
erties can be varied independently of thermodynamics. Thus, for example, increasing the
246
residence time of a ligand will also increase its thermodynamic affinity, unless there is also
247
a simultaneous drop in the rate of binding. Thus, in practice it may be in many cases be
248
difficult to disentangle the effects of kinetics and thermodynamics.
249
Taken together, the above considerations suggest the need for improved methods for un-
250
derstanding and ultimately predicting the rate constants of biological processes. Further,
251
the ability to calculate the kinetics of conformational exchange or ligand binding and un-
252
binding provides an alternative approach to determine equilibrium properties26 . For these
253
and other reasons, several methods have been developed to enable the estimation of kinetics
254
from simulations15 .
255
We have here proposed a modification to the already very powerful InMetaD algorithm,
256
which leads to a further improvement in the reliability and accuracy in recovering unbiased
257
transition rates from biased metadynamics simulations. The basic idea in the resulting
258
FaMetaD approach is that, by filling up the basin more rapidly in the beginning and only
259
using an infrequent bias near the barrier, we may spend the computational time where it is
260
needed. We anticipate that our scheme will prove particularly useful in two different aspects.
261
First, by enabling a decreased bias in the transition state region, we obtain more accurate
262
kinetics at fixed computational cost, leading also to the observed robustness of our approach.
263
Second, in the case of large barriers, it would be prohibitively slow to use a very infrequent
264
bias through the entire duration of the simulation. Thus, the FaMetaD provides a practical 13
265
approach to study rare events that involve escaping deep free energy minima, and which
266
occur on timescales well beyond those accessible to current simulation methods. Here we
267
have opted to examine processes that can be studied using both InMetaD and FaMetaD,
268
but in the future we aim to apply this approach to barrier crossing events that occur on
269
even longer timescales. With more examples in hand, we also expect that in the future it
270
may be possible to design improved biasing schemes to increase the computational efficiency
271
further, and at the same time retain the robustness of the FaMetaD approach.
272
V.
See Supplementary Material for supplementary figures and table.
273
274
SUPPLEMENTARY MATERIAL
VI.
ACKNOWLEDGEMENT
The authors thank Tristan Bereau and Claudio Perego for a critical reading of the
275
276
manuscript.
277
Nordisk Foundation and the BRAINSTRUC initiative from the Lundbeck Foundation. M.P.
278
acknowledges funding from the National Centre for Computational Design and Discovery of
279
Novel Materials MARVEL and European Union grant ERC-2014-AdG-670227/VARMET.
280
REFERENCES
281
1
K. Lindorff-Larsen, S. Piana, R. O. Dror, and D. E. Shaw, “How fast-folding proteins fold,” Science 334, 517–520 (2011).
282
283
K.L.-L. acknowledges funding by a Hallas-Møller Stipend from the Novo
2
I. Buch, T. Giorgino, and G. De Fabritiis, “Complete reconstruction of an enzyme-inhibitor
284
binding process by molecular dynamics simulations,” Proc Natl Acad Sci 108, 10184–10189
285
(2011).
286
3
P. Tiwary, V. Limongelli, M. Salvalaglio, and M. Parrinello, “Kinetics of protein–ligand
287
unbinding: Predicting pathways, rates, and rate-limiting steps,” Proc Natl Acad Sci 112,
288
E386–E391 (2015).
289
4
F. Paul, C. Wehmeyer, E. T. Abualrous, H. Wu, M. D. Crabtree, J. Sch¨oneberg, J. Clarke,
290
C. Freund, T. R. Weikl, and F. No´e, “Protein-peptide association kinetics beyond the
291
seconds timescale from atomistic simulations,” Nature Communications 8, 1095 (2017). 14
292
5
a computational microscope for molecular biology,” Annu Rev Biophys 41, 429–452 (2012).
293
294
6
T. J. Lane, D. Shukla, K. A. Beauchamp, and V. S. Pande, “To milliseconds and beyond: challenges in the simulation of protein folding,” Curr Opin Struct Biol 23, 58–65 (2013).
295
296
R. O. Dror, R. M. Dirks, J. Grossman, H. Xu, and D. E. Shaw, “Biomolecular simulation:
7
J.-H. Prinz, H. Wu, M. Sarich, B. Keller, M. Senne, M. Held, J. D. Chodera, C. Sch¨ utte,
297
and F. No´e, “Markov models of molecular kinetics: Generation and validation,” J Chem
298
Phys 134, 174105 (2011).
299
8
Review X 6, 011009 (2016).
300
301
9
10
306
7, 1244–1252 (2011). 11 ˇ S. a Beccara, T. Skrbi´ c, R. Covino, and P. Faccioli, “Dominant folding pathways of a ww domain,” Proceedings of the National Academy of Sciences 109, 2330–2335 (2012).
307
308
D. R. Glowacki, E. Paci, and D. V. Shalashilin, “Boxed molecular dynamics: decorrelation time scales and the kinetic master equation,” Journal of chemical theory and computation
304
305
L. T. Chong and D. M. Zuckerman, “Weighted ensemble simulation methods and applications,” Annu Rev Biophys 46 (2016).
302
303
B. Trendelkamp-Schroer and F. No´e, “Efficient estimation of rare-event kinetics,” Physical
12
R. C. Bernardi, M. C. Melo, and K. Schulten, “Enhanced sampling techniques in molecular
309
dynamics simulations of biological systems,” Biochim Biophys Acta, Gen Subj 1850, 872–
310
877 (2015).
311
13
O. Valsson, P. Tiwary, and M. Parrinello, “Enhancing important fluctuations: Rare events
312
and metadynamics from a conceptual viewpoint,” Annu Rev Phys Chem 67, 159–184
313
(2016).
314
14
related methods in drug discovery,” J Med Chem (2016).
315
316
15
16
17
323
G. Bussi and D. Branduardi, “Free-energy calculations with metadynamics: theory and practice,” Reviews in Computational Chemistry Volume 28 , 1–49 (2015).
321
322
A. Laio and M. Parrinello, “Escaping free-energy minima,” Proc Natl Acad Sci 99, 12562– 12566 (2002).
319
320
N. J. Bruce, G. K. Ganotra, D. B. Kokh, S. K. Sadiq, and R. C. Wade, “New approaches for computing ligand–receptor binding kinetics,” Curr Opin Struct Biol 49, 1–10 (2018).
317
318
M. De Vivo, M. Masetti, G. Bottegoni, and A. Cavalli, “Role of molecular dynamics and
18
P. Tiwary and M. Parrinello, “From metadynamics to dynamics,” Phys Rev Lett 111, 230602 (2013). 15
324
19
flooding for rate calculation,” Phys Rev Lett 115, 070601 (2015).
325
326
20
21
22
K. L. Fleming, P. Tiwary, and J. Pfaendtner, “New approach for investigating reaction dynamics and rates with ab initio calculations,” J Phys Chem A 120, 299–305 (2016).
331
332
J. Mondal, P. Tiwary, and B. Berne, “How a kinase inhibitor withstands gatekeeper residue mutations,” J Am Chem Soc 138, 4608–4615 (2016).
329
330
P. Tiwary, J. Mondal, J. A. Morrone, and B. Berne, “Role of water and steric constraints in the kinetics of cavity–ligand unbinding,” Proc Natl Acad Sci 112, 12015–12019 (2015).
327
328
J. McCarty, O. Valsson, P. Tiwary, and M. Parrinello, “Variationally optimized free-energy
23
H.-J. Tung and J. Pfaendtner, “Kinetics and mechanism of ionic-liquid induced protein un-
333
folding: application to the model protein HP35,” Molecular Systems Design & Engineering
334
1, 382–390 (2016).
335
24
H. Sun, Y. Li, M. Shen, D. Li, Y. Kang, and T. Hou, “Characterizing drug–target resi-
336
dence time with metadynamics: how to achieve dissociation rate efficiently without losing
337
accuracy against time-consuming approaches,” J Chem Inf Model 57, 1895–1906 (2017).
338
25
C. D. Fu, L. F. L. Oliveira, and J. Pfaendtner, “Determining energy barriers and selectivi-
339
ties of a multi-pathway system with infrequent metadynamics,” J Chem Phys 146, 014108
340
(2017).
341
26
Y. Wang, J. M. Martins, and K. Lindorff-Larsen, “Biomolecular conformational changes
342
and ligand binding: from kinetics to thermodynamics,” Chemical Science 8, 6466–6473
343
(2017).
344
27
streptavidin,” J Phys Chem B (2017).
345
346
P. Tiwary, “Molecular determinants and bottlenecks in the dissociation dynamics of biotin-
28
R. Casasnovas, V. Limongelli, P. Tiwary, P. Carloni, and M. Parrinello, “Unbinding
347
kinetics of a p38 map kinase type II inhibitor from metadynamics simulations,” J Am
348
Chem Soc 139, 4780–4788 (2017).
349
29
binding site?” Sci Adv 3, e1700014 (2017).
350
351
30
354
H. Grubm¨ uller, “Predicting slow structural transitions in macromolecular systems: Conformational flooding,” Phys Rev E 52, 2893 (1995).
352
353
P. Tiwary, J. Mondal, and B. Berne, “How and when does an anticancer drug leave its
31
A. F. Voter, “Hyperdynamics: Accelerated molecular dynamics of infrequent events,” Phys Rev Lett 78, 3908 (1997).
16
355
32
reconstructed from metadynamics,” J Chem Theo Comput 10, 1420–1425 (2014).
356
357
33
34
A. Barducci, G. Bussi, and M. Parrinello, “Well-tempered metadynamics: a smoothly converging and tunable free-energy method,” Physical review letters 100, 020603 (2008).
360
361
G. A. Tribello, M. Bonomi, D. Branduardi, C. Camilloni, and G. Bussi, “Plumed 2: New feathers for an old bird.” Comput Phys Commun 185, 604–613 (2014).
358
359
M. Salvalaglio, P. Tiwary, and M. Parrinello, “Assessing the reliability of the dynamics
35
B. Barua, J. C. Lin, V. D. Williams, P. Kummler, J. W. Neidigh, and N. H. Andersen,
362
“The Trp-cage: optimizing the stability of a globular miniprotein,” Protein Engineering,
363
Design & Selection 21, 171–185 (2008).
364
36
S. Piana, K. Sarkar, K. Lindorff-Larsen, M. Guo, M. Gruebele, and D. E. Shaw, “Com-
365
putational design and experimental testing of the fastest-folding β-sheet protein,” Journal
366
of molecular biology 405, 43–48 (2011).
367
37
with respect to force field parameterization?” Biophys J 100, L47–L49 (2011).
368
369
38
V. A. Feher, E. P. Baldwin, and F. W. Dahlquist, “Access of ligands to cavities within the core of a protein is rapid,” Nat Struct Biol 3 (1996).
370
371
S. Piana, K. Lindorff-Larsen, and D. E. Shaw, “How robust are protein folding simulations
39
G. Bouvignies, P. Vallurupalli, D. F. Hansen, B. E. Correia, O. Lange, A. Bah, R. M.
372
Vernon, F. W. Dahlquist, D. Baker, and L. E. Kay, “Solution structure of a minor and
373
transiently formed state of a T4 lysozyme mutant,” Nature 477, 111–114 (2011).
374
40
populated conformations on a complex energy landscape,” eLife 5, e17505 (2016).
375
376
Y. Wang, E. Papaleo, and K. Lindorff-Larsen, “Mapping transiently formed and sparsely
41
K. Vanommeslaeghe, E. Hatcher, C. Acharya, S. Kundu, S. Zhong, J. Shim, E. Darian,
377
O. Guvench, P. Lopes, and I. Vorobyov, “Charmm general force field: A force field for
378
drug-like molecules compatible with the charmm all-atom additive biological force fields,”
379
J Comput Chem 31, 671–690 (2010).
380
42
binding in an interior nonpolar cavity of T4 lysozyme,” Biochemistry 34, 8564–8575 (1995).
381
382
43
R. A. Copeland, D. L. Pompliano, and T. D. Meek, “Drug–target residence time and its implications for lead optimization,” Nature reviews Drug discovery 5, 730 (2006).
383
384
A. Morton, W. A. Baase, and B. W. Matthews, “Energetic origins of specificity of ligand
44
M. Larion, A. L. Hansen, F. Zhang, L. Bruschweiler-Li, V. Tugarinov, B. G. Miller, and
385
R. Br¨ uschweiler, “Kinetic cooperativity in human pancreatic glucokinase originates from
386
millisecond dynamics of the small domain,” Angewandte Chemie 127, 8247–8250 (2015). 17
Frequency-adaptive Scheme
τc
τdep(t) (ps)
100
frequency in InMetaD
10
1
τ0 0
Figure 1:
1
frequency in MetaD 20 40 60 80 100 Metadynamics Simulation Time (ns)
Parameters
104 100
More aggressive 500
40
τ0(ps)
20 40
100 10
10
θ
5 2
τslow (µs)
1
FaMetaD InMetaD
100
Increased accuracy 50 0 Increased reliability
p-value
0.6 0.3
Cost (µs)
0 2
Increased efficiency
1 0 A
B
C
D
E
F