Molecular dynamics simulations using empirical force fields:

Research Collection

Doctoral Thesis

Molecular dynamics simulations using empirical force fields: principles and applications to selected systems of chemical and biochemical interest

Author(s): Hünenberger, Philippe Henry

Publication Date: 1997

Permanent Link: https://doi.org/10.3929/ethz-a-001730163

Rights / License: In Copyright - Non-Commercial Use Permitted


Diss. ETH No. 12052

Molecular dynamics simulations using empirical force fields:
Principles and applications to selected systems of chemical and biochemical interest

A dissertation submitted to the
SWISS FEDERAL INSTITUTE OF TECHNOLOGY

for the degree of
Doctor of Natural Sciences

presented by
Philippe Henry Hünenberger
Dipl. Chem. UNIL
born 23.5.1970
citizen of Basel BS and Zurich ZH

accepted on the recommendation of
Prof. Dr. W.F. van Gunsteren, examiner
Prof. Dr. R.R. Ernst, co-examiner

1997

To my parents, my two grandmothers, and to Valérie

CONTENTS

Résumé

Summary

Preface

CHAPTER 1: Empirical classical force fields for molecular systems

1.1. Summary
1.2. Introduction
1.3. Choice of the explicit degrees of freedom of the model
  1.3.1. Gas-phase force fields
  1.3.2. Condensed-phase force fields
  1.3.3. Mean-solvent force fields
  1.3.4. Low-resolution force fields
  1.3.5. Hybrid force fields
1.4. Assumptions underlying empirical classical interaction functions
  1.4.1. Implicit degrees of freedom and the assumption of weak correlation
  1.4.2. Energy terms and the assumption of transferability
  1.4.3. Coordinate redundancy and assumption of transferability
  1.4.4. Choices made in the averaging processes
1.5. General characteristics of the empirical interaction function
  1.5.1. Interaction function parameters and molecular topology
  1.5.2. Atom types and combination rules
  1.5.3. Expression for the classical Hamiltonian
1.6. Interaction function terms used in current force fields
  1.6.1. Bond-stretching term
    1.6.1.1. Functional forms
    1.6.1.2. Combination rules
  1.6.2. Bond-angle bending term
    1.6.2.1. Functional forms
    1.6.2.2. Combination rules
  1.6.3. Torsional dihedral angle term
    1.6.3.1. Functional forms
    1.6.3.2. Combination rules
  1.6.4. Out-of-plane coordinate distortion term
    1.6.4.1. Functional forms
    1.6.4.2. Combination rules
  1.6.5. Valence coordinates cross terms
    1.6.5.1. Functional forms
  1.6.6. van der Waals interaction
    1.6.6.1. Functional forms
    1.6.6.2. Combination rules
  1.6.7. Electrostatic interaction
    1.6.7.1. Functional forms
    1.6.7.2. Combination rules
  1.6.8. Coupling between covalent coordinates and electrostatic interactions
  1.6.9. Hydrogen-bonding term
1.7. Force field parametrization procedures
  1.7.1. The basic problem
  1.7.2. Source of data for force-field parametrization or validation
  1.7.3. Force-field parametrization using mostly experimental data
  1.7.4. Systematic parametrization for simple condensed-phase systems
    1.7.4.1. By trial and error
    1.7.4.2. Using sensitivity analysis
    1.7.4.3. Using the weak-coupling method
    1.7.4.4. Using a search method in parameter space
    1.7.4.5. Using the perturbation formula
  1.7.5. Systematic parametrization using results from ab initio calculations in vacuum
  1.7.6. Technical difficulties in the calibration of force fields
    1.7.6.1. Parameter interdependence
    1.7.6.2. Parameter dependence on degrees of freedom (D), functional form (F), combination rules (C) and approximations (A)
    1.7.6.3. Parameter dependence on the molecule training set and calibration observables
    1.7.6.4. Non-convergence of important observables
    1.7.6.5. Existence of conflicting requirements
    1.7.6.6. Force-field mixing problems
    1.7.6.7. Validation of a force field and comparison of force fields
1.8. Conclusion
1.9. References

CHAPTER 2: Alternative schemes for the inclusion of a reaction-field correction into molecular dynamics simulations: Influence on the simulated energetic, structural and dielectric properties of liquid water

2.1. Summary
2.2. Introduction
2.3. Theory
  2.3.1. Schemes for the Coulomb interaction with straight cutoff truncation
  2.3.2. Schemes for the application of a reaction-field correction to the truncation
  2.3.3. Comparison of the different schemes for dipoles at a distance close to the cutoff
  2.3.4. Ewald summation scheme
2.4. Molecular model and computational procedures
  2.4.1. Simulation setup
  2.4.2. Analysis of the energetic properties
  2.4.3. Analysis of the diffusion properties
  2.4.4. Analysis of the structural properties
  2.4.5. Analysis of the dielectric properties
2.5. Results
  2.5.1. Thermodynamic and energetic properties
  2.5.2. Diffusion properties
  2.5.3. Structural properties
  2.5.4. Dielectric properties
2.6. Conclusion
2.7. References

APPENDIX A: Reaction potential and field in a spherical cavity
APPENDIX B: Calculation of the dielectric permittivity tensor based on the dipole moment fluctuations

CHAPTER 3: Hydrogen-bonded diastereotopic interactions in a model complex

3.1. Summary
3.2. Introduction
3.3. Molecular model and computational procedure
3.4. Results
  3.4.1. Population analysis
  3.4.2. Thermodynamic properties
  3.4.3. Sensitivity of the thermodynamic properties to selected model parameters
  3.4.4. Lifetimes of the complex and hydrogen bonds
  3.4.5. Hydrogen-bonding patterns
3.5. Discussion
3.6. Conclusion
3.7. References

APPENDIX C: Thermodynamic extrapolations using the perturbation formula
APPENDIX D: Thermodynamic extrapolations using a Taylor expansion
APPENDIX E: The homogeneity function

CHAPTER 4: Fluctuation and cross-correlation analysis of protein motions observed in nanosecond molecular dynamics simulations

4.1. Summary
4.2. Introduction
4.3. Methods
  4.3.1. Molecular model and computational procedure
  4.3.2. Analysis procedure
4.4. Results and discussion
  4.4.1. Stability of the simulations
  4.4.2. Simulated B-factors calculated using averaging windows of different lengths
  4.4.3. Correlation between simulated and experimental B-factors over different averaging windows
  4.4.4. Build-up time of fluctuations and cross-correlations
  4.4.5. Cross-correlations
  4.4.6. Influence of the fitting procedure
4.5. Conclusion
4.6. References

CHAPTER 5: Computational approaches to study protein unfolding: Hen egg white lysozyme as a case study

5.1. Summary
5.2. Introduction
  5.2.1. Study of protein folding and unfolding by computer simulation
  5.2.2. Experimental data on the folding and unfolding of hen egg white lysozyme
5.3. Methods
  5.3.1. Molecular model and computational procedures
  5.3.2. Temperature induced unfolding: T-run
  5.3.3. Pressure induced unfolding: P-run
  5.3.4. Constant radial force induced unfolding: F-run
  5.3.5. Kinetic energy gradient driven unfolding: K-run
5.4. Results
  5.4.1. Temperature induced unfolding: T-run
  5.4.2. Pressure induced unfolding: P-run
  5.4.3. Constant radial force induced unfolding: F-run
  5.4.4. Kinetic energy gradient driven unfolding: K-run
  5.4.5. Peptide amide hydrogen bonding to water
5.5. Discussion
  5.5.1. Methodology: temperature induced unfolding
  5.5.2. Methodology: pressure induced unfolding
  5.5.3. Methodology: artificial driving forces
  5.5.4. Peptide amide proton protection
  5.5.5. Models for folding and unfolding
5.6. Conclusion
5.7. References

APPENDIX F: Algorithm for the kinetic energy gradient driven unfolding
APPENDIX G: Definition of the uniform deformation rate vectors
APPENDIX H: Conservation of the overall kinetic energy in the kinetic energy gradient driven unfolding

Outlook

Acknowledgements

Curriculum vitae

Résumé

With the continual growth in computer power, recent decades have seen a rapid increase in the number, performance and accuracy of the computational methods used in theoretical chemistry. Among these methods, the empirical classical force fields used in molecular modelling occupy a unique place, because they currently make it possible to generate up to 10^6-10^7 configurations (in terms of dynamics, a few tens of nanoseconds) of systems comprising up to 10^5-10^6 atoms. The calculation of a number of thermodynamic observables by means of statistical mechanics is thus within our reach, as is the simulation of solutions, of biological macromolecules in solution and of lipid membranes, for example. The application of these empirical force fields is at present the only method capable of treating a large number of extremely important chemical and biochemical problems: the description of polar liquids (Chapter 2), of equilibria in solution (Chapter 3), and the dynamics of biological macromolecules in solution at equilibrium (Chapter 4) or out of equilibrium (Chapter 5). In the examples presented, particular attention is paid to methodological developments and to a detailed analysis of the simulated results. Without losing sight of the fact that the reliability of studies using classical force fields depends essentially on the accuracy of the interaction function used and on the region sampled in conformational space, we have adopted a critical point of view with respect to the simulation results and to the possibilities and limitations of the methods employed.

Chapter 1 is a general introduction to empirical force fields. Methods based on these force fields are compared with other theoretical models. Also discussed are the characteristics of the system and of the observables studied, which determine the type of model that should be chosen to solve a specific problem. When the use of a classical force field seems the appropriate choice, the omission of certain atoms or groups of atoms from the interaction function (e.g. solvent, protein side chains) reduces the computing time. The addition of certain quantum degrees of freedom (e.g. protons, electrons) may, for its part, allow the study of acid-base properties or of (bio-)chemical reactions. By attempting to interpret classical interaction functions in terms of elementary physical principles, we can identify the assumptions that lead to a description of the interaction as a sum of analytical terms. The conditions that must be satisfied for a force field to give a correct thermodynamic and dynamical description of molecular systems are also discussed. Finally, after listing the different possible choices in the number, type and functional form of the terms used, as well as the combination rules employed in commonly used force fields, we address the difficult problem of optimizing the parameters from experimental information.

The problem of the treatment of electrostatic interactions in polar liquids is addressed in Chapter 2. The overall accuracy of a force field is limited by the crudest approximation that enters into its definition. When polar solvents (typically water) or macromolecules in solution are simulated, the treatment of long-range electrostatic interactions often plays a crucial role in the reliability and accuracy of the calculated properties. To restrict the computing time and allow the use of periodic boundary conditions for the sample studied, it is common practice to truncate these interactions at a given distance. However, straight truncation gives rise to serious artifacts in many simulated properties. These artifacts can be reduced e.g. by (i) calculating the interaction using lattice sums, (ii) a truncation based on charge groups, or (iii) the inclusion of corrections based on reaction-field theory. These three methods are applied in simulations of liquid water and compared in detail. The results show that when a correction based on reaction-field theory is included, a truncation based on charge groups gives poorer results than a truncation based on interatomic distances. In the latter case, the calculated properties are in excellent agreement with those calculated using lattice sums. Moreover, this last method leads to electrostatic forces that vanish at the cutoff distance and is less costly in computing time than the methods based on lattice sums.

The thermodynamics and dynamics of formation of hydrogen-bonded complexes of two simple organic molecules (cyclopentanediol and cyclohexanediamine) have been studied by molecular dynamics simulation. The results of this study are reported in Chapter 3. Experimentally, the thermodynamic parameters of complexation of the two diastereomeric pairs of molecules, measured in benzene at 298K, show unexpected differences that call for a theoretical justification. We present long (0.1 µs) molecular dynamics simulations of the two systems in a simplified solvent model. The length of these simulations makes it possible to study exhaustively the species present at equilibrium, which enables a free energy calculation by direct counting. The results are in good agreement with the experimental values for one of the diastereomeric pairs, but in disagreement for the other pair. This may be due either to the inaccuracy of the model parameters or to the assumptions inherent to the model itself. A systematic analysis of the influence of the parameters on the simulated results indicates that there is no possibility of improving the enantioselectivity by adjusting the parameters. Consequently, the assumptions underlying the model itself must be called into question. This disagreement between theoretical and experimental results may be due to specific interactions with the solvent (benzene), to the formation of aggregates of more than two solute molecules, to an inadequate modelling of hydrogen bonds, or to electronic charge redistribution effects.

The problem of the convergence of observables in equilibrium simulations of proteins in solution is addressed in Chapter 4. Molecular dynamics simulations of two proteins, BPTI and HEWL, are studied in terms of atomic positional fluctuations and of cross-correlations between atomic motions. The fluctuations are often calculated because they can formally be related to the B-factors determined from crystallographic data. The cross-correlations are used as an indirect measure of certain important functional processes that take place in proteins on timescales much longer than the nanosecond (e.g. folding, binding to a substrate). The assumption is made that the correlations of fluctuations observed on the short timescale of the simulations are representative of larger-amplitude motions that would occur over longer times or upon binding to a substrate. Two conditions are necessary for such an extrapolation to be meaningful: (i) that the system be in equilibrium during the simulation and (ii) that the fluctuations and their correlations have converged during the simulation period. The systematic analysis of two trajectories, based on time windows of different lengths and on the examination of the time evolution of the properties mentioned, clearly shows that although condition (i) is satisfied for the two proteins studied, condition (ii) is not fulfilled on the nanosecond timescale. In addition, the comparison of simulated and experimental B-factors is questionable because of the very different environments, timescales and determination techniques involved in the experiment and the simulation.

In Chapter 5, we study the influence of different types of applied perturbations on the relaxation process observed in non-equilibrium simulations of proteins in solution. Within the scientific community, considerable effort is being made to elucidate the way in which a given sequence of amino acids folds into a unique three-dimensional structure, and along which dynamical pathway(s) this process occurs. For practical reasons, when a force field of atomic resolution (including explicit solvent) is used, only the unfolding process can be studied. The assumption is then made that the initial stages of unfolding can be compared with the final stages of folding. Simulating the unfolding of a protein within a nanosecond represents an acceleration by a factor of 10^6-10^9 with respect to unfolding in vitro. This is why the application of an extremely strong artificial driving force is necessary, and the relaxation pathway will then depend simultaneously on the characteristics of the protein and on those of the chosen driving force. To address this problem, we present and compare molecular dynamics simulations of the unfolding of HEWL using different driving forces: increased temperature (500K), increased pressure (10kbar), application of a radial force to all atoms, and a method based on the redistribution of atomic velocities. The unfolding pathways observed depend strongly on the chosen driving force, and their interpretation in terms of a folding mechanism is rather questionable. Despite this, the comparison between the simulated results and certain experimental data (amide proton exchange, heat capacities, compressibilities) indicates that these simulations can help to understand the stability of proteins as well as the relative stability of the secondary structure elements that constitute them.
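The free energy calculation by direct counting mentioned for Chapter 3 can be illustrated with a minimal sketch. It is not the procedure used in the thesis; it assumes a two-state classification of trajectory frames (complex formed or not), a single pair of solute molecules in a periodic box of fixed volume, ideal-solution behaviour, and a standard concentration of 1 mol/L. The function name `binding_free_energy` and its arguments are hypothetical.

```python
import math

R_GAS = 8.314462618e-3    # gas constant in kJ mol^-1 K^-1
AVOGADRO = 6.02214076e23  # Avogadro constant in mol^-1


def binding_free_energy(bound_flags, box_volume_nm3, temperature_k=298.0):
    """Standard binding free energy (kJ/mol) from counting bound frames.

    bound_flags: one boolean per frame, True when the complex is formed
    (hypothetical input; the criterion for "bound" is left to the user).
    box_volume_nm3: volume of the periodic box holding one solute pair.
    """
    n_frames = len(bound_flags)
    n_bound = sum(1 for flag in bound_flags if flag)
    if n_bound == 0 or n_bound == n_frames:
        raise ValueError("both states must be sampled to estimate a free energy")

    f_bound = n_bound / n_frames
    volume_litre = box_volume_nm3 * 1e-24           # 1 nm^3 = 1e-24 L
    conc_total = 1.0 / (AVOGADRO * volume_litre)    # mol/L of each solute

    # K_a = [AB] / ([A][B]) with [AB] = f*c and [A] = [B] = (1 - f)*c
    k_assoc = f_bound / ((1.0 - f_bound) ** 2 * conc_total)  # L/mol

    standard_conc = 1.0  # mol/L, assumed standard state
    return -R_GAS * temperature_k * math.log(k_assoc * standard_conc)
```

The estimate is only meaningful when both states are visited many times, which is the reason the summary stresses the length of the simulations.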

Summary

With the continuing increase of the power of computers, the past decades have seen a rapid increase in the number, performance and accuracy of theoretical computational methods in chemistry. Empirical classical force fields for molecular simulation occupy a unique place among these since they can currently handle systems of up to 10^5-10^6 atoms and generate up to 10^6-10^7 configurations of these systems (tens of nanoseconds in terms of dynamics). This means that a number of thermodynamic observables can be computed using statistical mechanics, and that the simulation of solutions, solvated biomolecules or lipid membranes, for example, is within reach. Empirical force fields are therefore the only available methods in a position to address many extremely important chemical and biochemical problems, such as the description of polar liquids (Chapter 2), equilibrium in solution (Chapter 3), the dynamics of solvated biomolecules at equilibrium (Chapter 4) or under non-equilibrium conditions (Chapter 5). The main focus in the applications reported in the present work is towards methodological developments and a detailed analysis of the simulated results. Keeping in mind that the reliability of studies using classical force fields will depend crucially on the accuracy of the interaction function and the extent of conformational space that has been sampled, a critical point of view is adopted with respect to the outcome of the simulations, and the power and limitations of the methods employed are discussed.

In Chapter 1, a general introduction to empirical classical force fields is given. Empirical force-field methods are compared with other theoretical models, and the characteristics of the system and observable(s) of interest that determine which type of model should be chosen in order to solve a given problem are discussed. When empirical classical force fields seem the appropriate choice, the omission of selected atoms (e.g. solvent, protein sidechains) from the interaction function may lead to an increased computational efficiency. Alternatively, the inclusion of selected quantum-mechanical degrees of freedom (e.g. protons, electrons) may allow for a description of acid-base properties or (bio-)chemical reactions. By trying to understand the physical basis of the classical interaction functions in terms of first principles, the assumptions that lead to an analytical description as a sum of force field terms (building blocks) and the conditions to be satisfied for a force field to give a correct thermodynamic and dynamical description of molecular systems are identified. Finally, possible choices in the number, type, functional form and combination rules defining the terms found in currently used force fields are listed, and the difficult problem of parameter calibration from experimental data is discussed.

In Chapter 2, the problem of the treatment of electrostatic interactions in polar liquids is addressed. The overall accuracy of a force field is limited by the crudest approximation that enters into its definition. When simulating polar solvents (typically water) or (bio-)molecules in solution, the treatment of long-range electrostatic interactions plays a crucial role in the reliability and accuracy of the simulated properties. To limit the computational effort and allow for the use of periodic boundary conditions, it is common practice to truncate these interactions at some convenient (cutoff) distance. Straight truncation, however, results in serious artifacts in many simulated properties. These artifacts may be reduced e.g. by (i) the use of a lattice sum method, (ii) the use of a charge-group truncation scheme, or (iii) the inclusion of a reaction-field correction. These three methods are applied in simulations of liquid water and compared in detail. The results show that when a reaction-field correction is included using an atomic truncation scheme instead of the (usual) charge-group truncation scheme, the simulated properties calculated using a lattice sum method and those calculated using the reaction-field correction scheme are in excellent agreement. In addition, the latter scheme leads to electrostatic forces which vanish at the cutoff distance and is computationally cheaper than lattice sum methods.

In Chapter 3, the equilibrium thermodynamics and dynamics of formation of hydrogen-bonded complexes of two simple organic molecules, cyclopentanediol and cyclohexanediamine, are studied by molecular dynamics simulations. Experimentally, the thermodynamic parameters for binding of the two diastereomeric pairs of molecules, measured in benzene at 298K, display unexpected differences, appealing for a theoretical rationalization. Long timescale (0.1 µs) molecular-dynamics simulations of the two systems in a simplified solvent model are reported. Due to the length of the simulations, a well-converged picture is obtained, the species present at equilibrium can be studied exhaustively, and a free energy calculation by direct counting is possible. The results are in good agreement with experiment for one diastereomeric pair, but nearly identical results are obtained for the other pair, in disagreement with the experiment. This can be due to either (i) the inaccuracy of the model parameters or (ii) the assumptions inherent to the model itself. A systematic analysis of the dependence of simulation results on the model parameters is performed and no possibility is found to improve the enantioselectivity by parameter tuning. Therefore, the assumptions underlying the model itself should be questioned. Possible causes for discrepancies may be specific solute-solvent (benzene) interactions, formation of clusters involving more than two solute molecules, improper modelling of hydrogen bonds or electronic charge redistribution effects.

In Chapter 4, the problem of the convergence of observables in simulations of solvated proteins is addressed. Nanosecond molecular-dynamics equilibrium simulations of two proteins, BPTI and HEWL, are studied in terms of atomic positional fluctuations and cross-correlations of atomic motions. The former are often monitored because they can formally be related to experimental X-ray crystallographic B-factors. The latter are used as an indirect measure for many functionally important processes that occur in proteins on timescales much longer than nanoseconds (e.g. folding, substrate binding). The assumption is made there that fluctuation correlations observed on the short simulation timescale are representative of the larger amplitude motions that would occur on a longer timescale or upon substrate binding. Two necessary conditions for such an extrapolation to be meaningful are that (i) the system is in equilibrium during the simulation and (ii) fluctuations and correlations are converged within the simulation period. Systematic analysis of the two trajectories using time windows of different lengths and by monitoring the time development of the above properties clearly shows that although condition (i) is fulfilled, condition (ii) is not satisfied on the nanosecond timescale for the two proteins under study. Additionally, the comparison of simulated and experimental B-factors is questionable due to the very different environments, timescales and determination techniques involved in the experiment and the simulation.

In Chapter 5, the problem of the dependence of the relaxation pathway on the type of perturbation applied in non-equilibrium simulations of a solvated protein is addressed. There is considerable interest in the scientific community in trying to elucidate the way a given sequence of amino acids folds into a unique three-dimensional structure and along which dynamical pathway(s) this process occurs. For practical reasons, when a force field at atomic resolution is used (including explicit solvent), solely the unfolding process can be studied. The assumption is then made that the onset of unfolding may be compared with the latest stages of folding. Simulating the unfolding of a protein within a nanosecond represents a speedup of a factor 10^6-10^9 with respect to unfolding in vitro. Thus, the application of an extremely strong artificial driving force is required, and the relaxation pathway will depend simultaneously on the characteristics of the protein and those of the selected driving force. To address this problem, molecular-dynamics simulations of the unfolding of HEWL using different driving forces are presented and compared: increased temperature (500K), increased pressure (10kbar), application of a radial force to all atoms, and a method based on velocity scaling. The unfolding pathways are found to be strongly driving-force dependent, and their interpretation in terms of protein folding mechanism is rather questionable. On the other hand, comparisons of the simulation results with experimental data (peptide amide proton exchange, heat capacities, compressibilities) indicate that these simulations may provide insight into the stability of proteins and the relative stability of their constituting structural elements.
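The formal link between atomic positional fluctuations and B-factors mentioned for Chapter 4 is the standard isotropic relation B = (8π²/3)⟨Δr²⟩. The sketch below is only an illustration of that relation and of averaging over time windows; it is not the analysis code of the thesis. It assumes the trajectory has already been fitted to a reference structure to remove overall translation and rotation, and the names `b_factors` and `windowed_b_factors` are hypothetical.

```python
import math


def b_factors(frames):
    """Isotropic B-factors from a list of frames.

    frames[f][a] is the (x, y, z) position of atom a in frame f.
    The result has the squared length unit of the input coordinates.
    """
    n_frames = len(frames)
    n_atoms = len(frames[0])
    result = []
    for atom in range(n_atoms):
        # Mean position of this atom over the frames
        mean = [sum(frame[atom][k] for frame in frames) / n_frames for k in range(3)]
        # Mean-square positional fluctuation around the mean position
        msf = sum(
            sum((frame[atom][k] - mean[k]) ** 2 for k in range(3))
            for frame in frames
        ) / n_frames
        result.append(8.0 * math.pi ** 2 / 3.0 * msf)
    return result


def windowed_b_factors(frames, window):
    """Average of the B-factors computed over consecutive windows of `window` frames."""
    n_windows = len(frames) // window
    if n_windows == 0:
        raise ValueError("window is longer than the trajectory")
    per_window = [
        b_factors(frames[w * window:(w + 1) * window]) for w in range(n_windows)
    ]
    n_atoms = len(per_window[0])
    return [sum(b[a] for b in per_window) / n_windows for a in range(n_atoms)]
```

Comparing `windowed_b_factors` for increasing window lengths is one simple way to see whether the fluctuations are still building up, which is the convergence question raised in the summary.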


Preface

The primary goal of theoretical approaches to chemical and biochemical problems is to rationalize the behaviour of macroscopic systems in terms of a microscopic model. Once a model has allowed for a successful description of a class of properties, one may assume that it has captured the essential underlying physics governing them, and the model may be further used for understanding and prediction of related properties in related systems.

A possible microscopic model (Model I) for macroscopic molecular systems is an infinite statistical-mechanical ensemble of systems in which nuclei and electrons interact according to an exact quantum-mechanical Hamiltonian. Of course, both nuclei and electrons can be described in terms of smaller particles, but the energies to invest before these particles appear are seldom available in the day-to-day world. It is thus reasonable to say that Model I would allow for a correct description of most of the relevant chemical and biochemical problems. Unfortunately, the use of such a fundamental model to obtain an exact description of macroscopic phenomena is virtually impossible. First of all, exact analytical solutions for macroscopic properties based on Model I are never available. Statistical mechanics leads to analytical solutions only when further approximations are made (e.g. ideal gases, ideal solutions,...), whereas quantum mechanics can describe analytically solely two particle systems (H, He+,...). In all other cases, equations have to be solved numerically. Theoretical chemistry therefore relies heavily on computers, and the fundamental limitation to any such study is the available computing resources. Although the power of modern computers increases by roughly an order of magnitude every 5-7 years, numerical solutions to Model I are still far from reach for many systems and a tradeoff has to be found between the accuracy of the description and the size of systems that can be studied. It has thus become clear that there is currently no universal method able to solve all possible problems, but that one should rather select the method that is the most suitable to a problem of interest.

One option is to insist on a quantum-mechanical description of the systems. The high computational costs of quantum mechanical models, however, prohibit the generation of statistical ensembles of sufficient size to compute many thermodynamic observables with a reasonable accuracy, which implies that the statistical description has to be simplified using crude analytical approximations or simply abandoned. The quantum description is most of the time simplified by the use of the Born-Oppenheimer approximation which decouples the motion of nuclei and electrons. This leads to a second type of models (Model II): an arbitrary collection of systems in which fixed nuclei interact with their electron cloud according to an approximate quantum-mechanical Hamiltonian.


The

only exact numerical


solution to this

problem

would be

Hartree-Fock calculation

a

déterminants into account (full CI) and using

using a limited

molecular orbitals. Even

basis set, full CI

Systems and further simplifications in the Hamiltonian of ab initio

méthodologies

hâve been

approximations: plain Hartree Fock, of

use

proposed,

alternatively,

use

of

density

become

the lowest

at

extremely poor.

theory.

quantifies the

(cheapest) levels

constants but as

reasonable

seem

referred to

as

adjustable

of the interaction

nature.

or

theory

be handled. On the

experiment may

first-principle

can

be tuned to

combine

(Hamiltonian)

the

replace

but

derived

improve

correct

(or

the constants

the basis of expérimental

on

quantum-mechanical, thèse methods

semi-empirical version, study

phase, however, the important observables

of

is well-suited to the

gas-phase problems.

In

statistical mechanical in

are

Therefore, their calculation involves the considération of many degrees of freedom

use

study

not

play an

intermolecular forces in the gas

forces in the bulk

phase.

This is

chemical and biochemical To

significant size,

which

of Model II. In other words, quantum mechanical models

where intermolecular forces do to

which

and thus to the

and statistical ensembles of configurations of a the

or

semi-empirical.

single-molecule properties

the condensed

can

to treat

collection of adjustable parameters tuned

Model II, either in its ab initio

calculation of

that

parameters that

called empirical. When the Hamiltonian is

are

small

A wealth

discrepancies with experiment appear to

into it

a

cases.

various types of

by

of theory, the agreement with

entering

by

expand the

to very

Of course, the lower the level of

data

are

characterized

Such methods,

agreement with experiment.

approximated) analytical form

applied

needed in most

larger the System

In thèse cases, and when

bear somewhat systematic trends, it may no more as

are

be

excited

ail thèse with basis sets of différent sizes,

functional

and the smaller the size of the basis set, the other hand,

only

can

ail

to

inclusion of a limited number of excited déterminants,

many-body perturbation theory, the

taking

infinité basis set of functions

an

bypass to

description

a

phase, they

are not

suited for the

and their average effect replaced

by

(ii)

an

the électrons

be used

of thèse

important

analytical

through large (although finite) ensembles

may become

can

the nuclei

move

removedfrom the

représentation,

description

possible. This leads

finite statistical mechanical ensemble


be

interaction function between

method is used, the statistical mechanical

a

study

of many

Systems

simplified by considering that,

nuclei. Due to the lower computational expense of this

HT),

study

Born-Oppenheimer approximation, (i)

the laws of classical mechanics and

of models (Model

to

although they can

serious limitation for the

this limitation, Model II may be further

adéquate sampling

currently prohibits

restricted

problems.

within the framework of the

according

clearly

dominant rôle, and

are

and when

an

of the System

to

the third type

of atoms

interacting


according to a classical mechanical Hamiltonian.

Even if the real

classical, classical and quantum statistical mechanics lead to the

partition

and classical

changes

are

involved

functions become

(no electronically

équivalent, provided

excited states

significantly populated (high enough température, of interest. If thèse conditions function

describing

give

Models I

no

that (i)

electronic

solely physical

rearrangements), (ii)

motions of the nuclei

to

light particles)

satisfied, it is possible

II

by averaging

derived from first

the electronic

out

principles,

most often defined

parametrical

as a sum

functions of

Many choices

find

to

classical interaction

a

of the system with the

are

degrees

they

of freedom, due to the

but rather tailored and calibrated in of

physically-based

(or

one

a

few)

internai

an

microscopic

force field terms. Thèse

coordinate(s)

that

ofthe

empirical parametrization

dynamical properties

properly only if (v)

the omitted

fields) degrees of freedom explicit degrees offreedom ofthe model. resolution force

Although they or

popularity,

(electronic,

are

up to

now

extremely important chemical

the

(tens of nanoseconds in

dynamical properties

a

are

terms

of solids,

within reach.

of

hâve

a

capture the

not sufficient to ensure a

Dynamical properties

study

will

supra-atomic

of processes

are

only available methods

dynamics),

to

much shorter relaxation time than the

and biochemical

liquids,

analytical

of the force-field

but also atomic in

in

involving

rapidly increasing a

position to

in

address

problems. They can currently handle

Systems of up to 105-106 atoms and generate up to 106-107

such

are

ofthe system.

do not offer the best framework for the

they

are

interatomic distances.

or

becomes reliable. Since the real

proton transfers, empirical classical force fields since

generally not

thermodynamical description of macroscopic

world is not classical, thèse four conditions

description

be described

so

computational

empirical way. It is

possible for the number, type and functional définition

physics of the system,

example,

level of

may be used to

Systems is then that (iv) the selected functional form is sensible enough

many

same

the interaction function is

practice,

terms. A further condition for the correct

électron

(iii) statistical

and

it is most often not feasible to dérive the classical interaction function from

or

limitations of thèse models. Therefore in

correct

ail are

allow for the convergence of the observables

if classical molécules do not exist,

even

world is not

results, i.e. quantum

thermodynamical description of macroscopic Systems. Although possible in

a correct

principle,

are

to

thermodynamic properties

Model I. Thus,

as

accuracy

the

generated

are

or

corresponding

discrète quantum mechanical energy levels

ensembles of a sufficient size

microscopic

same

configurations

and thus, the

of thèse Systems

study of thermodynamic and

solvated (bio-)molecules

or

lipid membranes,

for

Although challenging, the possibility of addressing problems of extreme complexity, but also of practical relevance, should not obscure the inherent limitations of empirical classical force fields. These limitations are essentially determined by the five conditions listed in the previous paragraph. Consequently, in all applications of empirical force fields, a critical point of view should be adopted with respect to the outcome of the simulations, and the power and limitations of the methods employed. As for any scientific study, three criteria are essential for the scientific quality of a given work: reliability, reproducibility, and accuracy, the assessment of which should form an integral part of any theoretical study.

Chapter 1

Empirical classical force fields for molecular systems

1.1. Summary

When a specific molecular system is to be studied using a theoretical computational method, the characteristics of the system and observable(s) of interest will, together with the available computing power, largely determine which theoretical method is to be used (Section 1.2). For many systems of interest to the chemist and biochemist, the number of atoms to be considered will prohibit the use of (expensive) molecular orbital or density functional methods. In this case, one may turn to an empirical force-field description. Unless a hybrid quantum/classical treatment is performed, such a description will be at best at the atomic resolution. For very large systems, or when many evaluations of the interaction function are required to compute the observable(s) of interest, one may attempt to further remove the degrees of freedom of specific atoms (solvent, protein sidechains...) and use a mean solvent or a low resolution force field description (Section 1.3).

The only justification for the use of empirical classical force fields resides in their ability to reproduce and predict a vast amount of experimental data. It is nevertheless interesting to try to understand their physical basis in terms of first principles (Section 1.4). Classical force fields in principle originate from the averaging out of the electronic (Born-Oppenheimer surface) and possibly also some of the atomic (potential-of-mean-force) degrees of freedom from the quantum-mechanical Hamiltonian of the system. The degrees of freedom upon which the averaged interaction function depends are called explicit, whereas the ones that have been removed from the description are called implicit. If this averaging process is performed correctly, any thermodynamic property that can be expressed in terms of ensemble averages of instantaneous observables depending solely on the explicit degrees of freedom can be described exactly by the classical interaction function, provided that no quantum effect comes into play (high enough temperature, no light particles). Dynamical properties will however be described properly only if the omitted (electronic, atomic) degrees of freedom have a much shorter relaxation time than the explicit degrees of freedom.

In practice, the interaction function is not derived from first principles, but rather tailored and calibrated empirically. It is most often defined as a sum of analytical parametrical functions of one (or a few) internal coordinate(s) or interatomic distances. Many choices are possible for the number, type and functional definition of the force field terms (Section 1.5). These force field terms are generally physically-based (Section 1.6). Combination rules, which determine force-field parameters as a function of the types of atoms defining the corresponding internal coordinate, are also an important component of the definition of a force field. Once functional forms and combining rules have been selected for all terms, the corresponding parameters have to be calibrated (Section 1.7). This calibration is generally performed using mainly experimental data, possibly complemented by the results of high level quantum chemical calculations. Parameter tuning is an important and difficult task in the design of an empirical force field. Only parts of the problem can be solved using somewhat automatic procedures. Convergence of simulated observables may require long simulations or be unreachable, and interpretation of experimental observables in terms of simulated properties may sometimes not be straightforward. Finally, the specific values of each parameter will be strongly correlated with the values of other parameters, with the choice of the explicit degrees of freedom, with functional forms and combining rules, and with the choice of calibration systems and observables. These correlations will strongly reduce the domain of validity of a given force field, and there is thus no "universal" force field (Section 1.8). At last, the optimal choice of a combination of terms, their functional forms, the combining rules and the parameters should be such that the terms (force-field building blocks) have the best transferability from one molecular system to the other, within the domain of validity of the force field.
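The summary's claim that a correctly averaged interaction function reproduces ensemble averages over the explicit degrees of freedom can be checked numerically on a toy model. The sketch below is only an illustration under stated assumptions: the two-coordinate potential U(x, y), the value of kT, the grid ranges and all function names are hypothetical and are not taken from the text. Here x plays the role of the explicit coordinate and y of the implicit one, and the potential of mean force W(x) = -kT ln ∫ exp(-U(x, y)/kT) dy is obtained by a simple Riemann sum.

```python
import math

KT = 1.0  # thermal energy k_B*T in arbitrary units (assumption)


def potential(x, y):
    """Hypothetical coupled potential U(x, y): x is explicit, y is implicit."""
    return 0.5 * x * x + 0.5 * (1.0 + x * x) * y * y


def make_grid(lo, hi, n):
    """Return n equally spaced points on [lo, hi] and the grid spacing."""
    step = (hi - lo) / (n - 1)
    return [lo + i * step for i in range(n)], step


XS, DX = make_grid(-5.0, 5.0, 201)
YS, DY = make_grid(-8.0, 8.0, 801)


def pmf(x):
    """Potential of mean force W(x), with y integrated out numerically."""
    z = sum(math.exp(-potential(x, y) / KT) for y in YS) * DY
    return -KT * math.log(z)


def mean_x2_full():
    """Boltzmann average of x^2 using the full potential U(x, y)."""
    num = den = 0.0
    for x in XS:
        for y in YS:
            w = math.exp(-potential(x, y) / KT)
            num += x * x * w
            den += w
    return num / den


def mean_x2_pmf():
    """Boltzmann average of x^2 using only the averaged function W(x)."""
    num = den = 0.0
    for x in XS:
        w = math.exp(-pmf(x) / KT)
        num += x * x * w
        den += w
    return num / den


if __name__ == "__main__":
    print("<x^2> from U(x, y):", mean_x2_full())
    print("<x^2> from W(x):   ", mean_x2_pmf())
```

For this particular choice of U the integral over y is Gaussian, so W(x) equals x²/2 + (kT/2) ln(1 + x²) up to a constant, which gives an independent check on the numerical result. The agreement holds only for observables that depend on x alone; nothing in this sketch addresses dynamical properties.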

1.2. Introduction

With the continuing increase of the power of computers, the past decades have seen a rapid increase in the number, performance and accuracy of theoretical computational methods in chemistry (van Gunsteren et al., 1989ff, Lipkowitz & Boyd, 1990ff). One can distinguish three major classes of methods for the theoretical study of molecular properties, listed in order of decreasing computational expenses: (i) ab initio molecular-orbital methods (Hehre et al., 1986), (ii) semi-empirical molecular-orbital methods (Stewart, 1990, Zerner, 1991), and (iii) empirical classical force-field methods. The computational expenses of ab initio methods are of the order O(Nf^4) (Hartree-Fock level) or higher (Configuration Interaction, Many Body Perturbation Theory), Nf being the number of basis functions used. Density functional approaches and semi-empirical methods scale as O(Nf^3) or lower. The costs of empirical methods scale as O(Na^2) down to nearly O(Na), where Na stands for the number of elementary particles (atoms or groups of atoms). Independently of the scaling with the system size, evaluation of an empirical interaction function usually remains much cheaper than any other method (size of the prefactor to the scaling) and currently allows for the simulation of systems of typically up to 10^5-10^6 atoms.

Since the available computing resources are most often the true limiting factor to numerical calculations, it has become clear that there is no universal method able to solve all possible problems, but that one should rather select the method that is the most suitable to a problem of interest. As is schematically represented in Figure 1.1, the properties of the observable(s) and system under consideration that will, together with the available computing power, largely determine which type of method can be used, are (van Gunsteren & Berendsen, 1990):

A. the required system size

B. the required volume of conformational space that has to be searched or sampled (in terms of dynamics: the required timescale)

C. the required resolution in terms of particles (determined by the smallest entity, subatomic particle, atom, or group of atoms, treated explicitly in the model)

D. the required energetical accuracy of the interaction function

These requirements may be incompatible, in which case the observable cannot be computed adequately with currently available computer resources (van Gunsteren et al., 1995b). When requirements A and B, together mostly determining the computational effort, are in conflict with requirements C and D, this conflict may be resolved by the design of hierarchical or hybrid models, where only the most relevant degrees of freedom are treated with a more expensive, higher resolution method. This is often done, for example, in the study of acid- or base-catalysed, organic, or enzymatic reactions in the bulk phase (Warshel, 1991, Field, 1993, Whitnell & Wilson, 1993, Liu et al., 1996a). Another example is the use of a potential of mean force representation for the solvent, which includes its average effect without including its degrees of freedom explicitly (van Gunsteren et al., 1994). Mean fluctuations in the solvent may also be included through a modification of the equations of motion as in Stochastic Dynamics (Yun-Yu et al., 1988, van Gunsteren & Berendsen, 1990).
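To make the practical meaning of the scaling orders quoted in this section concrete, the following minimal sketch tabulates how much the cost grows when the system size is multiplied by a given factor. It uses only the formal exponents from the text and deliberately ignores prefactors, which the text identifies as the main reason empirical methods are cheaper. The dictionary keys and function names are hypothetical labels introduced for illustration. Note also that the size measure differs between methods (number of basis functions Nf versus number of particles Na), so the table compares growth rates, not absolute costs.

```python
# Formal scaling exponents quoted in the text; prefactors are ignored here.
SCALING_EXPONENTS = {
    "ab initio (Hartree-Fock)": 4,             # O(Nf^4)
    "density functional / semi-empirical": 3,  # O(Nf^3) or lower
    "empirical force field (all pairs)": 2,    # O(Na^2)
    "empirical force field (near-linear)": 1,  # nearly O(Na)
}


def cost_growth(size_factor, exponent):
    """Factor by which the cost grows when the size grows by size_factor."""
    return size_factor ** exponent


def growth_table(size_factor):
    """Cost growth for every method class at a given size increase."""
    return {
        name: cost_growth(size_factor, exponent)
        for name, exponent in SCALING_EXPONENTS.items()
    }


if __name__ == "__main__":
    for name, factor in growth_table(10).items():
        print(f"{name:40s} cost x {factor}")
```

A tenfold larger system is thus formally 10^4 times more expensive at the Hartree-Fock level but only 10 to 100 times more expensive with an empirical interaction function, which is the trade-off between requirements A and B and requirements C and D described above.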

[Figure 1.1 (flowchart). Box labels: observable of interest (structural? thermodynamic? dynamical?); required resolution in terms of particles; required energetical accuracy; required system size; required conformational space to be sampled; hybrid model; PMF solvent; choice of explicit degrees of freedom; choice of a sampling method; choice of interaction Hamiltonian (Hqm or Hclass); number of explicit degrees of freedom; number of H evaluations; computational costs.]

Figure 1.1: Schematic representation of the basic choices made while building a model of the molecular system in order to simulate an observable of interest. Thick line boxes represent the three essential choices.

Molecular-orbital methods are well suited for the study of small molecules in vacuum, or small clusters of molecules (supermolecule) (Keith & Frisch, 1994), or within an averaged solvent environment (Ángyán, 1992, Cramer & Truhlar, 1992, 1995, Tomasi & Persico, 1994, Müller-Plathe & van Gunsteren, 1994), and give access to properties such as equilibrium geometries, vibrational frequencies, heats of formation, relative energies of conformers and isomerisation barriers. These problems are also addressed with an increasing accuracy by empirical methods (Bowen & Allinger, 1991, Dinur & Hagler, 1991, Maple et al., 1994a,b). Due to the size of the problem and volume of accessible conformational space, simulation of organic molecules or macromolecules in the condensed phase is the domain of atom-based empirical classical force fields (van Gunsteren & Berendsen, 1990). Long timescale (or long relaxation time) problems involving large systems, such as protein folding or de novo protein design, can currently be addressed only by residue-based force fields (Gerber, 1992, Jones, 1994, Ulrich et al., 1994, 1996, Lathrop & Smith, 1996). Finding an accurate description of the interaction at this low particle resolution (i.e. a sufficient energetical resolution) is, however, a major difficulty. Current areas of development with respect to treatment of degrees of freedom are briefly discussed in Section 1.3.

Choosing the explicitly handled degrees of freedom is the first step in an empirical force field calculation (Figure 1.1). The second is the choice of a method to search or sample the conformational space (Howard & Kollman, 1988, Leach, 1991, van Gunsteren, 1991, Scheraga, 1992, 1993, Osawa & Orville-Thomas, 1994, van Gunsteren et al., 1995a). This choice will also depend on the information required to compute the observable(s) of interest, namely:

A. Structural information (searching): The purpose of these methods is to search conformational space for one or a number of relevant low energy conformations. In the latter case, the conformations obtained are not related by any well defined probabilistic or dynamical relationship, and the method of choice is the one that searches the largest extent of conformational space, returning the highest number of low-energy structures.

B. Structural and thermodynamic information (sampling): The purpose of these methods is to sample conformational space or part of it in order to get a collection of conformations which build a correct statistical ensemble, that is, an ensemble in which the conformations appear with a Boltzmann probability. The sequence of the conformations is not relevant and the method of choice is the one which achieves the highest sampling efficiency.

C. Structural, thermodynamic and dynamical information (simulating): The purpose of these methods is to simulate the motion in conformational space or part of it, in order to get a sequence of conformations which build a correct statistical ensemble, but are also consecutive in time (dynamics). In this case, equations of motion which explicitly contain time are required, such as the Dirac, Schrödinger, Newton, Lagrange, Hamilton, Langevin or Liouville equations of motion.

The third choice to be made in an empirical force-field calculation is the interaction function (or, together with the kinetic energy, Hamiltonian) corresponding to the selected explicit degrees of freedom (Figure 1.1). In principle, empirical force fields are constructed using experimental information (possibly complemented with theoretical results) and their only justification is their ability to reproduce or predict a large amount of experimental observables. It is however instructive to try to relate the empirical description to the underlying quantum mechanical reality. Empirical classical force fields are formally based on a generalization of the Born-Oppenheimer approximation, that is, on an averaging of the quantum mechanical Hamiltonian over implicit degrees of freedom (electronic and possibly also of individual atoms) to obtain an analytical interaction function depending solely on the explicit degrees of freedom of the model. Due to this averaging process, the interaction will be called effective interaction function or potential of mean force. Averaging occurs at three levels:

A. Averaging of the quantum mechanical interaction over the implicit degrees of freedom of the model (Section 1.4.1)

B. Averaging of a force-field term corresponding to an internal coordinate over the different chemical/topological environments present in different molecules (Section 1.4.2)

C. Averaging of a force-field term over the other force-field terms depending on the same coordinate, that is, over the different geometrical environments (Section 1.4.3)

The present text will discuss possible options in the three basic choices outlined in Figure 1.1 (thick lined boxes) and mainly concentrate on the functional representation of the interaction function in atom and united-atom based force fields (Sections 1.5 and 1.6). Finally, the problem of force field parametrization will be briefly discussed (Section 1.7). The list of methods is by far not exhaustive and is somewhat biased towards condensed-phase simulations and simulation of large molecules (biomolecules).

1.3. Choice of the explicit degrees of freedom of the model

The choice of an elementary unit (i.e. of a particle that will have no explicit internal degrees of freedom) is the first step in the design of an empirical classical force field. Possible alternatives for the elementary unit and the explicitly treated degrees of freedom, together with the corresponding type of interaction function, are summarized in Table 1.1. This choice will determine or strongly influence (van Gunsteren & Berendsen, 1990, van Gunsteren & Mark, 1992, van Gunsteren et al., 1995b):

A. The number of degrees of freedom that will have to be handled explicitly for describing a specific molecular system, and thus the computational effort.

B. The extent of conformational space that can be searched (or in terms of molecular dynamics, the reachable timescale). Because available computing power is most often a limiting factor, for a system of a given size, the number of possible evaluations of the potential energy function will rapidly decrease with the number of explicit degrees of freedom.

Table 1.1: Hierarchy of explicit degrees of freedom included in the model.

ELEMENTARY UNIT | PHASE | TYPE OF INTERACTION (OPERATOR/FUNCTION) | DF AVERAGED OUT | REF
electrons and nuclei | gas phase | ab initio, first principle quantum mechanical Hamiltonian, Born-Oppenheimer surface; density functional | none | (a)
idem | idem | semi-empirical approximated Hamiltonian | none | (b)
idem | explicit solvent | idem, supermolecule methods | none | (c)
idem | implicit solvent | idem, additional reaction field contribution | solvent | (d)
(united-)atoms: all atoms | gas phase | classical empirical interaction function | electronic | (e)
united atom (aliphatic groups only) | idem | idem | aliphatic H | (e)
united atom (all CHn groups) | idem | idem | all H bound to C | (e)
united atoms (all) | idem | idem | all H | (e)
idem | explicit solvent (or crystal) | idem, including explicit solvent terms | idem | (f)
idem | implicit solvent | idem, corrections by additional terms or in the functional form, parameters, the equations of motion | solvent | (f)
"bead(s)": atom groups, e.g. amino-acids in proteins represented by one or a few beads | implicit solvent | statistics based interaction function | side-chain | (g)
molecules represented by a sphere, a rod or a disk | liquid phase (or crystal) | average intermolecular interaction function | intramolecular | (h)

References, see for example: (a) Hehre et al., 1986, (b) Stewart, 1990, Zerner, 1991, (c) Keith & Frisch, 1994, (d) Ángyán, 1992, Cramer & Truhlar, 1992, 1995, Tomasi & Persico, 1994, Müller-Plathe & van Gunsteren, 1994, (e) Brooks et al., 1983, Gerber & Müller, 1995, (f) van Gunsteren et al., 1994, (g) Jones, 1994, (h) Bates & Luckhurst, 1996.

C. The maximum resolution, in terms of particles (e.g. subatomic particles, atoms, group of atoms, or molecules) and processes (e.g. conformational changes, chemical reactions) that can be achieved by the force field.

D. The type of functions that are likely to describe the interaction between elementary units in an adequate manner, that is, with a reasonable energetical accuracy.

E. The type of observables the force field may be able to describe correctly, and those which will necessarily stay inaccessible. Accessible observables will be those for which the extent of searchable conformational space (B), the force field resolution in terms of particles (C), and the force field accuracy (D) are sufficient.

Current developments in empirical classical force fields mainly follow five basic lines in terms of explicit degrees of freedom (Bowen & Allinger, 1991, Dinur & Hagler, 1991, Field, 1993, Whitnell & Wilson, 1993, van Gunsteren et al., 1994, Jones, 1994, Gelin, 1994), which will be described in Sections 1.3.1-1.3.5. Note that in 1.3.3-1.3.5, the number of degrees of freedom is reduced essentially by decreasing the force-field resolution in terms of particles. An alternative way to reduce the size of the conformational space to be searched is to limit the dimensionality or to discretize the coordinates (lattice methods, see e.g. Binder, 1992). These methods will not be discussed here.

1.3.1. Gas-phase force fields

The primary purpose of gas phase force fields is the accurate description of molecules in vacuum (Bowen & Allinger, 1991, Dinur & Hagler, 1991, Hagler & Ewig, 1994, Maple et al., 1988, 1994a,b, Hwang et al., 1994). These force fields may be used either to replace more expensive ab initio molecular orbital calculations (Maple et al., 1994a), or to predict experimental gas-phase properties such as equilibrium geometries, vibrational frequencies, heats of formation, relative energies of conformers and energy barriers for isomerisation (Hwang et al., 1994). Rapid progress in the design of such force fields is made possible by (i) the absence of intermolecular forces, (ii) the increasing amount and reliability of data from ab initio molecular-orbital calculations and (iii) the use of systematic and relatively inexpensive procedures for parameter calibration using both theoretical and experimental data (Section 1.7.5). These force fields, sometimes called class II force fields (Maple et al., 1994a,b), are usually characterized by a detailed description of covalent degrees of freedom, involving anharmonic (non-quadratic) potential energy terms and terms that couple the internal coordinates (non-diagonal energy terms). Typical examples are the force fields CFF (Lifson & Warshel, 1968, Warshel & Lifson, 1970, Lifson & Stern, 1982) and a recently modified version (Engelsen et al., 1995a,b), CVFF (Hagler et al., 1979a-c, Lifson et al., 1979), EFF93 (Dillen, 1995a,b), MM2 (Allinger, 1977, Bowen & Allinger, 1991), MM3 (Allinger et al., 1989, Lii & Allinger, 1989a,b, Bowen & Allinger, 1991) and QMFF/CFF93 (Maple et al., 1994a,b, Hwang et al., 1994).

The term gas-phase force field does not mean that such force fields cannot be extended for applications in condensed phase simulations. Experimental information on crystal structures is sometimes used in the parametrization procedure (Warshel & Lifson, 1970, Dillen, 1995b, Engelsen, 1995b). For applications in liquid phase problems, however, these force fields will suffer from the same difficulties in parametrization as condensed-phase force fields (Section 1.3.2), and whether the accuracy gained in the gas phase by inclusion of anharmonic and off-diagonal terms will result in a significantly improved accuracy of the simulated condensed-phase properties is still a matter of discussion.
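The distinction drawn above between quadratic terms and the anharmonic and non-diagonal terms typical of class II force fields can be illustrated with a few lines of code. This is a minimal sketch under stated assumptions: the quartic expansion and the bilinear bond-bond coupling are generic textbook forms chosen for illustration, and the function names, parameter names and numerical values are hypothetical. They are not the functional forms or parameters of any of the force fields cited in the text.

```python
def harmonic_bond(b, b0, k2):
    """Quadratic (diagonal) bond-stretching term: k2 * (b - b0)^2."""
    return k2 * (b - b0) ** 2


def quartic_bond(b, b0, k2, k3, k4):
    """Anharmonic bond term: quadratic term plus cubic and quartic corrections."""
    d = b - b0
    return k2 * d**2 + k3 * d**3 + k4 * d**4


def bond_bond_cross(b1, b1_0, b2, b2_0, k_bb):
    """Non-diagonal term coupling the displacements of two bonds."""
    return k_bb * (b1 - b1_0) * (b2 - b2_0)


if __name__ == "__main__":
    # Hypothetical parameters, arbitrary but mutually consistent units.
    b0, k2, k3, k4 = 1.0, 300.0, -600.0, 700.0
    for b in (0.9, 1.0, 1.1, 1.2):
        print(b, harmonic_bond(b, b0, k2), quartic_bond(b, b0, k2, k3, k4))
```

With the cubic and quartic constants set to zero the anharmonic term reduces to the quadratic one, and the cross term vanishes whenever either bond is at its reference length. Every additional constant is one more parameter to calibrate, which is the parametrization cost the text refers to.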

1.3.2. Condensed-phase force fields

The primary purpose of condensed-phase force fields is the accurate description of liquids, solutions of organic compounds or macromolecules and crystals (Allen & Tildesley, 1987, McCammon & Harvey, 1987, Brooks III et al., 1988, van Gunsteren & Berendsen, 1990). Progress in the development of such force fields is slow, since (i) the dominant forces in the condensed phase are intermolecular forces which are not easily described and parametrized adequately, (ii) the relevance of data from ab initio molecular orbital calculations in vacuum (even when reaction-field corrections are applied) is limited, and the parametrization has to rely mostly on a small amount of experimental data concerning the condensed phase, and (iii) the design of systematic optimization procedures is in general not possible (see however Section 1.7.4). One major reason for this impossibility is that the estimation of observables to be compared to experimental results generally requires a large number of evaluations of the potential energy function, and is therefore computationally expensive. In these force fields, the main effort is aimed at the description of non-bonded forces and torsional potential energy terms. Potential energy terms involving other covalent internal coordinates are often either quadratic-diagonal (so-called class I force field) or simply zeroed by the use of constraints. Typical examples are the force fields AMBER (Weiner & Kollman, 1981, Weiner et al., 1984, 1986, Pearlman et al., 1995), CHARMM (Brooks et al., 1983, Nilsson & Karplus, 1986, Smith & Karplus, 1992, MacKerell Jr. et al., 1995), CHARMm/QUANTA (Momany & Rone, 1992), DREIDING (Mayo et al., 1990), ECEPP/3 (Némethy et al., 1992), ENCAD (Levitt, 1983a,b, Levitt et al., 1995), EREF (Levitt, 1974), GROMOS (van Gunsteren & Berendsen, 1987, Scott & van Gunsteren, 1995), MAB (Gerber & Müller, 1995), MacroModel (Mohamadi et al., 1990), OPLS (Jorgensen & Tirado-Rives, 1988), Tripos (Clark et al., 1989), UFF (Rappé et al., 1992) and YETI (Vedani, 1988).
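The terms that receive the main effort in condensed-phase force fields, namely torsional and non-bonded interactions, are often written with simple analytical forms. The sketch below shows one common set of choices: a cosine torsional term, a Lennard-Jones 12-6 term and a Coulomb term. These forms, the function names, the parameter values and the unit system (kJ/mol, nm, elementary charges, with the corresponding electric conversion factor) are assumptions made for illustration. The text does not specify them here, and individual force fields listed above differ in their exact definitions.

```python
import math

# Electric conversion factor in kJ mol^-1 nm e^-2 (assumed unit system).
COULOMB_CONSTANT = 138.935458


def torsion_energy(phi, k_phi, multiplicity, delta):
    """Cosine torsional term: k_phi * (1 + cos(n * phi - delta)), phi in radians."""
    return k_phi * (1.0 + math.cos(multiplicity * phi - delta))


def lennard_jones(r, epsilon, sigma):
    """Lennard-Jones 12-6 term with well depth epsilon and diameter sigma."""
    sr6 = (sigma / r) ** 6
    return 4.0 * epsilon * (sr6 * sr6 - sr6)


def coulomb(r, qi, qj, ke=COULOMB_CONSTANT):
    """Coulomb interaction between point charges qi and qj at distance r."""
    return ke * qi * qj / r


def nonbonded_pair(r, epsilon, sigma, qi, qj):
    """Sum of the van der Waals and electrostatic terms for one atom pair."""
    return lennard_jones(r, epsilon, sigma) + coulomb(r, qi, qj)


if __name__ == "__main__":
    # Hypothetical parameters for a single atom pair.
    for r in (0.30, 0.35, 0.40, 0.60):
        print(r, nonbonded_pair(r, epsilon=0.65, sigma=0.32, qi=-0.4, qj=0.4))
```

Because the non-bonded sum runs over atom pairs, its evaluation dominates the cost of the potential energy function, which is why observables that need many evaluations make systematic parameter optimization expensive.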

Chapter

1

et

al.,

21

1.3.3. Mean-solvent force fields The purpose of but widiout

an

a

explicit

1994). Almough

treatment

accurate

an

computational

by

expenses, e.g.

a

an

et

al.,

solvent, the

treatment of me

explicit

degrees of

almost ail solvent

or

of freedom (van Gunsteren

degrees

description of die structure, mobility, dynamics and energetics

generally requires

of molécules in solution omission of ail

of the solvent

of molécules in solution,

description

mean-solvent force field is die

dramatically reduces

freedom

the

factor 10-50 for biomolecules in solution. The explicit

influence of die solvent is

approximated here by its mean effect, and possibly also the effect

of its

as

fluctuations,

mean

1993).

implicit

The main

dynamics (Yun-Yu

1995, Fraternali &

van

al., 1988,

are

Gunsteren, 1996) and of

Gunsteren,

structural effect,

mimicked

by a modification

équations

me

van

or

additional terms,

(différent functional form,

of me interaction function

et

hydrophobic

drag,

random fluctuations and viscous

screening,

dielectric

in stochastic

influences of solvent, i.e.

see

e.g. Banks et al.,

(Langevin

of motion

equation).

1.3.4. Low-resolution force fields

The purpose of low-resolution force fields is

addressing long timescale phenomena, such de

novo

Computing

power, dièse

(HUnenberger

amino-acid residue level

problems et

are

adéquate expression

energetical

treated to

be

1.3.5.

study

récognition

are

van

for

being developed

for

me

of native mean

force term

expected from Hybrid

A whole

Lathrop

peptides

& Smith,

and

1996).

proteins (Gerber, 1992, The main

difficulty is to

provides

a

in

A correct

a

calibrated via

description

of the

a

sufficient

functional

a

are

statistical

normally

dynamics

is not

such models.

variety of models include me combination

high particle resolution

a

usually

are

structures. The effects of solvent

(Section 1.3.3).

available

force fields

instance, die first

explicidy

using

currently

force fields at atomic

interaction between residues tiiat

(and non-native) protein

while

proteins, protein folding,

Witii die

resolution to discriminate correct from incorrect structures. Once

by a

freedom at

large Systems,

of

in

Gunsteren et al., 1995b). Force fields at the

form is selected, the interaction function parameters

analysis

me

diffïcult to address,

al., 1995a,b,

Jones, 1994, Ulrich et al., 1994,1996, an

fold

protein design and protein-protein association.

resolution

find

as

or

first few

and

a

of

treatment of die

hydration shells

of

a

a

treatment of

odiers at

1

a

few

degrees

of

lower resolution. For

macromolecule may be included

simulation, the bulk solvent being modelled through

Chapter

a

a mean

force

(Section


1.3.3). Another typical example is the simulation of chemical,

acid-

or

base-catalyzed

or

reactions, in solution or in enzymes (Warshel, 1991, Field, 1993, Whitnell & Wilson, 1993, Liu et al., 1994,1996a,b).

be

applied to

me

resolution in such

a

quantum mechanical

description

of

computational

costs, such

a

due to the

full system under

be treated in diis way.

1.4.

Clearly,

required. However,

die protons is

study,

ability

to

models is hère die main

hybrid

only justification of empirical

reproduce

and

predict

information used in their

a vast

long

a

few relevant

design

degrees

of freedom

useful to try to understand die

reason

can

of particle

difficulty.

classical atomic interaction functions resides in their

expérimental

and calibration

force field is successful at

or

classical interaction functions

amount of

quantum mechanical calculations. Thus, as a

only

électrons

Finding the proper interface between the different degrees

Assumptions underlying empirical The

as

and

me

treatment cannot

no

cornes

theoretical

reproducing

results.

justification

data from

of die agreement

Usually,

experiment

from

is in

cause

of die

principle required

experiment.

(or the

most

and not from

It is nevertheless

of discrepancies)

by

considering the relationship between the force-field building blocks (energy terms) and the underlying quantum mechanical reality.

1.4.1. Implicit degrees of freedom and the assumption of weak correlation

Whatever the degrees of freedom chosen to be treated explicitly within a classical empirical force field, the reality behind remains quantum-mechanical and involves the fundamental interaction between nuclei and electrons. Since the electronic degrees of freedom, and sometimes those of a number of nuclei, do not appear in the definition of the potential energy function, but are still present in the underlying reality, they may be called implicit. The assumption (or approximation) on which empirical classical force fields are based, is that the correlation between the fluctuations in these implicit degrees of freedom and the fluctuations in those which are explicitly handled can be neglected. Under this assumption, the fluctuations in the implicit degrees of freedom can be averaged out, leaving only their mean effect. This assumption is in essence a generalization of the Born-Oppenheimer principle, which allows the separation of the nuclear and electronic degrees of freedom based on the large difference between nuclear and electronic masses. Within the framework of this principle, a mean or effective potential energy function (or potential energy surface, PES) can be defined, which describes the interaction of the nuclei in the instantaneously averaged potential of the electron cloud. More precisely, if $\mu$ denotes the electronic degrees of freedom and $i$ the nuclear ones, the mean potential energy describing the interaction of the nuclei, $V_{nuc}(\{\vec{r}_i\})$, is defined as the lowest eigenvalue of the electronic time-independent Schrödinger equation at a given configuration $\{\vec{r}_i\}$ of the nuclei

$$\hat{\mathcal{H}}_{el}(\{\vec{r}_\mu\};\{\vec{r}_i\})\,\Psi_{el}(\{\vec{r}_\mu\};\{\vec{r}_i\}) = V_{nuc}(\{\vec{r}_i\})\,\Psi_{el}(\{\vec{r}_\mu\};\{\vec{r}_i\}) \qquad (1.4.1.1)$$

with

$$\hat{\mathcal{H}}_{el}(\{\vec{r}_\mu\};\{\vec{r}_i\}) = \hat{\mathcal{H}}_{tot}(\{\vec{r}_\mu\},\{\vec{r}_i\}) - \hat{K}_{nuc}(\{\vec{r}_i\})$$

where $\hat{\mathcal{H}}_{el}$ is equal to the total Hamiltonian of the system, $\hat{\mathcal{H}}_{tot}$, minus the kinetic energy operator $\hat{K}_{nuc}$ corresponding to the nuclear degrees of freedom, and $\Psi_{el}(\{\vec{r}_\mu\};\{\vec{r}_i\})$ is the ground state electronic wave-function, which depends on $\{\vec{r}_i\}$ only parametrically. This treatment is valid only for an isolated system (time independent total Hamiltonian), in which electronically excited states play no role. In Equation (1.4.1.1), the assumption that the nuclei are motionless while solving the electronic problem allows for the decoupling of the nuclear operator $\hat{K}_{nuc}$ from the Hamiltonian. The nuclear problem is then described by a time-independent Schrödinger equation

$$\hat{\mathcal{H}}_{nuc}(\{\vec{r}_i\})\,\Phi_{nuc}(\{\vec{r}_i\}) = E_{tot}\,\Phi_{nuc}(\{\vec{r}_i\}) \qquad (1.4.1.2)$$

with

$$\hat{\mathcal{H}}_{nuc}(\{\vec{r}_i\}) = \hat{K}_{nuc}(\{\vec{r}_i\}) + V_{nuc}(\{\vec{r}_i\})$$

where the eigenvalues $E_{tot}$ are the allowed values for the total energy of the system in its different vibrational and rotational states, and $\Phi_{nuc}(\{\vec{r}_i\})$ the corresponding nuclear wavefunctions.

Very often, the further assumption is made that the motion of the nuclei in the mean potential $V_{nuc}(\{\vec{r}_i\})$ can be treated classically. From a thermodynamical point of view, this approximation is normally valid for all but the lightest atoms and at high enough temperature, that is, when the classical and quantum partition functions become equivalent. When this classical treatment is adequate,

équivalent formulations

VJLW)

"ap

or

where me

,,

=

be

given

to

using

die

Hellmann-Feynman tiieorem

two

Equation (1.4.1.1)

^

=-FHICJlirl})=M

j.

This

only

and

means

FnilC0 ({r,}) is

diat when die

potential (first définition)

u

and die


potential of mean force (second définition) case

in

an

When classical

an

degrees

of

are no more

freedom,

m, are further removed from the interaction

of freedom i will be described resolution model (in

particle

or

of

protein side-chains),

équivalent. Thermodynamic quantities

an

NVT

equivalendy by

an

explicit

the

ail atom model and

if the Boltzmann factors

ensemble)

die two

defined in terms of

microscopic (instantaneous) observable depending on

ensemble average of a

degrees

constant). This is the

energy surface.

function by averaging (e.g. nuclei of solvent molécules types of définition

a

field, where the classical interaction is described by die Born-

ail atom force

Oppenheimer potential

équivalent (witilin

are

lower

a

identical, that

are

isif

e-V„{[r,Wkf where

e-Vm({rJ.{r,))lkBT

is the interaction at die lower

VMF({Ï1})

constant, and T die

ïj

the correct statistical mechanical definition of

—^

=

W„

system, V„p becomes the

as an

areas

its internai

entropie force,

which

of différent Boltzmann

energy surface. This force

(1.4.1.6)

potential

averaging

{î,}.

Equation (1.4.1.4)), and VmMn

actually

leads

is

Vmea„ ( {r,} ),

out from the

particle j,

/. d-4.1.5)

df

atom

energy,

thus be

coordinates

k

a.potential ofmean-force



averaged

(1.4.1.6)

explicit

sizes in die nuclear

as

>m-m


({r },{?})

ôV

m
(1.4.2.1)

321/

ôr, drk


io

The eigenvectors of the matrix containing the second derivatives (Hessian matrix) can be used to define a unique, non-redundant and orthogonal basis set for the description of the system, as in a spectroscopic force field. This is mathematically satisfactory, but of limited practical application since

A. The equilibrium conformation is usually not known but something one would like to predict.

B. A system at equilibrium is generally characterized by more than one conformation.

C. Other (non-equilibrium) conformations are often also of interest, in which cases neither the Taylor expansion at $\{\vec{r}_i^{\,0}\}$ nor the corresponding well-defined basis set are usable.

D. The accurate description through a Taylor expansion for one configuration does not provide much insight into other parts of the configurational space, so does not describe the physics of the system.

E. The accurate description of one molecule is useless for predictions about other molecules.

The second approach relies on the use of a sum of functionally simple analytical functions (energy terms) depending on selected internal coordinates, chosen on the basis of chemical intuition. This is justified because

A. A wealth of chemical data tells us that entities such as bonds, bond-angles, torsional angles and non-bonded interactions are physically meaningful, and thus, the corresponding internal coordinates and distances in space appear as the natural choice in which the functional forms of the interactions are likely to adopt the most simple forms.

B. Since such internal coordinate potential energy terms are physically meaningful, they may give an appropriate description of a larger part of configurational space.

C. Since internal coordinates involve a limited number of atoms (one to four), there is a hope to obtain building blocks transferable from one molecule to another (assumption of transferability).

In other words, one wants to split the interaction function into a sum of functionally simple, physically meaningful (and thus insight providing) terms, which would in addition be transferable from one molecule to another, and thus, bear predictive power. These are called force-field terms. The assumption that these terms exist is similar to the one mentioned in Section 1.4.1, i.e. that for each term (e.g. bond, bond-angle...), the effect of the environment can be averaged out by considering an ensemble of molecular systems (topologies) and conformations (geometries). More explicitly, one would like to have an analytical expression

V^ar,))

E n^^jea})

=

(1422)

terms a

jea indicates

where die notation

Vml({ r,}) a.

This

die

is die

of

description

Boltzmann factors

are

If this

équation

is

me

j

is involved in die force-field term

potential energy surface

force defined in

mean

îj, jeP

,

and its

on

with respect to

dr~, aV^,(tr;j6tt},{rt,tea})

équation

p,

over

the coordinates

{rk,këP}.

force fields where die coordinates

(1.4.2.4) will always be présent in force fields,

and distance

dépendent

means

often act

terms

that when

die energy term

p

me

on

coordinate

by solving consistently Equation (1.4.2.4)

for

P

a

given

?j

différent set for another

only

initio data

by considering

be removed to

an

set

Equation (1.4.2.4)

an exact

topological

set

of terms

a.

This type of corrélation

design

of force fields

using a

ab

proper

selection of conformations. Unlike in the definition

of die potential of mean force in Section 1.4.1 has

given

of conformations used tiiere is, however, not

arbitrary

The

collection of différent molecular Systems. A

tiiis is followed in die consistent

(Section 1.7.5). The

statistical ensemble, but

a

one.

a

terms.

may be removed

molecular system. The

may be correlated to

for

procédure analogous

because covalent

is also involved in force field

me term

may

Witii the

normal mode

becomes correlated to thèse odier

correlation arises from the fact that a

are

the Cartesian coordinate of the

geometric correlation arising from this so-called coordinate redundancy

molecular system, and to

(1.4.2.4)

orj

averaging

spectroscopic

vectors, the second term in

one

one

*

of harmonie

terms a odier tiian

which do not appear

gets after rearrangement

where k dénotes ensemble

This

(1A2.3)

{rk, kf p}

the coordinates

H'Pjmx

same atom.

and

Equation (1.4.1.4) will give

négative logarithm differentiated

~

energy terms

a

of terms

of any molecular system if the

«-"--«vr-.W

D

dF,

exception

as a sum

„-v'°.~A(v-'t"i>/*i>7'

n

=

integrated with respect to

of die coordinates

me

tiiermodynamic properties

v»f (i',)vv

given force-field term p,

a

the

of

identical

e

in

tiiat die atom

analytical représentation

analytical représentation and

same

an

of die form

solution. In

(Equation (1.4.1.4)), practice,


one

it is not

guaranteed that

fixes functional forms for die


flexibility

the

on

As

and calibrâtes die parameters. The

V",,^,

energy terms

entity incorporating

parameters corresponding

that die

for which die term

compounds

give

given

calibrated (Section

was

enough

so

(1.4.2.4)

that

picture

a correct

tiiat

guaranteed, however, 1.4.3. Coordinate Force fields

of the

usually

a

dynamical picture of

practice atoms

with

a

higher

valence

redundant, i.e.

internai coordinate

Hagler, 1991).

can

likely

to

A conformation of formamide is

hâve

an

example, the HCO, carbonyl

group,

assign spécifie vary in

cannot

redundancy is the is

case

of molecular Systems. It is not

will be obtained.

non-bonded interactions (even if in

me

systematic counting,

one

influence

HCN and OCN

angles

are

die six

a

set

however, tiiat tiiere

an

properties

independent

handled

differentiy

are

hydrogens (this only (this

nine

12

=

bonds, 6

necessarily redundant.
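The coordinate count for formamide can be spelled out as a short check. This is only a sketch: the function name is illustrative, the 3N-6 rule is the usual one for a non-linear molecule, and the numbers of bonds, angles, dihedrals and out-of-plane coordinates are the ones quoted in this section.

```python
def independent_internal_coordinates(n_atoms):
    """Number of independent internal coordinates (3N-6) of a non-linear molecule."""
    return 3 * n_atoms - 6


# Formamide, HC(=O)NH2, has six atoms.
N_ATOMS_FORMAMIDE = 6

# Valence coordinates found by systematic counting (values quoted in the text).
VALENCE_COORDINATES = {
    "bonds": 5,
    "bond angles": 6,
    "torsional dihedrals": 4,
    "out-of-plane coordinates": 2,
}

n_independent = independent_internal_coordinates(N_ATOMS_FORMAMIDE)
n_available = sum(VALENCE_COORDINATES.values())
n_redundant = n_available - n_independent

print("independent coordinates (3N-6):", n_independent)
print("available valence coordinates: ", n_available)
print("redundant coordinates:         ", n_redundant)
```

The difference between the two counts is the number of valence coordinates that cannot be varied independently of the others.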

mese are

dépendent since,

due to die

manner

from

one

to, say,

a

generic

from the odier

planarity

amide OCN

angles.

For

of die

force field to another, can

as

tiiree

not

possibilities: (i)

use a

easily generalizable to

angle,

since it

Valence coordinate

is illustrated hère for

be defined in émane,

only

torsional coordinate which involves

less

symmetrical cases), (ii)

induces asymmetry in the system and

use one

of

requires arbitrary choices)

or

a

force constant

(this may be computationally inefficient). Both choices (ii) and (iii)


one

groups with respect

(iii) accept the redundancy and calculate the nine four-body torsions with

by

5

they must sum up to 2π. This raises the question whether it is possible to and transferable

anodier. There

divided

of 3N-6 are

of

(Dinur &

that is, 17 available internal coordinates,

of ethane. Out of the nine torsional dihedrals

die dihedrals

problem

of formamide

fully specified by sees,

the energy. Five of

on

case

required to describe the relative rigid body motion of the two methyl

to one

molécule includes

a

from the others. The

be illustrated in die

angles, 4 dihedrals and 2 out-of-plane coordinates, all

are

transferability

linearly dépendent

are

redundancy

internai coordinates. By

averaging

than two, the valence internal coordinates themselves become

of them

some

also

die class of

When die

calculated in Cartesian coordinates). When

are

on

virtual

defined in terms of internal valence coordinates for the

covalent interactions, and atom-atom distances for the forces

dépendent

1.7.6.3).

a

means

die interaction function will be able

good solution,

assumption

and

really

is

selected for die energy terms

tiiermodynamic properties

a correct

redundancy are

has

term

environments. This

may be

term

performed correcûy and me analytical functions

process is

sensible

to a

a

force-field

possible

average effect of various

me

but radier characteristic of

spécifie molécule,

ensemble of molécules and conformations. Thus,

an

dépend

averaging process, the parameters characterizing die

not tiiose found in any

a are

solution will

quality of the

sensibleness of die selected functions.

conséquence of this second

a

force-field term

to

physical

and

are


found in current force fields. At last, the definition of relevant internal coordinates some

insight into

pyramidality

the

which choice may lead to die best transférable entities. For around

nitrogen

a

center

energy terms, it will be difficult to get at the

and

frequencies

is maintained same

time the correct

molécule

are

pyramidal

it will be introduced

dépendent

on

bond-angle vibrational

inversion barrier.

by limiting

Even if redundancy in the valence coordinates is avoided

3N-6, ultimately,

by

is about 0.16

a

the valence coordinates, and thus, the non-bonded interactions

interaction should induce strain

lengtii

tiieir number to

the non-bonded interaction. Distances within

example,

will introduce the strain effects into the valence coordinates. For

its

if

bond-angle potential

three

by

requires

example,

on

the central C-C bonds in

[nm], whereas in

most

tri-tert-butyl

equilibrium

force fields die

non-bonded

methane

diat

so

length

bond

is

about 0.152-0.153 [nm].

1.4.4. Choices made in the averaging processes

Choices to be made with respect

example

of

a

C-C bond. As

influence die effective A. The bond

pointed

lengtii

potential

in

a

to me

out

averaging process

in die two

given conformation (a potential of

energy term

will be discussed

of a

given compound

mean

force

using

die

the factors diat will

previous sections,

over

are

ensemble of

an

molécules) B.

A

possible explicit dependence

when différent C atom types C. A

possible explicit dependence

cross-terms

D. The For

are

on

topology through différent

classes of C-C

on

topology

and geometry

tiirough valence

implicit dependence

example, die

on

topology and geometry tiirough non-bonded strain

DREIDING force field

for biomolecules,

not

(Mayo

very accurate, but makes

option

et

al., 1990) excludes botii factors B and

parametrization easy.

a

a

This is

spécifie calibration of the chemical entities required for

simple

accurate for

In most force fields

B is used and différent classes of C-C bonds

depending on the connectivity and environment of the bonded atoms. allows for

coordinate

(Section 1.6.5)

C, which is perhaps

leads to

interaction function. In class II force fields, which

a

are

defined,

more

accurate,

given

are

purpose and

meant to

complicated by

are

required

for ail but

me

simplest Systems,

the interaction function is

the cross-terms, and parameters have to be calibrated all

togetiier

consistent way. On the other hand, the interaction function is very accurate and atom

two, C and

be very

molecules in vacuum, option C is mostly used. The inconvenience here is that

many parameters

since few

bonds,

used.

types hâve

to be

defined (e.g. in

H).


me

in

a

élégant,

CFF93 force field for alkanes,

only

1.5. General characteristics of the empirical interaction function

1.5.1. Interaction function parameters and molecular topology

An empirical interaction function, or loosely called a force field, V, is defined by its functional form and the parameters that enter into its definition, i.e. its interaction function parameters, $\{s_i\}$. In order

to

express tiiis latter

V

can

be used, where

=

V(

spécifie System,

to model a

?;{*,})

some

information

This results in very différent interaction at die an

electronic level exist. For

Na+ and

a

Cl

not

will hâve

If me ions

for

me

really required

a

a

From this

example,

exceeds

function like

our

régimes

or even

a

it is clear that

needs, (ii)

correct

a

analytical approximation

can

pair and

one can

a constant

around die

K^R-R^)2 gives (i)

die

degrees

me

more

analytical description

is

more

information about die ions,

the parameters

pairs

between atoms

be calculated

by solving

say

quite safely

diat die

parameters is likely molecular

to

a

again

can

distance

approximation

R^.

is

electronic

the

we

énergies

is sufficient, and

same

required

(iii)

me

to

me

other hand,

it will

require

régime (bonded, non-bonded)

and

It is also clear that die transition between

problem.

die framework of which functional form and

analogy,

be solved

However,

die true energy.

to

computationally cheaper,

Even when

me

interaction between two

functional form, die best choice of function

be différent if die bonds

topology information

potential

die quantum mechanical character of the

(K^.K^R^) spécifie to die pair. by die

die

for two separate ions. However,

equilibrium

of relative

intuitive and

namely, die

of bonded atoms is described

are

of freedom.

Schrodinger équation contains information tiiat

bypass

bonded and non-bonded régime will be

This

die interaction between

interaction, but radier captures die essential physics from its solution. On if

In order

required.

and R die distance between die ions.

reasonable

a

description

does not

is

relationships

molécule, the Schrodinger équation

séparations

System) die

complète.

topology

Hagler, 1991),

phase

useful, since is

if différent

&

électrons of the ion

closer and form

for différent internuclear know diat

not

principles techniques, empirical force fields

example (Dinur

K^/R dependence, where K

come

die molecular

on

ion at 1 [nm] in the gas

Schrodinger équation tilis is

any coordinate

energy function diat averages out die electronic

potential

on a

defining (in

This information is, however,

arises from die fact diat, in contrast to first based

(1.5.1.1)

is die 6N dimensional vector

q"

configuration of the molecular system.

die notation

dependence,

are

not identical. To

summarize, die

décide which interaction is to be treated in

using which

values for die parameters.

By

only proper molecular topology information required for an ab initio molecular orbital calculation at a certain level of theory, is the number of protons and electrons for each atom. Note that due to coordinate redundancy (Section 1.4.3), the specification of a molecular topology is in most cases not unique.
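The Na+/Cl- example of this section, where the same pair of atoms is described by a K/R dependence when the ions are separate and by a K(R-R_eq)^2 dependence when they form a molecule, can be sketched in a few lines. Everything specific here is an assumption made for illustration: the function names, the unit system (kJ/mol, nm, elementary charges), the harmonic force constant, and the equilibrium distance of roughly 0.236 nm. The only point is that the topology information ("bonded or not") selects both the functional form and the parameter set.

```python
# Electric conversion factor 1/(4*pi*eps0) in kJ mol^-1 nm e^-2.
COULOMB_CONSTANT = 138.935458

# Hypothetical harmonic parameters for a bonded Na-Cl pair (illustrative only).
HYPOTHETICAL_BOND = {"k": 5.0e4, "r_eq": 0.236}  # kJ mol^-1 nm^-2, nm


def coulomb_energy(r, q1, q2):
    """Non-bonded regime: K/R dependence for two point charges."""
    return COULOMB_CONSTANT * q1 * q2 / r


def harmonic_energy(r, k, r_eq):
    """Bonded regime: K (R - R_eq)^2 around the equilibrium distance."""
    return k * (r - r_eq) ** 2


def pair_energy(r, bonded):
    """Select functional form and parameters from the topology information."""
    if bonded:
        return harmonic_energy(r, HYPOTHETICAL_BOND["k"], HYPOTHETICAL_BOND["r_eq"])
    return coulomb_energy(r, +1.0, -1.0)


print("separate ions at 1 nm:", pair_energy(1.0, bonded=False), "kJ/mol")
print("bonded pair at 0.25 nm:", pair_energy(0.25, bonded=True), "kJ/mol")
```

Note that the two regimes use different zeros of energy, so the two numbers are not directly comparable; a description of the transition between them would need a function valid in both regimes.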

1.5.2. Atom types and combination rules

In

a

number of

considered

as a

charged

freedom. When

implicitly atoms.

point

mass

heavy

significandy reduces

This

with

constitute about 50% of

me

dynamics point

die number of

problematic

degrees

and enables correct

a

when the donor is treated

as a

(non-polar) hydrogens

négative

effects of the

quadrupole

moments

united atom,

steric effects (me united atoms

approach fails be

even

are

of

hydrogens

spherical). Finally,

non-polar hydrogens,

for

of

removing

me

a

high

larger time-step to intégrale force fields

handled

explicit hydrogens

serious for

not too

a

some

are

atoms

becomes mixed

use a

explicidy, whereas

ail

included into united atoms (see Table 1.2). The

are

suppression

(this is

of

hydrogen

modelling of me hydrogen bond

method, where polar (and possibly aromatic) hydrogens the other

atoms are

and 30% in DNA. From

advantage

me use

of

often

form so-called united

of freedom, since

proteins

of view, this also offers die

since

to

usually

degrees

internai

no

hydrogen

bearing tiiem,

total atom number in

équations of motion. However,

and

of biomolecules,

atoms mat are

frequency C-H bond stretching motion the

directionality

no

simulating large Systems

included into the

molecular

force fields, the basic unit is the atom, which is

empirical

and

an

there

are

die loss of

linked to are cases

dipole

carbons) and

and

loss of

a

where the united atom

explicit inclusion

of ail

hydrogens may

required for a proper description of the system (Müller-Plathe et al., 1992,

Kaminski et

al., 1994). Common force fields

atom). Thèse to

tiieir

(e.g.

are atoms

usually define

(or groups)

limited number of atom types

are

physically

(Hwang

et

information for

a

n-body interaction

spécifie system.

interaction term between dièse atoms,

The

n atoms a

irrespective to

chemically

to

united

(i.e. witii respect

force field to anotiier & Tirado-

facilitate the attribution of interaction

terms, while

assumption

one

(possibly

OPLS/proteins (Jorgensen

al., 1994), 65 in

Rives, 1988) ). The purpose of dièse atom types is function parameters to

and

alike. This number varies from

physical environment)

2 in CFF93/alkanes

a

which

generating die

molecular

is that the parameters s, for

of atom type a„ is

solely

determined

by

topology

an

n-body

the types of

tiieir environment, i.e.

$$s_i = s_i(\{a_\alpha \mid \alpha = 1 \ldots n\}) \qquad (1.5.1.2)$$
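Equation (1.5.1.2) states that the parameters of an n-body term depend only on the atom types of the n atoms involved. A minimal sketch of such a rule in table form is given below. The atom type names, the table layout and the numerical values are hypothetical (only the C-C bond length of about 0.153 nm echoes a value quoted earlier in this chapter), and the reversal symmetry of the key is an assumption that suits bonds, bond angles and torsions.

```python
def canonical_key(atom_types):
    """Order-independent key: the sequence i-j-k and its reverse k-j-i coincide."""
    forward = tuple(atom_types)
    backward = tuple(reversed(forward))
    return min(forward, backward)


# Hypothetical parameter tables indexed by atom types only (illustrative values).
BOND_TABLE = {
    canonical_key(("C", "C")): {"b0_nm": 0.153, "kb": 3.0e5},
    canonical_key(("C", "H")): {"b0_nm": 0.109, "kb": 3.0e5},
}
ANGLE_TABLE = {
    canonical_key(("H", "C", "C")): {"theta0_deg": 109.5, "ktheta": 400.0},
}


def lookup(table, atom_types):
    """Return the parameters s_i for a term, given only the types of its atoms."""
    return table[canonical_key(atom_types)]


print(lookup(BOND_TABLE, ("H", "C")))
print(lookup(ANGLE_TABLE, ("C", "C", "H")))
```

Because the environment of the atoms never enters the key, two chemically different C-C bonds receive the same parameters unless additional atom types are introduced.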

Such rules

are

called combination rules and

Depending

tiiey can

weakly (i.e. easily overriden)

be

preferred,

is to be

on

possible

Mark, 1992). One ofthe

die form of tables

of die atom type ofthe

of required parameters. For

also include atoms which

atoms and

the

van

of discussion (Section

environment

topology

editing

flexibility (van

offers

more

information

using

Gunsteren &

(and physically based) combination rules

of the interaction between two atoms is to each atom

given by

type. Combination rules for

next

complexity

function of die

as a

diat the

indicating

constituting

same

atom, which reduces

specified

atom

1.6.6.2).

Note diat in

principle

directly participating

example,

in the

covalendy

of the force field and, to

knowledge,

différent atom types

use

of few atom types has the

has

significantiy modified by

usually defined

two

are

our

advantage

to

of

never

still

by

die bonded

rapidly increase

been done. Instead, when

die type of

distinguish

are

interaction, but define die

bond type could be defined

a

amount

die combination rules could

bonded atoms. This would, however, very

is

be used

der Waals parameters, proper combination rules

are not

an atom

to

significantiy me

precisely.

die environment of

types. Thèse

parameter is

For

more

the

a

die type of rule,

strongly implemented. The former possibility

or

most well established

tables may include "wildcards",

matter

on

(bond, bond-angle, torsional dihedral angle and out-of-plane coordinates)

generally given in

irrespective

of the définition of

important part

of the molecular

product of die (point) charges corresponding

valence terms are

manual

law, where the magnitude

is Coulomb's the

génération

since

combination rules and

are an

die structure of die simulation program and

force field.

neighbouring

atoms,

die différent environments. The

simplicity and

ease

of

parametrization.
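The price of adding atom types can be made concrete by simple counting. The sketch below assumes that a bond type is fully defined by the unordered pair of atom types of the two bonded atoms (A-B and B-A being the same type), so that n atom types give n(n+1)/2 bond types. The function name is illustrative.

```python
from itertools import combinations_with_replacement


def n_bond_types(n_atom_types):
    """Number of unordered pairs of atom types, like pairs included."""
    return n_atom_types * (n_atom_types + 1) // 2


# The four carbon atom types used as an example in this section.
carbon_types = ["C(sp3)", "C(sp2)", "C(sp)", "C(aromatic)"]
carbon_bond_types = list(combinations_with_replacement(carbon_types, 2))

print(len(carbon_bond_types), "bond types for", len(carbon_types), "atom types")
print(n_bond_types(20), "bond types for 20 atom types")
```

The quadratic growth of the table size is the reason why a finer distinction of environments quickly becomes a parametrization problem.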

For example, if four atom types are defined for carbon (C(sp3), C(sp2), C(sp) and C(aromatic)), only 10 bond types have to be parametrized, but the sensitivity of the bond behaviour on the environment is low. Twenty types would surely allow to account better for the detailed influence of the chemical environment, but this would then imply the parametrization of as much as 210 bond types, which may be a hard task.

1.5.3. Expression for the classical Hamiltonian

As in the quantum description of a molecular system, the classical Hamiltonian (total energy of the system) depends simultaneously on the coordinates and the momenta of all particles in the system. In a similar manner as in Hartree-Fock calculations, where the electronic Hamiltonian is approximated by a sum of one and two electron operators, the classical Hamiltonian can be approximated by a sum of n-body terms

3W(^,i)

E iVK&yvvty

-

]

E E 0

phase shift,

and

which

(1.6.3.1.2)

(n)ô,

plays

die

0,7t

=

same

rôle

as

sign

the

is\, in die first formulation. Since die slope of the potential has to vanish at 0 and n, die only possible values of (n)ô, are 0 and 7t. If (n)k4) is négative or '"'Ô, is 0, die term has a maximum for d) 0. If {°\ is positive or ) + "'k^ (1 cos2(b) +=£

bonds

bonds j>i

i

*„,(V-W-ty

This term is present in CVFF and CFF93. Since k is

asymmetric B.

bond

Bond-angle

-

stretching

bond

around

a

given

coupling (two bonds j

with bond i)

atom

(1.6.3.1.1)

tiiis term favours

positive,

site.

involved in the

angle i)

(2)

E V., := E

angles

l

reproduce

This term is used in CVFF, CFF93, MM2 and MM3 to

frequencies stretched

or

bond-angle C.

and

Bond-angle

me

bond

compressed.

effects in strained molécules where

length

Since k is

a.6.5.1.2)

tonds j

positive,

bond

lengtiiening

a

vibrational

bond-angle

is

is favoured when the

is reduced -

bond-angle coupling (angles j sharing one

common

bond witii

angle i)

( 10)

£9e

({e(,e;};{er,e;,feee,,y})

=

E *»-.* (tf-W-e,)

E angles

i

This term is present in CVFF, CFF93 and MM3. It is used to

frequencies for coupled bending D.

Torsional-angle

-

bond

modes, k may be

positive

or

coupling (central / peripheral bonds j

hb MM ; i*/.( °WwWfflW--i>

(1.6.5.1.3)

angles j

reproduce

vibrational

négative.

involved in torsion i)

=

,,,.,„

(l)or(2) dihedrals

(l.O.D.l.t)

E ty'-bj) [%>,

E i

cos

bonds j

, +

(1.6.6.1.13)

:

Buffered function n-m

A buffered 14-7 energy function has been proposed (Halgren, 1992)

$$E_{buf\text{-}n\text{-}m}(\{r_{ij}\};\{\varepsilon(i,j),R_{min}(i,j)\}) = \sum_{atoms\ i}\ \sum_{atoms\ j>i} \varepsilon(i,j) \left(\frac{1+\delta}{\rho_{ij}+\delta}\right)^{n-m} \left(\frac{1+\gamma}{\rho_{ij}^{\,m}+\gamma} - 2\right) \qquad (1.6.6.1.14)$$

where n=14, m=7, δ=0.07 and γ=0.12, these parameters being obtained from a best fit to experimental rare gas data. Note that with these values of δ and γ, the minimum of the function is at (0.996; -1.0006), and Equation (1.6.6.1.14) thus nearly satisfies the conditions in (1.6.6.1.2). The (reduced) curvature at the minimum, κ(i,j), is in this case 79.6, and the reduced intercept, ζ(i,j), is 0.89, both close to the Lennard-Jones value.
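The values quoted for the buffered 14-7 function can be checked numerically. The sketch below is written under stated assumptions: the buffered function is taken in the reduced form ((1+δ)/(ρ+δ))^(n-m) ((1+γ)/(ρ^m+γ) - 2), with ρ the distance in units of R_min and the energy in units of ε, and the n-m functions used for comparison are taken as (m ρ^-n - n ρ^-m)/(n-m), a form chosen here because it places the minimum at (1, -1). All function names are illustrative.

```python
import math


def buffered_n_m(rho, n=14, m=7, delta=0.07, gamma=0.12):
    """Reduced buffered n-m function (energy in units of epsilon)."""
    damping = ((1.0 + delta) / (rho + delta)) ** (n - m)
    return damping * ((1.0 + gamma) / (rho ** m + gamma) - 2.0)


def reduced_n_m(rho, n=12, m=6):
    """Reduced n-m function with its minimum at (1, -1) (assumed form)."""
    return (m * rho ** (-n) - n * rho ** (-m)) / (n - m)


def golden_minimum(f, lo, hi, tol=1e-10):
    """Golden-section search for the minimum of a unimodal f on [lo, hi]."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    while b - a > tol:
        c = b - inv_phi * (b - a)
        d = a + inv_phi * (b - a)
        if f(c) < f(d):
            b = d
        else:
            a = c
    return 0.5 * (a + b)


def bisect_zero(f, lo, hi, tol=1e-12):
    """Bisection for the sign change of f between lo and hi."""
    f_lo = f(lo)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        if (f_mid > 0.0) == (f_lo > 0.0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def curvature(f, x, h=1e-4):
    """Second derivative by central finite differences."""
    return (f(x + h) - 2.0 * f(x) + f(x - h)) / (h * h)


rho_min = golden_minimum(buffered_n_m, 0.9, 1.1)
e_min = buffered_n_m(rho_min)
intercept = bisect_zero(buffered_n_m, 0.8, 0.95)
kappa_buffered = curvature(buffered_n_m, rho_min)
kappa_12_6 = curvature(reduced_n_m, 1.0)
kappa_9_6 = curvature(lambda rho: reduced_n_m(rho, n=9, m=6), 1.0)

print(f"buffered 14-7: minimum at ({rho_min:.3f}; {e_min:.4f})")
print(f"buffered 14-7: intercept {intercept:.2f}, curvature {kappa_buffered:.1f}")
print(f"12-6 curvature {kappa_12_6:.1f}, 9-6 curvature {kappa_9_6:.1f}")
```

For the assumed n-m form the curvature at the minimum is simply the product n·m, which gives 72 for the 12-6 function and 54 for the 9-6 function, the reference values against which the buffered function is compared.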


In Figure 1.6 the various reduced energy functions mentioned above are displayed, with parameters corresponding to a curvature at the minimum of 72 (reduced units, the curvature of the Lennard-Jones function), except the 9-6 van der Waals function (curvature 54) and the 14-7 buffered function (curvature 79.6). As can be seen, the range of these interactions is short and they will play an essential role only for direct neighbour atoms.

[Figure 1.6, plot not reproduced. Panel title: van der Waals functions (curvature at minimum = 36, except for the 9-6 function). Horizontal axis: reduced distance ρ_ij.]

Figure 1.6: Representation of the van der Waals interactions, in a reduced form, corresponding to Equation (1.6.6.1.6) with m=6 and n=12 or m=6 and n=9, Equation (1.6.6.1.10) with m=6 and f(i,j)=13.77, Equation (1.6.6.1.12) with a(i,j)=6, and Equation (1.6.6.1.14) with n=14, m=7, δ=0.07 and γ=0.12.

Due to the intrinsically small magnitude of the energies involved (ε = -0.1 [kJ/mol] for He-He to -2.3 [kJ/mol] for Xe-Xe), the divergences between the different functions above ρ_ij = 1 are likely to affect the overall energy in a minor way in condensed phase systems. This may of course not be true for gaseous systems. On the other hand, the steepness of the function below ρ_ij = 1 will influence the density and compressibility of a condensed-phase system. When electrostatic effects are present, the balance between this steep repulsion and the electrostatic interaction will be the determinant part of packing forces.

Since the van der Waals parameters (ε, R_min, curvature determining parameters), just as atomic charges, are effective parameters, they can be adjusted so that virtually any one of the above functional forms can give reasonable results. Of course, after such an adjustment, condensed-phase effective van der Waals parameters may not be suitable anymore to give a proper representation of the gas-phase state. Since the small energetical contributions for nearest neighbours will sum up for a large number of pairs, a proper choice for the parameters ε and R_min seems primordial, and combination rules (Section 1.6.6.2) should be considered carefully.

1.6.6.2. Combination rules

Because the definition of N atom types Waals interaction parameter sets for atom which

dépend

homonuclear

sets

on

&

al., 1984, Halgren, 1992), R^,,,

extent,

A.

although

Géométrie

RnJfJ) The

following

often

formally

meansfor

two rules are

Jones function and any

C6(ij) or

B.

a(ij)

Géométrie The

mean

for

following rules

KmJ'V)

=

e

md to

and

be calibrated

by studying

Since die

are

die

expérimental

interchangeable

M^WJ)

=

Equation (1.6.6.2.1)

Ca(ij)

to a

large

for the

(1.6.6.2.1) case

JCn(i,i) Cn(jj)

=

and

e(ij)

équivalent

mean

for die

JC^JV)+*„.

otherwise

atomic cutoff (AT) truncation scheme whereas

(2.1b)

cutoff (CG) truncation scheme.

(AT) scheme first. The force

consider the atom based truncation

by differentiating the négative

on atom

i is

of the interaction energy (2. la) with respect to r„

to

SQ'

j,J*l

where

J

(2"la)

complementary Heaviside function

describes

Equation (2.1a)

obtained

AT

Rc reads

$R_{IJ}$ is the distance between the corresponding charge-group

( 1

a

distance

l

respectively dénote

H^

describes

at a

permittivity of the médium between the charges (usually one for free

respectively belong,

centres.

located at atomic sites and that ail

straight cutoff (SC) truncation

a

4ne,/,/cc)

dH(x;Rc)/dx

-

is the Dirac delta function. The second

infinitesimally

^

+

VSC

the

on

an

simulating

the

corresponding

Vsc 0FF

+

V0FF

AT

(2.5)

92

The second term,

Rg,

shifts the

V0FF,

such that it vanishes at

R,..

potential

gives rise

It

distances from

the system limit of at

Rc,

(i.e.

very

a

R,., expression (2.5)

die work

large

H(r,j;Rc)

will

required to bring charges together

cause

in (2.1a)

function S is called

a

and die

Rc,

dynamics,

which

artifacts in simulations. An alternative is to

replace the step

a

smoother

switching

replaced by force

a

et

in

S(r1];R„Rc)

unity for r.jSR, et

term in

et

R,

,,j,i

interaction

by differentiating die négative

water oxygens,

-1

(proportional

dipole-dipole

I and J. If the centres of the

present simulations the

(van Gunsteren & Berendsen, 1987,

reduce artifacts of

interaction

with respect to r,. Since the Heaviside function

charge groups

to

_3

charge groups

this force acts

lSr \ 4tie e,

dépends

on

')

(proportional

RB,

a

where atom

Nœ is

k

as

given

this

to

£ tùu({ru,kel,lej))

charge

charge

of

The

on

atomic sites, in die

atom

and

we

i

U

-£-

Ru

CG

*

-

die total number of

belongs

r"3).

by

hâve

"

assumed to be

to one

(2.8b)

R%

,

vector

a

(2.8a)

origin, r=0, solely the 1=0

die 1=1 term may generate

a

term

field

95

in

Equation (2.8b),

as

far

as

so

that

only thèse two terms of the multipole expansion

MD simulation is concerned

potential (frVu obtained

with r=Ô" and

î^ï

-•= E"

where the j-sum

équations,

we

ail

runs over

e,-e,

(2-10a)

t2(t2~ei)

q>

-

charges qt (including qj inside die cutoff sphère of i. In dièse potential (2.10a) due to die spherically by

in the continuum

functions of

derivative of

r so

that it cannot be

dielecttic continuum, die

following

neutral

charge density

charge groups (water molécules).

sphères,

thèse two

Clearly,

expected

inside die

polarization

sphère

induced in

quantities

has to be

as

the

on

zéro

everywhere

case

of

In diis case,

the most narural choice for

a

a

be valid in die

outside of die cutoff.

neutral System

consisting

fOT the whole System may be calculated

v~

-

£/ •

"

0 1

where the second groups are

preserving

the

neutrality

of the cutoff

by reversibly charging

die atoms

one

after

respective cutoff sphères

4*?r V1 f 1 *+ tq>H(R-^ H1ltotl 1

E

sum runs over

E2

KRF

>


Rc

as

if the atoms

were

scheme,

new

one

water

and the full

to

as one

still at

a

avoid any

of die atoms

distance Rc,

would still like to préserve the

R^.

When two

charge

me

validity

a

distance

Rc,

ail

charge pairs

at a

distance

Figure (2. lc,d)

is illustrated in

of

(2.8)),

as

is

r1J>Rc

for die

=

neutrality

one

pair of atoms

1), but also possibly

could be

spécial

case

of die

but avoid interaction

groups I and J hâve at least

Rc (expressed as H(min{rkl,keI,leJ};R(.)

beyond Rc,

some at

artificially replaced Rg^R,..

at

This leads to

following energy expression

V

=

vborn

+

*W

+

i E 2 ,

ë,-e,

i —

can

will be a

a water

the continuum also when the central atom

Rc+d(0-H)

k

charge group

a

tiiis

distance

as

Figure (2.1b),

to

however, the hydrogen of the central

that of die continuum and

a

As

We

Rgp in (2.16)

>

(AT) truncation scheme described below

atomic

at a distance below

die

ry

This may reduce the effect, but not eliminate it.

sphère (dius

distances

opposite dipole.

of distances

would have to be considered instead of solely the Born and

R,^

D. Treat any interaction at distances

cutoff

occurrence

completely inside die

an

charge-charge distances larger than R^.

belonging to it. done in

if it had

and the continuum.

as

Equation (2.8)

in

term.

occurrence

C.

molécule is as

O-H distance in

molecule would be off-centre with respect

dipolar

water

gap between the furthest

hydrogen, Figure (2.1b).

expansion

a

interacts

water.

RRF2Rc+2d(0-H), a

happen

R„f beyond Rc. Ideally, according

A. increase

B.

even

région (lower left of the picture),

+

——

r„

E

j%

H(min{ru,kel ,lsJ} ;RC) "

—+C

\H(r;Rr)

e.-e.

ri

i +

-L

c

+

—ï—!

W,

4itE0E1

R. i £_+H

AT

(2.18)

[\-H(r ;« )]

As can be seen from the Figure, the multipole moments of water molecule pairs close to $R_c$ are strongly affected by this procedure. The leading effect can be roughly described as a progressive removal, at intermolecular distances close to $R_c$, of the component of the two dipoles along the intermolecular vector. Provided that all charge groups are neutral, the sum of $q_i q_j H(\min\{r_{kl}, k \in I, l \in J\}; R_c)$ over all charge pairs from different charge groups is zero. Noting additionally that $H(\min\{r_{kl}, k \in I, l \in J\}; R_c)\, H(r_{ij}; R_c) = H(r_{ij}; R_c)$, Equation (2.18) can be rewritten

\[
V = V_{BORN} + V_{SELF}
+ \sum_{i}^{N_{AT}} \sum_{j,\,j>i}^{N_{AT}} \frac{q_i q_j}{4\pi\epsilon_0\epsilon_1}
\left[ \frac{1}{r_{ij}}
+ \frac{\epsilon_2-\epsilon_1}{2\epsilon_2+\epsilon_1}\,\frac{r_{ij}^2}{R_{RF}^3}
- \left( \frac{1}{R_c} + \frac{\epsilon_2-\epsilon_1}{2\epsilon_2+\epsilon_1}\,\frac{R_c^2}{R_{RF}^3} \right)
\right] H(r_{ij};R_c)
= V_{BORN} + V_{SELF} + V_{SC} + V_{DRF} + V_{OFF}
\qquad (2.19)
\]

and thus this scheme turns out to be based on an atom-atom distance criterion for truncation. The contributions occurring in Equation (2.19) have been denominated $V_{BORN}$ (Born term, Equation (2.11)), $V_{SELF}$ (interaction of a charge group with self-induced dipolar reaction-field, Equation (2.14)), $V_{SC}$ (straight truncation, Equation (2.1a)), $V_{DRF}$ (dipolar reaction-field) and $V_{OFF}$ (offset term). When $\epsilon_2 = \epsilon_1$, Equation (2.19) becomes Equation (2.5), and the SC/AT scheme is thus a particular case of the RF/AT scheme corresponding to this condition.
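The pair contribution of the RF/AT scheme can be sketched in a few lines. This is an illustration only, assuming that the pair term consists of the straight Coulomb term, a dipolar reaction-field term proportional to $r_{ij}^2/R_{RF}^3$ and an offset that makes the energy vanish at $R_c$; the function names, the unit system (nm, kJ/mol, elementary charges) and the conversion constant are assumptions of this sketch and do not come from the original text. The Born and self terms are left out.

```python
COULOMB = 138.935458  # 1/(4*pi*eps0) in kJ mol^-1 nm e^-2 (assumed unit system)


def rf_factor(eps1, eps2):
    """Reaction-field prefactor (eps2 - eps1) / (2*eps2 + eps1)."""
    return (eps2 - eps1) / (2.0 * eps2 + eps1)


def rf_at_pair_energy(qi, qj, r, r_c, r_rf, eps1=1.0, eps2=78.0):
    """Pair energy: straight-cutoff + dipolar reaction-field + offset terms."""
    if r >= r_c:
        return 0.0  # complementary Heaviside function H(r; R_c)
    c = rf_factor(eps1, eps2)
    pref = COULOMB * qi * qj / eps1
    straight = 1.0 / r
    dipolar_rf = c * r ** 2 / r_rf ** 3
    offset = -(1.0 / r_c + c * r_c ** 2 / r_rf ** 3)
    return pref * (straight + dipolar_rf + offset)


def rf_at_pair_force(qi, qj, r, r_c, r_rf, eps1=1.0, eps2=78.0):
    """Radial force -dV/dr for r < R_c (positive means repulsive)."""
    if r >= r_c:
        return 0.0
    c = rf_factor(eps1, eps2)
    pref = COULOMB * qi * qj / eps1
    return pref * (1.0 / r ** 2 - 2.0 * c * r / r_rf ** 3)


# Example: two like charges of 0.41 e, cutoff 0.9 nm, R_RF = R_c
print(rf_at_pair_energy(0.41, 0.41, 0.5, 0.9, 0.9))
print(rf_at_pair_force(0.41, 0.41, 0.5, 0.9, 0.9))
```

Setting `eps2 = eps1` removes the reaction-field term and leaves a shifted Coulomb interaction, while a very large `eps2` with `r_rf = r_c` makes the force go to zero as the distance approaches the cutoff.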

Apart from the term $V_{SELF}$, which accounts for the self-energy of isolated charge groups, the interaction energy vanishes at $r_{ij} = R_c$. Again ignoring the dependence of $R_{IJ}$ on $\mathbf{r}_i$ and $\mathbf{r}_j$ in $V_{BORN}$, and for rigid charge groups, the force corresponding to this interaction is

\[
\mathbf{F}_i = \sum_{j,\,j \neq i}^{N_{AT}} \frac{q_i q_j}{4\pi\epsilon_0\epsilon_1}
\left\{
\left[ \frac{1}{r_{ij}^3} - \frac{2(\epsilon_2-\epsilon_1)}{2\epsilon_2+\epsilon_1}\,\frac{1}{R_{RF}^3} \right] H(r_{ij};R_c)
+ \left[ \frac{1}{r_{ij}} + \frac{\epsilon_2-\epsilon_1}{2\epsilon_2+\epsilon_1}\,\frac{r_{ij}^2}{R_{RF}^3}
- \frac{1}{R_c} - \frac{\epsilon_2-\epsilon_1}{2\epsilon_2+\epsilon_1}\,\frac{R_c^2}{R_{RF}^3} \right]
\frac{\delta(r_{ij}-R_c)}{r_{ij}}
\right\} \mathbf{r}_{ij}
\qquad (2.20)
\]

Since the delta function term is always zero, energy conservation is satisfied exactly. Additionally, in the limit $\epsilon_2 \gg \epsilon_1$ and when $R_{RF} = R_c$, the total force vanishes at $r_{ij} = R_c$. Thus, the interaction defined in (2.19) implicitly contains a physically reasonable shifting function, so that cutoff artifacts are avoided to a large extent, and the reaction-field correction, which is known to have beneficial effects on the dielectric properties of liquids, is included. Note that for both the RF/CG and RF/AT schemes, $R_{RF}$ may be chosen independently from $R_c$. In both cases, $R_{RF} \geq R_c$ seems reasonable both to account for the thickness of charge groups at the cutoff distance and to limit (RF/CG)

Chapter

2

occurrence

of

inter-charge

distances

101

whereas

beyond Rgp occurrence.

at

R^

if the choice

Rc by 0.1 [nm] simulated

Rrf^Rc is made.

were

Both

attempted hère of

properties

Comparison

2.3.3.

will increase

Riu7t_0,M,2A.

where r„(t) is the location ofthe oxygen atom of molécule between two stored frame

?0(t+At)

trajectory

frames

selected

die

was

as

average diffusion constant, D

linear

régression

corresponding

line

the

same

the

régressions

where

=

5„(t)

is

by monitoring,


T_0,Al,2At


2At

L_

(2.26)

tin_,

expression may vanish,

in which

case

c„(r)

The average distribution, c(r), and its standard deviation, $\sigma$(c(r)), were

was

,A

^

For low values of r, the denominator of this

arbitrarily set to zero.

the

of (fourty) water molécules a, die function

Pl '««f*
)r-0.

P*a

permittivity, ê(r), obtained

B.

a

+

is die number of molécules

dépendent dielectric

function

a

1

1

size

spherical volume.

a

=

([ X„(r)

using Equations

Appendix

in

that the unit cell is

with the volume of the

molécules

1

with

given

also

the Fourier transformation of

on

assumes

evaluated in

evaluate

=

are

truncated octahedral). This formula need not, however,

properties

to

e„(r)

where

is based

in three dimensions and

be valid when the fi and E tensors

used

permittivity s^ of me

reaction-field correction. Values of K calculated

(B. 19,21,24) with the present values of R,., Rrp and t^ convolution

quantities.

(see Appendix B). It may depend on the cutoff radius, $R_c$,

die simulation

during

the standard deviation of these

calculating

constant

the simulations into a number of

by dividing

obtained

e

.,.-

for which r„psr

at

time

and its standard déviation,

by averaging

the

curves

£„(r)

using the convention described above of including Chapter

2

a

over

t.

The average

a(E(r)),

over

core-

the set

the set of molécules

single periodic image into

110

averages,

one

criterion for

has

from

E„(rïRext)=E

comparison

The dielectric relaxation time water

molécules a, and for

moment

Equation (2.29b)

for any molécule

a.

I(r)

is

cores

was

estimated

by monitoring,

of increasing radii

r

for

a

séries of

,Ma(r,t+x).Ma(r,x) t V W^,

Afa(r,T)

=

p

E P. r„p

Core-size

dépendent

inverse of the

slope

function of t at fixed Estimâtes of

Finally, e^WTrfe],

states

(fourty)

around water molécule a, the

*.V.*)

with

van

useful

dipole

autocorrélation function

=

&

a

between différent electrostatic schemes.

TM(r)

r

régression

between t

were

=

0.5

[ps]

TM(r)

fitting

lines

averaged

and die time of first

properties,

le.

estimated for five of die simulations

calculated

were

the

obtained for successive intervais in

self-consistent dielectric

were

r

dielectric relaxation times, of linear

(2.31)

?„(T)

r

occurrence

In

négative

$(r,t)

as a

$(r,t)




0006

1 181

1 14

0 538

113

0 680

113

0 979

Q, [kJ/(mol ps) 0 480

[g/cm3]







ATM

298 1




AT/2

45




300 2




CG/4

300 8




T[K]


R„, see also Figure (2 7), D diffusion constant, Table 2 3, corrected the followmg values are corrected by perturbation to dielectnc self-consistency (convergence was reached wittun at most 20

scalar dielectnc permitûvity, defined

from Table 2

e

properties Properties calculated for the simulation box and summary of orner important properties, G of me T tensor, Equation (2 28), GK fmite System Kirkwood G factor, defined as Tr[£-TJ,

Table 2.4: Structural and dielectric

699±13 7(68 3±9 5)

12 5

66

114

E«y Ek £jx

C

58 9

59 9

661

E«x *-yy CM

649

2 442±0 19

2 602±0 25

GK





AV^IU/mol]

-017

-013

AV», [kJ/mol]

83 6

5 33

D[10'm2/s] 5 26

36

40

Corrected:

0 054 0 005

0 781

63 3±6 5 (63 1±6 2)

-04

605

XM[Psl

X, fPSl

CG/5 0 753

2 260±0 15

-0008

0 012

-0 010

0 743

27

(68 1±10 6)

12 1

847

0006

0 002

0 087

0 897

29

70 0±17 2

87

119

C

603

651

2 374±0 23

-0028

0 014

0 043

0 055

0061

E«* eyy Em

r

r

CG/2

0 734

0 815

•=„ E„ C„

GK

r

r

G„GffG„ O^ G„ Gp

123

average square

r

r, cos0

f

r,

with

(A.7a)

based truncation scheme (see

r,0 and 0. When (A.7a) is

ï and

hâve

*>

(A.4b) through

spherical by the

r,R-2l-l

introduced after die second summation in (A.4b,5b) to avoid the

range of

be calculated from

Defining

a

expressions also lead

the

(2.8a)

(/+l)e2+/e,

(A.3b) requires

of

use

(/^1)(E2-E1)

^^Ï'^O'

and

Note diat die

%

4716^,

-

-J-

)

(A.9b)

r

where matter

©^©(f^r). Note that when 0=0 or 0=ti, ëx cannot be defined uniquely. This does not since the ex component vanishes

(sin0=O

and

ôcb^ryô©^,

see

(A. 11b)). From

(A.4b) and using the relation

\[
(1-x^2)\,\frac{d}{dx} P_l[x] = -\,l\,x\,P_l[x] + l\,P_{l-1}[x]
\qquad (A.10)
\]

we find

l*RF&=t t'icjr'-iptcotd] or

j

x

-

*"

=

(AUa)

M

v^ [Ê''c,/'''sin"1e(cose^,[cos0]-/', ,[cos0]) if 0*0,71 r\'n 10

otherwise

Chapter

2

(A-11

135
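The Legendre relation (A.10) used in this step can be checked numerically. The sketch below is only an illustration: it builds $P_l[x]$ from Bonnet's recurrence and approximates the derivative by a central finite difference, so the residual is expected to be small rather than exactly zero. The function names and the step size are choices made for this example.

```python
def legendre(l, x):
    """Legendre polynomial P_l(x) from (n+1) P_{n+1} = (2n+1) x P_n - n P_{n-1}."""
    if l == 0:
        return 1.0
    p_prev, p = 1.0, x
    for n in range(1, l):
        p_prev, p = p, ((2 * n + 1) * x * p - n * p_prev) / (n + 1)
    return p


def relation_residual(l, x, h=1e-5):
    """Left-hand side minus right-hand side of (1-x^2) P_l' = -l x P_l + l P_{l-1}."""
    derivative = (legendre(l, x + h) - legendre(l, x - h)) / (2.0 * h)
    lhs = (1.0 - x * x) * derivative
    rhs = -l * x * legendre(l, x) + l * legendre(l - 1, x)
    return lhs - rhs


# Largest residual over a few orders l >= 1 and arguments in [-0.9, 0.9]
worst = max(abs(relation_residual(l, -0.9 + 0.1 * k))
            for l in range(1, 7) for k in range(19))
print(worst)
```

The relation is needed for l >= 1 only, since the l=0 term does not contribute to the field.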

Note that the l=0 term vanishes in both

Consequently, the prime

after the second summation

rj=0. Finally, ôdjgj^rVôO

when

and has been omitted in the summations.

cases

now means

0=O,Tt

vanishes for

3cos0/30 vanishes. Combining (A.9) and (A. 11) leads

1

-

c;,

4^

l

Substituting

Oa-^Rup. Finally, is not

a

(A.4b)

case

suitable centre for

cannot be used.

r

expansion

Equation (2.8b)

leads to

along

die

to =

any Une

o^r) in a séries

of

lim

can

thus be used also for r=0,

r°P0[cos 0] prime

prime

is one,

just

as

well

the

after the summation

reasoning

applied

may be

vanish in (A. 12) when

r

(8a)

sign

to

range of

origin

of Legendre

term

origin

angle

©(r^r) with the ?

E^r)

in the

can so

one can

and

write

vector. In

except die 1=0 term.

zéro

j-sum

has

r

this

Equation

=0. The

meaning

of the

The

meaning

of the

The

same

case.

in (2.8a) is extended to include this

goes towards 0,

sphère

polynomials,

(A. 13)

is extended to include mis

and it

validity

of the

die 1*0 terms and widi die convention that

excluding a

a

«„

s

r

**

6

forming

when

as

after the summation sign in

(A.12)

RR

s

r


(r) R(r'-r) P(f)

where R is the interaction tensor describing the

]

dipole-dipole interactions

Chapter

2

(B.6)

in the system. If

139

die electrostatic

used in the simulation is the

entity

is exact. If this

Equation (B.6)

entity is

charge (as

the

(B.6) will only include die (leading) dipolar groups,

for

(Ewald),

to

a

truncated at

by

a

depending

given

a

distance

can

be recast

R(v)

I(v) is the

correction tensor,

m

c

=

die

following

approximations

charge

one can see

dérivation

used in the

—{— vvi 47tE0Ej

=

Âv) l I(v)

between distances

In ail thèse cases,

+

R,

E(v),

C ]

and

R,.

where

v

(B.7)

interaction tensor and Q is

for which

me

reaction-field

are

v^

—!— 47tE()Ë1

=

V

zéro

tensor.

as

dipole-dipole

exact

expressions

y3

rT»

.

=

—!— 4lte0ei

-5

[3vYvô-ôy6v2]

v5

(B.8)

1

_J_J^ÙJ_1

=

47te„e, o

In

on

switched to

Rj,

reaction-field "interaction"

short notation for r'-r,

where

model), Equation

deal with the electrostatic interactions, the tensor may be either almost exact

or/and corrected is

Stockmayer fluid),

of the interaction between neutral

term

from symmetry arguments that the inclusion of these terms in the derivation would not affect the result. In both cases,

simulation

a

for die SPC water

higher-order multipoles. However,

the contribution from

neglecting

dipole (as

\

2e.+e, 2

In Equation (B.7), f(v) is a function characterizing the (isotropic) truncation or switching of the interaction at selected distances. In the following, we will assume that this function is equal to one, at least in a finite neighbourhood of v=0 (this may not hold when shifting functions are used). In the special case where $\Omega$ is infinite and periodic along three orthogonal vectors, i.e. an infinite collection of periodically replicated rectangular unit cells U, Equation (B.6), which has the form of a convolution integral, may be Fourier transformed

P(k)

Since

=

Ë0is homogeneous in

c, (

£

space,

-

I )

we

[Eo(k)

hâve

Chapter

2

+

R(k) P(k)]