PCCP

Published on 05 September 2013. Downloaded by The University of Manchester Library on 02/10/2013 08:15:41.

PAPER


Cite this: Phys. Chem. Chem. Phys., 2013, 15, 18249

Unified approach to multipolar polarisation and charge transfer for ions: microhydrated Na+†

Matthew J. L. Mills,‡ab Glenn I. Hawe,§ab Christopher M. Handley¶ab and Paul L. A. Popelier*ab

Received 29th July 2013, Accepted 5th September 2013

DOI: 10.1039/c3cp53204f

www.rsc.org/pccp

Electrostatic effects play a large part in determining the properties of chemical systems. In addition, a treatment of the polarisation of the electron distribution is important for many systems, including solutions of monatomic ions. Typically employed methods for describing polarisable electrostatics use a number of approximations, including atom-centred point charges and polarisation methods that require iterative calculation on the fly. We present a method that treats charge transfer and polarisation on an equal footing. Atom-centred multipole moments describe the charge distribution of a chemical system. The variation of these multipole moments with the geometry of the surrounding atoms is captured by the machine learning method kriging. The interatomic electrostatic interaction can be computed using the resulting predicted multipole moments. This allows the treatment of both intra- and interatomic polarisation with the same method. The proposed method does not return explicit polarisabilities but instead predicts the result of the polarisation process. An application of this new method to the sodium cation in a water environment is described. The performance of the method is assessed by comparison of its predictions of atomic multipole moments and atom–atom electrostatic interaction energies to exact results. The kriging models are able to predict the electrostatic interaction energy between the ion and all water atoms to within 4 kJ mol⁻¹ for any of the external test set Na+(H2O)6 configurations.

a Manchester Institute of Biotechnology (MIB), 131 Princess Street, Manchester M1 7DN, UK
b School of Chemistry, University of Manchester, Oxford Road, Manchester M13 9PL, UK. E-mail: [email protected]
† Electronic supplementary information (ESI) available. See DOI: 10.1039/c3cp53204f
‡ Current address: Department of Chemistry (SGM 418), University of Southern California, 3620 McClintock Avenue, Los Angeles, CA 90089, USA.
§ Current address: School of Computing and Mathematics, University of Ulster, Shore Road, Newtownabbey BT37 0QB, UK.
¶ Current address: Lehrstuhl für Theoretische Chemie, Ruhr-Universität Bochum, D-44780 Bochum, Germany.

This journal is © the Owner Societies 2013

1. Introduction

Ions in solution are critical to the action of a number of biological and chemical systems. Monatomic ions in particular are important for a great number of biological processes, such as the transport of ions across membranes, nerve signal propagation, osmosis and the folding of proteins, peptides or DNA.1–3 Many of these processes rely on the ability of biological systems to discriminate between different ions.4 One of the most actively studied examples of systems that involve ion–water interactions is the mechanism by which ozone depletion occurs.5

There are several challenges in simulating the behaviour of any of the above examples. Long-range electrostatic interactions must be accommodated, a proper description of charge transfer is needed, and the anisotropic polarisation of the ionic species must be included. In addition, the challenge of understanding the structural preferences of ions, ion selectivity of proteins, and the well-known (but little understood) trends observed by Hofmeister6 and Heydweiller7 are persistent problems.

There are a number of case studies that have concluded that there is a need for the inclusion of polarisation in simulations of ions. In particular, there are problems in simulating ions on the water liquid/vapor interface,8,9 and the nucleation of ions in solution.10 Water is highly polarisable and has a permanent dipole moment, which causes it to readily solvate polar species and ions.11,12 Ions are also polarisable, and become more polarisable with increasing ‘‘size’’, which is influenced by both an increase in the number of electrons around the nuclei, and the increase in average distance between the electrons and the nuclei as orbitals fill up.

For chemical systems of significant interest we require models that allow for simulation of the dynamic behaviour of many thousands of atoms (e.g. ref. 13) over very long time frames, covering 100 ns to 2000 ns. For this reason ab initio methods are not suitable, as solution of the Schrödinger equation quickly becomes intractable with increasing system size. A popular alternative is to use pure molecular mechanics potential energy functions14,15 to allow extensive sampling of the potential energy surface.

Phys. Chem. Chem. Phys., 2013, 15, 18249--18261


Molecular mechanics operates within the philosophy of the Born–Oppenheimer approximation, which enables the representation of potential energy surfaces using classical mechanics. Bond-stretching, angle-bending and torsional rotations are treated with classical functions. The electrostatic interactions of the system are usually represented with a non-polarisable method. Typically, the electron distribution of a system is represented with point charges at the nuclear positions of the atoms that interact via an inverse dependence on internuclear distance. There are many ways of determining point charges, and they vary16 widely according to how the electron density is partitioned or how the charges are fitted to an electrostatic potential. However, the deficiencies of point charges have been pointed out many times in the literature (e.g. ref. 17 and 18) and are the subject of a systematic review.19 Fortunately, a more detailed representation of the electron density is available when using multipole moments. These multipole moments are described more compactly by spherical harmonics20 rather than by Cartesian tensors, and a collection of multipole moments is able to represent a real, three-dimensional atomic or molecular electron density.21–23 Finally, in a pragmatic top-down approach, models of monatomic ions can be fitted to gas phase properties, or empirical hydration free energies, or the enthalpy of the monohydrate.24–27 However, determining these physical properties for fitting is often not straightforward,26 and there is no guarantee that macroscopic thermodynamic properties can be properly simulated. Polarisation is important not just for the accurate simulation of water but also for that of ions, especially in the case of ion channels.
At surfaces and interfaces, the polarisation effect upon ions shows pronounced anisotropy, and is strongly related to the dynamic structure of water.2,28,29 Essentially there are three common approaches or classes of methods30 to include polarisation in molecular mechanics based simulations: Drude oscillators (Charge on Spring, Shell Model), fluctuating charges (FQ) and polarisable dipole moments (PD). Each of these approaches31 has been implemented in simulations of ions, though each method comes with associated problems.32 It is useful to highlight the main features of each method and its concomitant challenge or disadvantage. The Drude oscillator approach models polarisation by using two point charges tethered by a harmonic spring. One charge site is mobile and the other is fixed in position. The position of the mobile charge is influenced by the charges of all charged sites except for the site to which it is tethered. One of the challenges with this model is the determination of a self-consistent polarisation response to changes in the local chemical environment.26,33–35 Concerning the second type of polarisation treatment, FQ models rely on point charges that can change in magnitude in response to an external field. The charge flows between atoms until the electronegativity is equalised, which enables the modeling of charge transfer. However, one drawback is that charge transfer effects can take place over too large a distance, and result in too much charge being shifted. There are a number of modifications to counter these problems. A second drawback is that for some molecular
systems the appropriate out-of-plane polarisation cannot be simulated. Thirdly, in the case of monatomic ions, FQ models are never used. Finally, the third class of polarisation methods is that of induced dipole models. They rely on a dipole, which is fixed at a given site, with a given polarisability, and which responds to an external field. The dipoles may respond to other induced dipoles within the same molecule, and again there is the issue of determining self-consistent dipoles, similar to the Drude Oscillator models. In addition, a further complication is the polarisation catastrophe where the induced dipoles located very close to each other cause a polarisation response that leads to infinite solutions.36,37 A more modern class of methods uses machine learning (e.g. artificial neural networks38), which turns out to be a promising route by which to represent potential energy surfaces and electrostatic properties of atoms. A recent review39 portrays the situation up to 2010 but even more recent examples can be found in references.40–42 In this work, we build upon our past expertise with multipolar electrostatics and machine learning43–46 to create realistic polarisable electrostatic models for ions in water. The current approach focuses on the result of the polarisation process rather than on the process itself. This means that there is no explicit polarisability tensor being calculated. Instead, we construct a direct mapping between a given multipole moment of a given atom (output) and the nuclear coordinates (input) of the atoms surrounding this given atom. This is achieved by using a machine learning method called kriging,47 which has its origin in geostatistics. The intramolecular polarisation of two amino acids, alanine48 and histidine,49 has already been modeled by kriging, and here we present proof-of-concept for ions in aqueous solution, for the first time. 
The proposed method has a number of advantages, by avoiding some of the problems that the traditional methods discussed above introduce. First, the present method does not suffer from the polarisation catastrophe. This is due to two important attributes of the method: (i) the topological atoms featuring in the partitioning of the system have finite volumes, leaving no gaps between them and without overlapping each other and (ii) atomic multipole moments are well-defined even at short range. There is no polarisability tensor that causes a breakdown of the polarisation process at short range. Instead, kriging, which is essentially an interpolation method, correctly predicts the atomic multipole moments, provided it has been trained at short range, which is the case. As a pleasing by-product there is no need for damping functions, which artificially prevent the collapse of traditional polarisation models. Secondly, the proposed method altogether avoids on-the-fly iterations towards self-consistency during a molecular dynamics simulation. Instead, the method directly generates the multipole moments themselves after the process of polarisation, because the kriging method predicts the end result of the polarisation process rather than the process itself. Thirdly, there is no (electrostatic) penetration effect, which again would call for damping functions. This advantage is due to the non-overlapping nature of the topological atoms. Fourthly, the method is valid for any multipole moment (typically up to the hexadecupole moment) and hence it is more general than the methods above, which are confined to dipole moments. Fifthly and finally, charge transfer is treated completely on a par with dipolar (and higher rank) polarisation. A kriging model is constructed for the monopole moments (or net atomic charges) in exactly the same way as for the dipole moments. This universality in treatment is due to the training information being obtained from supermolecular ab initio calculations rather than through the long-range Rayleigh–Schrödinger perturbation formalism, which still dominates the thinking behind modeling of intermolecular interactions.

2. Methodology

2.1 General background

The description of the electrostatic interactions in a chemical system can be made more realistic by increasing the detail in the description of the system’s electrostatic properties. An improved description will necessarily result in an increase in demand on computing resources, however, when the increase in detail is carefully controlled, contemporary computing power can be used to appropriate advantage. From a more detailed electrostatic description50 will follow a more accurate calculation of the electrostatic interactions between atoms. The method for computation of atom–atom electrostatic interaction energies described in this communication differs in two ways from the typical force field approach. The usual method involves the employment of atom-centred point charges where the values of the charges are constant throughout a simulation. The first difference herein is the inclusion of atom-centred multipole moments (AMMs) in place of point charges in the description of the electronic distribution of a system.51–54 These AMMs are discussed in Section 2.2. The second difference is the inclusion of an accurate method for representing the polarisation of the electron density (and hence the dependence of the AMMs on the system geometry). Modelling of the latter effect is achieved via a set of machine learning models, each of which is able to relate the geometry of a chemical system (represented by a set of non-redundant internal coordinates) to the value of an AMM of a specific system atom. That is, one machine learning model is built for each AMM. The machine learning method employed in this investigation is termed kriging47 (also known as Gaussian process regression), and its nature and application are discussed in Section 2.3. The systems to which this method is applicable are limited only by CPU requirements, where the number of atoms and number of AMMs employed in the description of each atom are the limiting factors. 
Here we present an application to sodium cations solvated in cages of rigid water molecules. Significant charge transfer and dipolar polarisation effects are expected to exist in such systems. Capturing these phenomena will result in a more physically realistic description of the electrostatic interactions in the system compared to that obtained with point charge methods. The Na+(H2O)n system acts as a pilot for solvated ions and builds on previous work in which a similar method was applied to computing intramolecular polarisation only.


This system constitutes a chemically significant problem that can be analysed in detail. The applicability of the described electrostatic method to the solvated sodium cation systems can be quantitatively assessed before applications to larger systems are attempted.

2.2 Atomic multipole moments (AMM)

The atoms in this work are defined according to the Quantum Chemical Topology (QCT) method. At the heart of this approach is the idea that the concepts of dynamical systems (critical point, separatrix, basin, etc.) locally characterise a quantum mechanical function and partition it into subspaces, each endowed with numerical property values. Examples of this approach have been listed before55 and the name was first justified about ten years ago.53,56 When QCT acts on the electron density ρ(r), one obtains topological atoms, as defined by the quantum theory of atoms in molecules (QTAIM).57,58 These (topological) atoms are separated by two-dimensional separatrices called interatomic surfaces. QTAIM is rooted in quantum mechanics and uses atomic theorems59 to define atomic properties. The partitioning of a chemical system is achieved by analysis of the gradient vector field of ρ(r), and enables extraction of atomic properties from modern wavefunctions. Generally, an atomic property can be expressed as the integral of a point property over an atomic basin, i.e. the topological subspace that it occupies. Fig. 1 shows60,61 the atomic basin of the sodium ion inside a set of water clusters of increasing size. Note that the wavefunction extends to infinity in all directions and the electron density inherits this characteristic. Therefore, an atom extends to infinity in all directions except those that are bounded by an interatomic surface. For the purpose of visualisation and computation of properties, atoms are often capped by a constant electron density envelope. The magnitude of ρ(r) used for the envelope is application-dependent. In this work it has been set at 1 × 10⁻⁶ au for generation of data, and 1 × 10⁻³ au for creating atomic images (see Fig. 1).

The Coulombic interaction energy between two atoms (labelled A and B) can be written62 in terms of the partitioned total charge density ρtot(r) of the system, i.e. the sum of the nuclear density and the electron density ρ(r), as in eqn (1),

$$E_{AB}^{\mathrm{Coulomb}} = \int_{\Omega_A} \mathrm{d}\mathbf{r}_1 \int_{\Omega_B} \mathrm{d}\mathbf{r}_2 \, \frac{\rho_{\mathrm{tot}}(\mathbf{r}_1)\, \rho_{\mathrm{tot}}(\mathbf{r}_2)}{r_{12}} \tag{1}$$

where $\Omega_A$ and $\Omega_B$ refer to the separate finite-volume regions ascribed to atoms A and B, and $r_{12}$ is the distance between an infinitesimal element of charge in $\Omega_A$ and one in $\Omega_B$. This equation expresses the electrostatic interaction between two atoms in a more realistic and complete way than the familiar point charge × point charge electrostatic interaction term in force fields. Eqn (1) cannot be used directly in a force field because its computation is very expensive and inefficient: the interaction energy has to be recalculated for each mutual orientation in which the two interacting atoms find themselves. Fortunately, it is possible to separate the electronic coordinates $\mathbf{r}_1$ and $\mathbf{r}_2$, which are entangled in the denominator $r_{12}$.

This is achieved by carrying out an expansion of $1/r_{12}$ in terms of spherical harmonics.63 Eqn (1) can therefore be re-expressed as

$$E_{AB}^{\mathrm{Coulomb}} = \sum_{\ell_A=0}^{\infty} \sum_{\ell_B=0}^{\infty} \sum_{m_A=-\ell_A}^{\ell_A} \sum_{m_B=-\ell_B}^{\ell_B} T_{\ell_A m_A \ell_B m_B} \int_{\Omega_A} \mathrm{d}\mathbf{r}_1 \, \rho_{\mathrm{tot}}(\mathbf{r}_1) R_{\ell_A m_A}(\mathbf{r}_1) \int_{\Omega_B} \mathrm{d}\mathbf{r}_2 \, \rho_{\mathrm{tot}}(\mathbf{r}_2) R_{\ell_B m_B}(\mathbf{r}_2) \tag{2}$$

where $R_{\ell m}$ are regular spherical harmonics,20 which can be complex, in general. The interaction tensor T is a purely geometric object whose scalar components are analytical functions of the mutual orientation of local axis systems centered on nuclei A and B, as well as the internuclear vector.64,65 The dependence on the mutual orientation must be included because the atomic multipole moments are expressed in local axis systems rather than a single global axis system. The infinite summations (over $\ell_A$ and $\ell_B$) must be truncated to allow practical use of eqn (2). The appropriate number of expansion terms is decided by balancing the higher computational cost of their inclusion (where the cost is directly proportional to the number of terms in eqn (2)) against how rapidly the interaction energy converges. The extent of the expansion is monitored by a value termed the maximum rank L, which is defined as

$$L = \ell_A + \ell_B + 1 \tag{3}$$

Any pair of multipole moments for which L surpasses a preset number is not included. The quantities $\ell_A$ and $\ell_B$ take non-negative integer values, and are referred to as the ranks of the spherical harmonics $R_{\ell m}$, while m denotes a particular component of a spherical harmonic of a particular rank $\ell$. Each value of $\ell$ has $2\ell + 1$ such components, indexed from $-\ell$ to $+\ell$. For example, the quadrupole moment has $\ell = 2$, and thus has 5 components, and m can take the value −2, −1, 0, +1 or +2. In this communication a value of 5 is employed for L, which has been shown to give good convergence behaviour in previous work.66,67 On a related note it is worth pointing out that due to its finite nature a topological atom has a finite convergence radius, which ensures perfect formal convergence68 of the electrostatic potential that it generates, outside its own divergence/convergence sphere. From eqn (2) we can define an AMM as the integral of the total charge density (multiplied by a spherical harmonic) over an atomic basin,

$$Q_{\ell m} = \int_{\Omega} \mathrm{d}\mathbf{r} \, \rho_{\mathrm{tot}}(\mathbf{r}) R_{\ell m}(\mathbf{r}) \tag{4}$$

Fig. 1 Three-dimensional images of the atomic basin of the Na+ atom for systems containing a single ion and n (n = 1 to 6) nearest neighbour water molecules. The bond critical points are marked by smaller purple spheres and the atomic interaction lines by dashed curves.

We work with real spherical harmonics in this application, and as such eqn (4) must be modified,20 necessitating a small change in notation for m, which we do not specify here because it detracts from the main flow. Substitution of eqn (4) into eqn (2) allows us to write

$$E_{AB}^{\mathrm{Coulomb}} = \sum_{\ell_A=0}^{\infty} \sum_{\ell_B=0}^{\infty} \sum_{m_A=-\ell_A}^{\ell_A} \sum_{m_B=-\ell_B}^{\ell_B} Q_{\ell_A m_A} \, T_{\ell_A m_A \ell_B m_B} \, Q_{\ell_B m_B} \tag{5}$$

This equation makes clear that the AMMs can be evaluated separately, once and for all, independently from their mutual orientation. In other words, the computationally expensive six-dimensional integration of eqn (1) has been replaced by two more manageable three-dimensional integrals, of the type appearing in eqn (4). If the charge density is calculated for an optimised geometry of a molecule then all its corresponding AMMs can be calculated. These multipole moments Qlm can then be used to calculate the electrostatic interaction energy between this molecule and another molecule for which the AMMs have also been calculated. This approach has been applied successfully both in the context of potential energy surface exploration and molecular dynamics simulation52,69–72 with the constraint that the internal geometry of the interacting molecules did not change. This is the rigid body approach, which is also utilised in this investigation, i.e. the assumption has been made that geometries of individual water molecules do not change during a simulation. This is equivalent to stating that any intramolecular polarisation effect in the water molecules will be ignored. This decision allows for fast generation of system configuration data and also allows for a smaller number of internal coordinates than would be needed in the flexible case. It should be noted that the general method described in this communication is also applicable in the context of a system of flexible molecules. Previous applications to ethanol73 and double capped alanine48 have been reported for the flexible case.
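The truncation rule of eqn (3) and the 2l + 1 component count can be made concrete with a short enumeration. The following is a minimal sketch in Python, not the authors' code: the function names `components`, `rank_pairs` and `n_terms` are hypothetical, and the sketch only counts the terms that survive in eqn (5); it does not evaluate the interaction tensor T.

```python
def components(rank):
    """Component indices m = -l, ..., +l of a multipole moment of rank l."""
    return list(range(-rank, rank + 1))


def rank_pairs(max_rank):
    """All (l_A, l_B) pairs kept when L = l_A + l_B + 1 must not exceed max_rank."""
    return [(la, lb)
            for la in range(max_rank)
            for lb in range(max_rank)
            if la + lb + 1 <= max_rank]


def n_terms(max_rank):
    """Number of Q * T * Q products left in the truncated sum of eqn (5)."""
    return sum(len(components(la)) * len(components(lb))
               for la, lb in rank_pairs(max_rank))


if __name__ == "__main__":
    L_MAX = 5  # value of L employed in the paper
    print("rank pairs kept:", len(rank_pairs(L_MAX)))
    print("terms per atom pair:", n_terms(L_MAX))
    # Moments of rank 0 (monopole) to 4 (hexadecapole) on a single atom.
    print("moments per atom:", sum(len(components(l)) for l in range(L_MAX)))
```

Under these assumptions, L = 5 keeps ranks 0 to 4 on each atom, which appears consistent with the 25 AMM values per ion quoted in Section 2.3.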


The rigid body approach makes the additional approximation that the AMMs do not change with the overall system geometry (i.e. the relative position and orientation of the rigid molecules). Typically, the multipole moments of the optimised geometry are employed. In reality, the AMMs are dependent on system conformation (via the dependence of ρtot, as shown in eqn (4)), and capturing this effect should lead to a more accurate description of the system. This phenomenon is the intermolecular polarisation and modelling this effect accurately is the subject of this communication. The application of kriging to the problem of intermolecular polarisation will be described below. The purpose of the kriging models is to predict AMMs for previously unseen system geometries. In this case we model the relationship between the positions and orientations of solvating water molecules on one hand (input) and the AMMs of the solvated ion on the other hand (output), where both charge transfer and polarisation of the ion are captured by the same method.

2.3 Kriging

Kriging, also known as Gaussian process regression,47 is a regression method whose use in chemistry is uncommon, save for a few exceptions, for example in the context74,75 of QSAR/QSPR. A kriging model predicts how a response variable y varies over a multi-dimensional ‘‘feature’’ space, based on a set of training data. In this paper, the feature space comprises the d geometrical features needed to describe the orientation of rigid water molecules around a central Na+ ion, and the response variable y is one of 25 AMM values for the Na+ ion: see Section 2.4 for further details. In this section, we summarise the main facets of kriging; the reader is directed to ref. 76 and the ESI of ref. 45 for a more detailed introduction. Using kriging, the response y(x) of an unevaluated feature vector $\mathbf{x} = [x_1 \; x_2 \; \cdots \; x_d]^{t}$ is modeled as the sum of a global term (a weighted sum of basis functions $f_k$, which are usually polynomial terms in x) and a localised ‘‘error’’ term z, according to the following equation:

$$y(\mathbf{x}) = \sum_{k=1}^{m} \beta_k f_k(\mathbf{x}) + z(\mathbf{x}) \tag{6}$$

It is actually the error term z that is the main part of the kriging prediction formula; it is so effective that the global term is often simplified to a constant, as is done in this paper. This is known as simple kriging, and is equivalent to setting m = 1 and $f_1(\mathbf{x}) = 1$ in eqn (6). The global term may alternatively be omitted altogether. The ‘‘error’’ term z is modeled as a random variable over the feature space, that is, as a random process. More precisely, z is modeled as a Gaussian process with zero mean and variance $\sigma^2$. To understand the properties required of z, consider two Na+(H2O)n configurations described by the feature vectors $\mathbf{x}^i$ and $\mathbf{x}^j$. Intuitively, as the configurations become more similar (i.e. as the distance between the two feature vectors $\mathbf{x}^i$ and $\mathbf{x}^j$ approaches zero), the values of z at $\mathbf{x}^i$ and $\mathbf{x}^j$, $z(\mathbf{x}^i)$ and $z(\mathbf{x}^j)$, should approach each other. In other words, $z(\mathbf{x}^i)$ and $z(\mathbf{x}^j)$ are correlated. The extent to which $z(\mathbf{x}^i)$ and $z(\mathbf{x}^j)$ are correlated depends on the distance between $\mathbf{x}^i$ and $\mathbf{x}^j$. As $\mathbf{x}^i$ tends to $\mathbf{x}^j$, the correlation tends to 1; and when $\mathbf{x}^i = \mathbf{x}^j$, the correlation is equal to 1. In contrast, as the distance between $\mathbf{x}^i$ and $\mathbf{x}^j$ tends to infinity, the correlation should tend to 0. There are several ways in which to parameterise such a correlation between $z(\mathbf{x}^i)$ and $z(\mathbf{x}^j)$. The most common approach is:

$$\mathrm{Corr}\left[z(\mathbf{x}^i), z(\mathbf{x}^j)\right] = R_{ij} = \exp\left(-\sum_{h=1}^{d} \theta_h \left|x_h^i - x_h^j\right|^{p_h}\right) \tag{7}$$

where $\theta_h \ge 0$, $1 \le p_h \le 2$, $R_{ij}$ is an element of the correlation matrix R (aka Gram matrix), and d is the number of features (or the dimensionality of the feature space). Therefore, overall the kriging model is parameterised by (i) a vector $\boldsymbol{\beta} = [\beta_1 \; \beta_2 \; \cdots \; \beta_m]^{t}$ that determines the global term in eqn (6), (ii) the variance $\sigma^2$ of the Gaussian process z, and (iii) two vectors $\boldsymbol{\theta} = [\theta_1 \; \theta_2 \; \cdots \; \theta_d]^{t}$ and $\mathbf{p} = [p_1 \; p_2 \; \cdots \; p_d]^{t}$, which describe how the Gaussian process values are correlated over feature space. The vector $\boldsymbol{\beta}$ is determined by fitting to the data using the generalised least-squares method,

$$\boldsymbol{\beta} = (\mathbf{F}^{T}\mathbf{R}^{-1}\mathbf{F})^{-1}\, \mathbf{F}^{T}\mathbf{R}^{-1}\mathbf{y} \tag{8}$$

where y is an n × 1 vector containing the n observed data, F is the n × m matrix whose (i, j)th entry is $f_j(\mathbf{x}^i)$, and R is the correlation matrix whose (i, j)th entry is the correlation between the i-th and j-th entries in the training set, given by eqn (7). Note that in this work, as m = 1 and $f_1(\mathbf{x}) = 1$, F becomes a vector filled with 1's, and therefore $\beta$ is a scalar, rather than a vector. The values of $\sigma^2$, $\boldsymbol{\theta}$ and $\mathbf{p}$ are found by maximising the likelihood L of the observations,

$$L\left(\boldsymbol{\theta}, \mathbf{p}, \sigma^2 \,\middle|\, y_i;\ i = 1, 2, \ldots, n\right) = \frac{1}{(2\pi)^{n/2} (\sigma^2)^{n/2} |\mathbf{R}|^{1/2}} \exp\left[-\frac{(\mathbf{y} - \mathbf{F}\boldsymbol{\beta})^{T} \mathbf{R}^{-1} (\mathbf{y} - \mathbf{F}\boldsymbol{\beta})}{2\sigma^2}\right] \tag{9}$$

In practice, it is usually the logarithm of L that is maximised. Note that by setting $\partial \log L / \partial(\sigma^2) = 0$ and solving this equation, the optimal value of $\sigma^2$ may be written in terms of $\boldsymbol{\theta}$ and $\mathbf{p}$. Therefore eqn (9) only needs to be optimised over $\boldsymbol{\theta}$ and $\mathbf{p}$. In this work, a Particle Swarm Optimization (PSO) algorithm77 was used to globally optimize eqn (9) over $\boldsymbol{\theta}$ and $\mathbf{p}$. The optimal values found by the PSO were then used to initialize the Nelder–Mead optimization algorithm,78 which was used to further optimize $\boldsymbol{\theta}$ and $\mathbf{p}$ locally. With the kriging parameters determined, it can be shown76 that the kriging prediction formula (see eqn (6)) may be written as

$$y(\mathbf{x}) = \sum_{i=1}^{m} \beta_i f_i(\mathbf{x}) + \sum_{i=1}^{n} b_i \exp\left(-\sum_{h=1}^{d} \theta_h \left|x_h - x_h^i\right|^{p_h}\right) \tag{10}$$

where $b_i$ is the i-th element of $\mathbf{R}^{-1}(\mathbf{y} - \mathbf{F}\boldsymbol{\beta})$. With the kriging parameters determined for a particular AMM, this equation can be used to predict the value, y, of that AMM for any configuration x of water molecules around the Na+ ion.
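The chain from eqn (7) to eqn (10) can be followed in a few lines of code. The sketch below is an illustration under stated assumptions, not the authors' implementation: it uses plain Python with a hand-written Cholesky solver, takes θ and p as given instead of optimising them with PSO and Nelder–Mead, assumes the constant global term (m = 1, f1 = 1) used in the paper, and all function names and the one-dimensional toy data are hypothetical.

```python
import math


def correlation(xa, xb, theta, p):
    """Eqn (7): correlation of the error term at two feature vectors."""
    return math.exp(-sum(t * abs(a - b) ** q
                         for a, b, t, q in zip(xa, xb, theta, p)))


def cholesky(A):
    """Lower-triangular factor C of a symmetric positive-definite A = C C^T."""
    n = len(A)
    C = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = A[i][j] - sum(C[i][k] * C[j][k] for k in range(j))
            C[i][j] = math.sqrt(s) if i == j else s / C[j][j]
    return C


def chol_solve(C, rhs):
    """Solve (C C^T) x = rhs by forward then backward substitution."""
    n = len(rhs)
    z = [0.0] * n
    for i in range(n):
        z[i] = (rhs[i] - sum(C[i][k] * z[k] for k in range(i))) / C[i][i]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = (z[i] - sum(C[k][i] * x[k] for k in range(i + 1, n))) / C[i][i]
    return x


def fit(X, y, theta, p):
    """Return beta (eqn (8) with F = 1), weights b, sigma^2 and log-likelihood."""
    n = len(X)
    R = [[correlation(X[i], X[j], theta, p) for j in range(n)] for i in range(n)]
    C = cholesky(R)
    # With F a vector of ones, eqn (8) reduces to a ratio of two sums.
    beta = sum(chol_solve(C, y)) / sum(chol_solve(C, [1.0] * n))
    resid = [yi - beta for yi in y]
    b = chol_solve(C, resid)                            # R^-1 (y - F beta)
    sigma2 = sum(r * w for r, w in zip(resid, b)) / n   # from d(log L)/d(sigma^2) = 0
    log_det_R = 2.0 * sum(math.log(C[i][i]) for i in range(n))
    log_lik = -0.5 * (n * math.log(2.0 * math.pi * sigma2) + log_det_R + n)
    return beta, b, sigma2, log_lik


def predict(x, X, beta, b, theta, p):
    """Eqn (10) with a constant global term."""
    return beta + sum(bi * correlation(x, xi, theta, p) for bi, xi in zip(b, X))


# Hypothetical one-dimensional toy data standing in for (geometry, AMM) pairs.
X_train = [[0.0], [1.0], [2.5]]
y_train = [1.0, 3.0, 2.0]
theta, p = [1.0], [2.0]
beta, b, sigma2, log_lik = fit(X_train, y_train, theta, p)
```

In this sketch the predictor reproduces the training values exactly, which is the interpolating behaviour the text relies on, and far from all training points the prediction falls back to the constant β. In a full treatment, `log_lik` would be the objective maximised over θ and p.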

2.4 Internal coordinate representation

The kriging method is expected to be optimally predictive when the input is expressed as compactly as possible, giving the lowest dimensional feature space for the method to work in. The systems studied in this communication consist of a single atomic ion surrounded by a number of rigid water molecules. The minimum number (i.e. with no redundant information) of internal coordinate features, N, required to describe such a rigid water solvated ion system (see the ESI†) is d = 6(Nwater  1) + 3Nion

(11)

where Nwater is the number of (rigid) water molecules in the system and Nion is the number of free atoms (here ions; Nion = 1 in all cases studied herein). In the described application, the dimensionality of the feature space is directly proportional to the number of ions and water molecules that constitute the system. In order to discover how well the kriging method is able to cope with higher dimensional systems we investigated a series of clusters of the type Na+(H2O)n, where n ranges from 1 to 6. The performance of kriging with increasing dimensionality was assessed. Each additional water molecule (after the first) introduces a further 6 coordinates to the geometric description of the system. The value of d ranges from 3 (Nwater = 1) to 33 (Nwater = 6). The particular internal representation employed in this work is described briefly here and its derivation is given in greater detail in the ESI.† The origin of the axis system is placed on the ion itself. The first coordinate is the distance from the ion to the oxygen atom of its nearest neighbour water molecule and is labelled R. Only one value is required, because the x-axis of the local frame is chosen to lie along the line between the ion and the oxygen atom of the nearest neighbour water molecule. The second and third coordinates describe the orientation of the nearest neighbour water molecule and are labelled w1 and w2. Three coordinates are required to specify the orientation of a rigid molecule. However, the xy-plane of the local frame is defined to contain the lowest numbered hydrogen atom of the nearest neighbour water molecule. This introduces a restriction on the orientations that the water molecule may take. Specific details are given in the ESI.† For all water molecules other than the ion’s nearest neighbour, 6 coordinates are required. 
Here we employ three spherical polar coordinates (r, θ, φ) that define the position of the oxygen atom, along with three Euler angles (α, β, γ) that describe the orientation of the rigid molecule and thus the positions of its hydrogen atoms. The resulting coordinate description is displayed in Fig. 2 for the n = 2 system. This set of coordinates can be evaluated for all geometries in a training set, and the resulting geometric descriptions form the input for the building of kriging models. The generation of appropriate system configurations and computation of their AMMs is described in the following section.
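The feature count of eqn (11) and the position part of the coordinate description can be illustrated with a minimal sketch. The function names are hypothetical, and the angle convention (polar angle measured from the z-axis, azimuth in the xy-plane) is an assumption; the convention actually used is defined in the ESI.

```python
import math


def n_features(n_water, n_ion=1):
    """Number of non-redundant internal coordinates according to eqn (11)."""
    if n_water < 1:
        raise ValueError("eqn (11) assumes at least one water molecule")
    return 6 * (n_water - 1) + 3 * n_ion


def spherical_polar(x, y, z):
    """Convert an oxygen position (ion at the origin) to (r, theta, phi).

    Assumed convention: theta is measured from the z-axis and phi is the
    azimuthal angle in the xy-plane.
    """
    r = math.sqrt(x * x + y * y + z * z)
    if r == 0.0:
        raise ValueError("position coincides with the origin")
    theta = math.acos(z / r)
    phi = math.atan2(y, x)
    return r, theta, phi


# Dimensionality of the feature space for Na+(H2O)n, n = 1..6
dims = {n: n_features(n) for n in range(1, 7)}
```

Under these assumptions `dims` reproduces the range quoted in the text, from 3 features for n = 1 to 33 features for n = 6.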

Phys. Chem. Chem. Phys., 2013, 15, 18249--18261

Fig. 2 Diagrammatic representation of the internal coordinates used to describe a particular Na+(H2O)n system. The non-nearest neighbour molecule (purple) shows the internal coordinates for all water molecules other than the nearest neighbour (blue). The position vectors and coordinates are shown in black, while orientational coordinates are given in the appropriate colours.

2.5 Data generation

In order to model the dependence of the AMMs of an ion on the positions and orientations of its n neighbouring water molecules, a selection of system geometries is required. These geometries must be representative of the variety that will be encountered in aqueous solution. We assume that the sampling afforded by a classical Molecular Dynamics (MD) simulation represents the true ensemble closely enough to be useful for our purpose. Therefore, an MD simulation was carried out to generate the cluster geometries and appropriately sized clusters were extracted. All molecules were treated as rigid bodies in the simulation and polarisation was not included in the evaluation of the electrostatic interaction energies. The MD program DL_POLY79 was run with a timestep of 0.5 fs, using periodic boundary conditions and standard Ewald summation. The cubic simulation box had a side length of 30 Å and contained 27 pairs of Na+ and Cl− ions and 486 water molecules. The water molecules were represented by a rigid, non-polarisable simple point charge (SPC/E) model,80 which prescribes −0.8476e for O and +0.4238e for H. The Na+ and Cl− ions were simply represented by point charges of +1e and −1e, respectively. All atoms except hydrogen had their interactions governed by the Lennard-Jones potential whose parameters σ and ε were set to 2.876 Å and 0.5216 kJ mol⁻¹ respectively for Na+, to 3.785 Å and 0.5216 kJ mol⁻¹ respectively for Cl−, and to 3.169 Å and 0.6501 kJ mol⁻¹ respectively for O. The Lorentz–Berthelot mixing rules were used in the simulation to calculate the LJ parameters for interactions between different atoms or ions. First, an adiabatic simulation was run for 150 ps (hence 300 000 time steps of 0.5 fs) as an NVE ensemble. This was then followed by a 50 ps isothermal–isobaric (NpT) run (i.e. 100 000 time steps), always at a pressure of 1 atm and at 298 K. This second simulation used Berendsen’s thermostat and Hoover’s barostat.
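As an illustration of the Lorentz–Berthelot mixing rules mentioned above (arithmetic mean of σ, geometric mean of ε), the following sketch combines the Lennard-Jones parameters quoted in the text and checks the step counts of the two runs. The dictionary and function names are hypothetical and are not part of DL_POLY.

```python
import math

# Lennard-Jones parameters quoted in the text: (sigma / Å, epsilon / kJ mol^-1)
LJ = {
    "Na": (2.876, 0.5216),
    "Cl": (3.785, 0.5216),
    "O": (3.169, 0.6501),
}


def lorentz_berthelot(a, b):
    """Return (sigma_ab, epsilon_ab) for the unlike pair a-b."""
    sigma_a, eps_a = LJ[a]
    sigma_b, eps_b = LJ[b]
    sigma_ab = 0.5 * (sigma_a + sigma_b)   # Lorentz rule: arithmetic mean
    eps_ab = math.sqrt(eps_a * eps_b)      # Berthelot rule: geometric mean
    return sigma_ab, eps_ab


def n_steps(duration_ps, timestep_fs=0.5):
    """Number of MD steps needed to cover duration_ps at the given timestep."""
    return round(duration_ps * 1000.0 / timestep_fs)


sigma_na_o, eps_na_o = lorentz_berthelot("Na", "O")
```

With the 0.5 fs timestep, `n_steps(150)` and `n_steps(50)` give the 300 000 and 100 000 steps stated for the NVE and NpT runs.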
Periodic boundary conditions were applied in both simulations. From the second run, clusters consisting of an ion and its n nearest neighbour water molecules were sampled every 100 steps via an in-house FORTRAN90 computer program. This program selected water molecules for extraction based on their distance from the central ion. Each of the 27 ions of the
system was investigated for each system snapshot. If a given ion’s n nearest neighbours were water molecules (rather than other ions) then that cluster was saved for later use. The water–ion distance was taken as the ion–oxygen distance in all cases. Periodic boundaries were taken into account in the evaluation of the interatomic distances.

Inclusion of increasingly higher numbers of nearest neighbours affects the QCT partitioning of the system and thus affects the shape and AMMs of the central ion. Fig. 1 illustrates this for all six Na+(H2O)n systems. The GUI60,61 of the program MORPHY81,82 was employed in generating the atomic basin images. For the case n = 0, the ion is spherical and as such is not shown in this figure. Introduction of a single neighbouring water molecule results in the generation of an interatomic surface between the ion and an atom of the water molecule. A critical point (purple) can be seen at the intersection of the atomic interaction line and the interatomic surface. Addition of a second nearest neighbour forms a second interatomic surface with the central ion, and results in a shape that deviates significantly from the spherical. Introduction of more water molecules further increases the complexity of the shape of the ion’s boundaries. When six water molecules are included, the sodium ion is almost entirely bounded by interatomic surfaces.

Extraction of solvated clusters from the simulation snapshots introduces an approximation. In extracting an ion and its n nearest neighbours, the effect of the remainder of the entire simulation system on the ion AMMs is neglected. The role of the remaining system atoms in determining the particular AMMs of the central ion can be investigated by plotting a particular AMM value against n. Of the 25 AMMs modelled, the charge is dominant in the electrostatic interaction. Therefore, to investigate the error arising from neglecting the remaining water molecules we look at this value in particular.
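The cluster extraction described above (ion–oxygen distances under periodic boundaries, keeping a cluster only if its n nearest neighbours are all water molecules) can be sketched as follows. This is not the in-house FORTRAN90 program; it is a minimal NumPy illustration with hypothetical function names, assuming a cubic box and the minimum-image convention.

```python
import numpy as np


def minimum_image(delta, box):
    """Map displacement vectors onto their nearest periodic image (cubic box)."""
    return delta - box * np.round(delta / box)


def nearest_waters(ion_pos, oxygen_pos, box, n):
    """Indices and ion-oxygen distances of the n water molecules closest to the ion."""
    delta = minimum_image(oxygen_pos - ion_pos, box)
    dist = np.linalg.norm(delta, axis=1)
    order = np.argsort(dist)[:n]
    return order, dist[order]


def neighbours_are_all_water(ion_pos, oxygen_pos, other_ion_pos, box, n):
    """True if no other ion lies closer than the n-th nearest water oxygen."""
    _, d_water = nearest_waters(ion_pos, oxygen_pos, box, n)
    if len(other_ion_pos) == 0:
        return True
    d_ion = np.linalg.norm(minimum_image(other_ion_pos - ion_pos, box), axis=1)
    return bool(d_ion.min() > d_water[-1])


# Toy snapshot in a 30 Å box (coordinates are illustrative, not simulation data)
box = 30.0
ion = np.array([1.0, 1.0, 1.0])
oxygens = np.array([[29.0, 1.0, 1.0], [5.0, 1.0, 1.0], [1.0, 15.0, 1.0]])
other_ions = np.array([[1.0, 1.0, 4.0]])
idx, dist = nearest_waters(ion, oxygens, box, 2)
```

In the toy snapshot the first oxygen is only 2 Å away once the periodic image is taken into account, and the n = 2 cluster would be rejected because another ion sits closer than the second water molecule.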
The dependence of the charge of a particular Na+ ion on the number of neighbouring water molecules is given in Fig. 3. To produce this plot, 1000 geometries (extracted from the simulation as described above) were analysed for each value of n from 1 to 8. Computational expense prohibited computing further points on this plot, as evaluation of wavefunctions and AMMs for higher values of n was not feasible with the available resources. The mean charge value for a given n was evaluated over the 1000 geometries, as well as the minimum and maximum. The resulting figure shows that the mean, minimum and maximum values form a plateau with increasing n. By n = 6 the mean appears to have converged with respect to the addition of more water molecules. Thus for the current pilot investigation we restrict the study to values of n from 1 to 6.

2.6 Computational details

Fig. 3 Dependence of the mean, minimum and maximum values of the zeroth atomic multipole moment, Q00, of Na+ on the number of neighbouring water molecules (n) present in the Na+(H2O)n system. A set of 1000 representative geometries was employed for each n.

The wavefunction must be evaluated for each extracted ion–water cluster. These were obtained at the B3LYP/6-311+G(2d,2p) level of theory via the program GAUSSIAN03.83 The current approach does not introduce any BSSE because this only arises when energies of supermolecular and monomeric wavefunctions are subtracted (and small basis sets are used). There is no such reference to monomeric wavefunctions in the current approach. For each wavefunction, the QCT partitioning must be carried out, followed by the computation of the necessary AMMs of each system atom by integration over the resulting atomic basins, within an acceptable error.84 These calculations were performed using the programs MORPHY and AIMALL.85 Kriging models were built with the in-house software EREBUS for each AMM (up to hexadecapole, 25 per atom) in order to allow prediction of the complete L = 5 interaction energies between the ion and surrounding water molecules. For each cluster size (n = 1, 2, ..., 6) it is necessary to build 25 kriging models, resulting in a total of 6 × 25 = 150 models. Each model was built using 800 Na+(H2O)n configurations, termed training examples. A sequential selection method was employed in the model building.86 The first model for each AMM was built with 100 examples. The training set size was then increased in steps of 50 up to the chosen maximum of 500. The 150 optimal models (determined by prediction of the AMM values of the training set) were validated by comparing their predictions of AMMs and interatomic electrostatic interactions to exact results as described in Section 3.
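The bookkeeping in this paragraph can be checked with a short sketch: a rank-l multipole moment has 2l + 1 components, so moments up to hexadecapole (l = 4) give 25 components per atom. The schedule below uses the start, step and maximum quoted in the text; `sequential_selection` and its `fit_and_score` argument are hypothetical stand-ins for the model building done in EREBUS, whose interface is not described here.

```python
def n_moments(max_rank):
    """Number of multipole components up to and including rank max_rank."""
    return sum(2 * l + 1 for l in range(max_rank + 1))


cluster_sizes = range(1, 7)                        # n = 1, 2, ..., 6
n_models = len(cluster_sizes) * n_moments(4)       # one kriging model per AMM and n

# Training set sizes visited by the sequential scheme: 100, 150, ..., 500
schedule = list(range(100, 501, 50))


def sequential_selection(sizes, fit_and_score):
    """Return (size, score) of the best model over the schedule.

    fit_and_score is a user-supplied (hypothetical) callable that builds a
    model with the given number of training examples and returns an error
    measure, lower being better.
    """
    return min(((s, fit_and_score(s)) for s in sizes), key=lambda pair: pair[1])
```

For example, `sequential_selection(schedule, my_scorer)` would return the training set size whose model has the lowest error according to `my_scorer`.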

3. Results and discussion

The accuracy of the models for the central ion AMMs is assessed in two ways. Initially (and most straightforwardly) it is possible to investigate the ability of each kriging model to predict the correct AMMs of the central ion for a system with n neighbouring water molecules. In order to rigorously test the optimal kriging models, an external test set is required. This must be produced in the same manner as the training set data, starting at a later point in the simulation than that at which the final training configuration was sampled, to ensure that the extracted geometries are different, i.e. to ensure that there is no overlap between training and test data. One thousand such geometries were extracted as described previously from the same simulation used to generate the training set. The wavefunctions were evaluated for each system geometry and the AMMs were computed using the programs described above. The kriging models’ prediction accuracy can be measured by a variety of different quantities. The absolute error in prediction of a particular AMM is employed here, along with the average and maximum values of this quantity over the employed test set. Denoting the predicted value of an AMM of the central ion for any cluster size as $\bar{Q}_{\ell m}$, the absolute prediction error is given by eqn (12),

$$|\Delta Q_{\ell m}| = |Q_{\ell m} - \bar{Q}_{\ell m}| \qquad (12)$$

The combination of the mean and maximum of this quantity over a varied test set allows both a global and specific evaluation of the accuracy of the final AMM models. Fig. 4 shows the dependence of the average accuracy for each of the five AMM magnitudes, $|Q_\ell|$ (charge, dipole, quadrupole, octupole and hexadecapole moment, where $|Q_\ell| = \sqrt{\sum_{m=-\ell}^{\ell} Q_{\ell m}^2}$), on the number of nearest neighbour water molecules. This figure superimposes the respective profiles, each having its own physical dimension in the corresponding atomic unit. In spite of the different units it still makes sense to draw the following two conclusions. First we notice that the prediction error monotonically increases with an increasing number of water molecules for the kriging models associated with each of the five types of moment. One can divide the reasons for degradation of the kriging with increasing system size into two coupled effects. The first is that the distribution of multipole moments is more varied with an increasing number of neighbours and hence interatomic surfaces (see Fig. 1). The second reason is that the feature space itself is bigger, i.e. the multipole moment depends on more coordinates. Secondly, we notice that the higher rank AMMs are more affected by the number of water molecules than the lower rank ones. The increased sensitivity of the higher rank AMMs makes sense on inspecting eqn (4), which defines an AMM. Indeed, higher rank tensors amplify any change in the electron density through their higher order powers of x, y and/or z.

Fig. 4 Dependence of the average absolute error in atomic multipole moment magnitude prediction for each of the five studied multipole moments (monopole, dipole, quadrupole, octupole and hexadecapole moment) of Na+ on the number of nearest neighbour water molecules, n.

The average absolute error values for all 25 individual AMMs are tabulated in Table S1 (ESI†), and the corresponding maximum absolute errors are tabulated in Table S2 (ESI†). These tables provide more detail than Fig. 4, and show that for an AMM of a given rank l, the average and maximum prediction errors tend to be of a similar magnitude for all values of m for all cluster sizes, n. Analysing the quality of prediction of the AMMs themselves is valuable but the ultimate test of the proposed method is the evaluation of the accuracy of atom–atom electrostatic interaction energies computed with predicted AMMs. This accuracy can be tested by comparison of the energy evaluated from the true AMMs (i.e. those obtained by integration over atomic basins) to the energy from the AMMs predicted by the kriging models. Marking the latter, predicted energy with a bar, the energy error (with E given by eqn (5)) is then simply defined as,

$$\Delta E_{AB} = E_{AB} - \bar{E}_{AB} \qquad (13)$$
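The error measures used in this section, namely the absolute error of eqn (12), its mean and maximum over a test set, and the rank-wise magnitude obtained by summing squared components over m, can be sketched with NumPy as follows. The flat ordering of the 25 components (rank 0 first, then the three rank-1 components, and so on) is an assumption made for illustration, as are the function names.

```python
import numpy as np


def rank_slices(max_rank=4):
    """Slices selecting the 2l+1 components of each rank in a flat moment vector."""
    slices, start = [], 0
    for l in range(max_rank + 1):
        slices.append(slice(start, start + 2 * l + 1))
        start += 2 * l + 1
    return slices


def magnitudes(q, max_rank=4):
    """|Q_l| = sqrt(sum over m of Q_lm^2) for l = 0..max_rank.

    q has shape (..., 25) for max_rank = 4; the result has shape (..., 5).
    """
    q = np.asarray(q, dtype=float)
    return np.stack(
        [np.sqrt((q[..., s] ** 2).sum(axis=-1)) for s in rank_slices(max_rank)],
        axis=-1,
    )


def abs_error_stats(q_true, q_pred):
    """Mean and maximum of |Q - Q_pred| over the test set (axis 0)."""
    err = np.abs(np.asarray(q_true, dtype=float) - np.asarray(q_pred, dtype=float))
    return err.mean(axis=0), err.max(axis=0)


# Toy moment vector: charge 1, dipole components (3, 0, 4), higher ranks zero
q_demo = np.zeros(25)
q_demo[0] = 1.0
q_demo[1:4] = [3.0, 0.0, 4.0]
mag_demo = magnitudes(q_demo)
```

For the toy vector the dipole magnitude is 5, the charge magnitude is 1, and all higher-rank magnitudes vanish.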

For the Na+(H2O)n systems studied here, we have only built models for the AMMs of the central ion, as a proof-of-concept. Therefore, the error in energy predictions will arise only from the error in prediction of the AMMs of those atoms. For the energy evaluation we are only interested in interactions between pairs of atoms that are not part of the same molecule. This is a result of the employment of the rigid body formalism, as there will be no forces on the atoms of a given molecule due to interaction with other atoms in the same molecule. Due to the lack of kriging models for the AMMs of the water molecule atoms, we additionally need not evaluate the interactions between atoms of different water molecules. This leaves only the interactions between the atoms of the water molecules and the central ion. Hence, for a cluster with n water molecules, there will be 3n interactions to evaluate (3 for each water molecule). The expression for the total system’s electrostatic energy is obtained by summing eqn (5) over all atom pairs A and B,

$$E^{\mathrm{Coulomb}}_{\mathrm{Total}} = \sum_{A}\sum_{B}\sum_{\ell_A=0}^{\infty}\sum_{\ell_B=0}^{\infty}\sum_{m_A=-\ell_A}^{\ell_A}\sum_{m_B=-\ell_B}^{\ell_B} Q_{\ell_A m_A}\, T_{\ell_A m_A \ell_B m_B}\, Q_{\ell_B m_B} \qquad (14)$$

where A is always a central ion, and B runs over the remainder of the system atoms (the hydrogen and oxygen atoms of the n nearest neighbour water molecules of each system). Returning to the interaction of one atom pair A and B, the corresponding energy error $\Delta E_{AB}$ can be written as follows, using part of eqn (14) and (13),

$$\Delta E_{AB} = \sum_{\ell_A=0}^{\infty}\sum_{\ell_B=0}^{\infty}\sum_{m_A=-\ell_A}^{\ell_A}\sum_{m_B=-\ell_B}^{\ell_B} Q_{\ell_A m_A}\, T_{\ell_A m_A \ell_B m_B}\, Q_{\ell_B m_B} - \sum_{\ell_A=0}^{\infty}\sum_{\ell_B=0}^{\infty}\sum_{m_A=-\ell_A}^{\ell_A}\sum_{m_B=-\ell_B}^{\ell_B} \bar{Q}_{\ell_A m_A}\, T_{\ell_A m_A \ell_B m_B}\, \bar{Q}_{\ell_B m_B} \qquad (15)$$

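Via some elementary algebra this equation can be rewritten as a function of $\Delta Q_{\ell m}$ (defined in eqn (12)),

$$\Delta E_{AB} = \sum_{\ell_A=0}^{\infty}\sum_{\ell_B=0}^{\infty}\sum_{m_A=-\ell_A}^{\ell_A}\sum_{m_B=-\ell_B}^{\ell_B} \Delta Q_{\ell_A m_A}\, T_{\ell_A m_A \ell_B m_B}\, Q_{\ell_B m_B} \qquad (16)$$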

where the fact that $\Delta Q_B = 0$ has been invoked, which follows from the exact moments being used for all atoms B. Eqn (16) measures the collective performance of all kriging models for the sodium ion, one for each AMM, through an energy difference obtained from electrostatically probing this ion with surrounding water molecules represented by true (i.e. not predicted) moments. To assess the kriging models in these energy terms, the total interaction energy error for the ion···water clusters was evaluated for each individual cluster configuration of the external test set. This is done by using eqn (17),

$$\left|\Delta E_{\text{cluster configuration}}\right| = \left|\sum_{B} \Delta E_{AB}\right| \qquad (17)$$

Fig. 5 shows these errors in total interaction energy, in the form of so-called S-curves, one for each of the six types of Na+(H2O)n clusters (n = 1 to 6). Each point on a given S-curve corresponds to a particular configuration of a cluster of fixed n value. Note the logarithmic scale on the x-axis. These $|\Delta E_{\text{cluster configuration}}|$ energy errors are ordered by magnitude and build up an S-curve that then returns, on the y-axis, the percentage of test configurations with an energy error up to a certain value. For example, 50% of (previously unseen) Na+(H2O)6 (the largest and worst predicted system) test configurations have a total energy error of at most 1 kJ mol⁻¹ (purple S-curve). This S-curve ‘hits the ceiling of 100%’ at a value of 3.7 kJ mol⁻¹. The average energy error (again for n = 6) is 1.1 kJ mol⁻¹, which is not shown in Fig. 5. The maximum errors for the smaller clusters are 3.3, 3.0, 2.2 and 1.2 kJ mol⁻¹ for n = 5, 4, 3 and 2, respectively. This monotonically decreasing sequence of values expresses the expected increased accuracy of the kriging models for smaller clusters, which may be ascribed to lower variation in the AMM values and smaller feature spaces with smaller n (via eqn (11)).
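The chain from eqn (16) to the S-curves can be sketched numerically. Treating the moments of each atom as a flat vector and the interaction tensor as a matrix turns the multiple sum into a bilinear form. The random numbers below are placeholders standing in for real moments and tensor elements, and all function names are hypothetical.

```python
import numpy as np


def pair_energy(q_a, t_ab, q_b):
    """E_AB as the bilinear form sum over Q_A * T * Q_B (flat index per atom)."""
    return float(q_a @ t_ab @ q_b)


def pair_energy_error(q_a_true, q_a_pred, t_ab, q_b_true):
    """Eqn (16): energy error when only the moments of atom A are predicted."""
    return float((q_a_true - q_a_pred) @ t_ab @ q_b_true)


def cluster_error(pair_errors):
    """Eqn (17): absolute value of the summed pair errors of one configuration."""
    return abs(sum(pair_errors))


def s_curve(abs_errors):
    """Sorted errors and the percentage of configurations at or below each one."""
    x = np.sort(np.asarray(abs_errors, dtype=float))
    pct = 100.0 * np.arange(1, x.size + 1) / x.size
    return x, pct


# Placeholder data: 25 moments per atom, arbitrary interaction matrix
rng = np.random.default_rng(0)
q_a = rng.normal(size=25)
q_a_bar = q_a + 0.01 * rng.normal(size=25)   # "predicted" moments of the ion
q_b = rng.normal(size=25)                    # exact moments of a water atom
t_ab = rng.normal(size=(25, 25))

direct = pair_energy(q_a, t_ab, q_b) - pair_energy(q_a_bar, t_ab, q_b)
via_eqn16 = pair_energy_error(q_a, q_a_bar, t_ab, q_b)
```

Here `direct` (eqn (15) with exact moments on B) and `via_eqn16` agree to rounding error, which is the elementary algebra referred to in the text. Plotting the two arrays returned by `s_curve` with a logarithmic x-axis gives a curve of the kind shown in Fig. 5.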
This behaviour is also reflected in the energy errors associated with the 50th percentile, which are approximately 0.8, 0.7, 0.5 and 0.005 kJ mol⁻¹, for n = 5, 4, 3 and 2, respectively. As the cluster size decreases, the overall profiles of the S-curves translate in an orderly fashion to the left, without ever intersecting, articulating a systematic increase in accuracy. A final observation on the collective pattern of S-curves is that they bunch up towards the higher n values. Starting with excellent errors for the Na+(H2O) clusters, the errors for the Na+(H2O)2 clusters deteriorate dramatically but are still very good. Further error deterioration shows asymptotic behaviour with increasing n as the gap narrows between the S-curves moving to the right.

Fig. 5 Absolute errors in evaluation of the total electrostatic energy between the ion and all water atoms for the Na+(H2O)n systems in the external test set geometries, where n ranges from 1 to 6.

For comparison of these results, earlier work on clusters of rigid water molecules45 used the described method to model the intermolecular polarisation of the atoms of a water molecule as dependent on the position and orientation of its neighbouring water molecules. The energy results for clusters of 5 water molecules showed a maximum absolute energy error near 30 kJ mol⁻¹, compared to the value of 3.7 kJ mol⁻¹ found for the largest cluster studied here. One possible reason for the discrepancy is that in the older work, all AMMs of all water atoms were predicted in evaluating the cluster energy, thus a much greater number of predictions contributed to the total error. This suggests that for a model that treats both water and ions, the error may be larger than found here.

In order to understand better the origin of the total energy errors (and potentially improve the overall method) they can be broken down into components. Consideration of eqn (16) shows that the error in an interaction energy prediction is directly proportional to the AMMs of atom B and the error in prediction of the AMMs of atom A, as well as the interaction tensor T. The nature of the AMM prediction errors has been discussed above; for a given cluster size, n, they tend to increase with l and are similar for all m at a given rank l. Improving the results for prediction of AMMs will directly improve the energy error arising from their interactions, where the accuracy of the AMM predictions is dependent on the sampling and machine learning methods employed.
The relationship of $\Delta E_{AB}$ to the AMMs of atom B suggests that energies of interactions between ion A and atoms B with relatively large AMMs will be worse predicted (i.e. the energy error terms will be larger) than interactions between ion A and atoms B with relatively small AMMs. This also has an effect on the contribution of AMMs with different values of l, as the magnitude of an AMM tends to decrease with increasing rank. This means that the lower-rank AMMs should receive focus when attempting to improve the method. As a result of the dependence on AMM magnitude, the energy errors for interactions between the sodium and oxygen atoms should be larger than for interactions between the sodium and hydrogen atoms due to the greater magnitude of the charge of an oxygen atom. This can be investigated by computing the Na···O and Na···H interaction energy errors separately. Fig. 6 shows the resulting S-curves for the Na···O (Fig. 6a) and Na···H (Fig. 6b) interactions. The total absolute error for each test configuration was taken, confining the sum in eqn (17) to either oxygen atoms or hydrogen atoms only. This error was then divided by the number of atoms of the particular type that were present. For example, for three water molecules (n = 3), the S-curve for Na···H is divided by 6, while the S-curve for Na···O is divided by 3. Again, both plots show just as strongly as before (see Fig. 5) that the n = 1 case is trivial compared to the rest, and that the contributions per atom from a particular element do not get significantly worse when three or more water molecules are present. Comparing the plots in Fig. 6a with those in Fig. 6b shows that the Na···O interactions contribute errors of greater magnitude per atom, as expected.

Fig. 6 (a) Absolute errors in prediction of the Na+···O interatomic interaction energies in Na+(H2O)n (n = 1 to 6) clusters of the external test set. (b) Absolute errors in prediction of the Na+···H interatomic interaction energies in Na+(H2O)n (n = 1 to 6) clusters of the external test set.

The energy prediction error will be determined by the observed AMM size effect and the influence of the distance between the ion and its interaction partner. The direct proportionality of the energy error to the value of the interaction tensor (see eqn (16)) results in an inverse dependence of the energy error on the interatomic distance, r, as the denominator of T contains a power of r, or

$$T \propto \frac{1}{r^{\ell_A + \ell_B + 1}} \qquad (18)$$

Interactions that occur over a greater distance will therefore result in smaller magnitude energy prediction errors, as the value of T in eqn (16) becomes smaller with increasing distance. Fig. 7 shows the distribution of the Na···O and Na···H distances in the training and test sets for values of n from 1 to 6. For the Na···O interactions, increasing n results in the broadening of the distance distribution, accompanied by a movement of the peak to larger interaction distances. These larger distances result in smaller interaction tensor values and so increasing n will have the effect of reducing the average energy error for the whole system. Increasing n appears to have little effect on the Na···H distance distribution. This can help explain the ‘‘bunching effect’’ in Fig. 5 and 6. The microhydrated Na+ systems studied here are dominated by short-range interactions (<5 Å in all cases) and so we can expect results for systems dominated by longer-range interactions to give better energy prediction results than those found here.

Fig. 7 Distribution of Na···O (top) and Na···H (bottom) interatomic distances in Na+(H2O)n for n = (1, 2, ..., 6).

In general, the energy error for a particular electrostatic interaction A···B computed by the described method is determined by: (i) the magnitude of the error in the ion AMM prediction, which can be systematically improved by focusing on the machine learning and sampling aspects of the method, (ii) the size of AMMs involved in the atom–atom interactions, which cannot be improved and is dependent on the distribution of elements in the particular system being studied and (iii) the interaction distances, whose distribution is again dependent on the chemical system being modelled and thus cannot be engineered. For this reason, focus on improving the method must be placed on the machine learning and sampling elements, while understanding of the performance of the method for a given chemical system must take into account its atoms and spatial structure.
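The distance argument based on eqn (18) can be made concrete with a small sketch. It considers only the radial factor of the interaction tensor and ignores its dependence on the mutual orientation of the local frames, which is a simplifying assumption. The function name is hypothetical.

```python
def tensor_decay_ratio(r1, r2, l_a, l_b):
    """Ratio T(r2) / T(r1) implied by T being proportional to 1 / r^(l_a + l_b + 1)."""
    return (r1 / r2) ** (l_a + l_b + 1)


# Doubling the separation from 2 Å to 4 Å:
charge_charge = tensor_decay_ratio(2.0, 4.0, 0, 0)    # l_a = l_b = 0
dipole_charge = tensor_decay_ratio(2.0, 4.0, 1, 0)    # l_a = 1, l_b = 0
hexadec_pair = tensor_decay_ratio(2.0, 4.0, 4, 4)     # l_a = l_b = 4
```

Doubling the distance halves the charge–charge factor, quarters the dipole–charge factor, and reduces the hexadecapole–hexadecapole factor by 2⁹ = 512, so by eqn (16) a given error in a high-rank moment contributes far less to the energy error at larger separations.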

In future, improvements to the model could potentially be made by employment of different machine learning methods, in particular those that are able to accommodate increasingly large and more complex feature spaces, or by careful assessment of the genesis of the errors in AMM prediction found in this work. The initial sampling method can also be scrutinised in order to provide a physically more realistic set of training and test data for the kriging models, potentially by employing more realistic simulation methods as generators of cluster configurations. In order to treat dynamic chemical problems, it must be possible to carry out simulations with the method. For application in an MD simulation, a definition of the atomic electrostatic forces is required. This can be derived analytically for the current kriging method and will be described in a forthcoming publication. An application of the method to water molecules is necessary in order for the complete picture of the usefulness of the method in simulations to be obtained, as it would also be necessary to model the dependence of the AMMs of the water molecules on the positions and orientations of neighbouring water molecules and ions. This will require a definition of a local frame for water molecules, a procedure that is significantly complicated by the possibility that a single water molecule may be neighboured by any combination of water molecules and ions at any point in the simulation. Future work will (i) obtain kriging models for the oxygens and hydrogens, ideally with (ii) flexible water molecules, (iii) venture into transition metal cations, where no new conceptual problems should appear, and (iv) tackle anions, which have proven to be more challenging than cations.

4. Conclusions

A generic method for the evaluation of interatomic electrostatic interaction energies in solvated ion systems has been described. Polarisation and charge transfer have been accounted for explicitly, and in a streamlined manner, without reference to long-range perturbation theory. Atoms (and ions) are represented electrostatically by a set of compact multipole moments, which are defined via the partitioning afforded by Quantum Chemical Topology. The polarisation of atoms is modelled directly, without constructing a polarisability tensor. For this purpose we trained the machine learning method kriging to capture the dependence of the multipole moments of a given atom on the geometry of the surrounding molecules. Configurations of the ion···water complexes were sampled from a molecular dynamics simulation using a non-polarisable point charge potential. The coordinates for description of solvated ion systems have been described with the rigid body approximation, and an application to Na+···water clusters is presented. The effect of the size of the solvation shell on the final results is evaluated. The net charge of the ion has converged for a shell of six water molecules. The kriging models are able to predict the electrostatic interaction energy between the ion and all water atoms within 4 kJ mol⁻¹ for any of the external test set Na+(H2O)6 configurations. This accuracy depends on an intricate combination of the magnitude of the true atomic multipole moments of an ion, the interaction tensor that relates the energy to the interatomic separation and mutual orientation of the local frames, and finally the accuracy of the predictions of the ion’s atomic multipole moments. The maximum prediction errors systematically improve for smaller clusters, being less than 0.2 kJ mol⁻¹ for Na+···H2O. In applications, only the accuracy of the kriging models is under user control and thus improvement of the multipole moment predictors is the sole way to guarantee improvement in the interatomic interaction energy predictions. The accuracy of the energy predictions is evaluated in terms of average and maximum errors and is found to be most encouraging for the current proof-of-concept study.

Acknowledgements

The authors are grateful for the financial support received from EPSRC for two studentships (MJLM and CMH), and BBSRC and UMIP for a postdoctoral position (GIH). Gratitude is due to several group members who worked on previous applications of the described method to solvated ion systems. Part of the computational element of this research was achieved using the High Throughput Computing (CONDOR) facility of the Faculty of Engineering and Physical Sciences, the University of Manchester.

References 1 P. Jungwirth and D. J. Tobias, Chem. Rev., 2005, 106, 1259. 2 T. W. Allen, O. S. Andersen and B. Roux, Biophys. Chem., 2006, 124, 251. 3 I. Benjamin, Chem. Rev., 1996, 96, 1449. 4 B. Roux, S. Berneche, B. Egwolf, B. Lev, S. Y. Noskov, C. N. Rowley and H. B. Yu, J. Gen. Physiol., 2011, 137, 415. 5 B. J. Finlayson-Pitts, Chem. Rev., 2003, 103, 4801. 6 F. Hofmeister, Naunyn-Schmiedeberg’s Arch. Pharmacol., 1888, 24, 247. 7 A. Heydweiller, Ann. Phys., 1910, 338, 145. 8 L. X. Dang and T. M. Chang, J. Phys. Chem. B, 2002, 106, 235. 9 M. Mucha, T. Frigato, L. M. Levering, H. C. Allen, D. J. Tobias, L. X. Dang and P. Jungwirth, J. Phys. Chem. B, 2005, 109, 7617. 10 P. Auffinger, T. E. Cheatham and A. C. Vaiana, J. Chem. Theory Comput., 2007, 3, 1851. 11 J. L. Finney, Philos. Trans. R. Soc. London, Ser. B, 2004, 359, 1145. 12 F. Franks, Water: a matrix of life, Roy. Soc. of Chem., Cambridge, Great Britain, 2000. 13 E. Pluharova, L. Vrbka and P. Jungwirth, J. Phys. Chem. C, 2010, 114, 7831. 14 B. Guillot, J. Mol. Liq., 2002, 101, 219. 15 P. K. Yuet and D. Blankschtein, J. Phys. Chem. B, 2010, 114, 13786. 16 P. L. A. Popelier, M. Devereux and M. Rafat, Acta Crystallogr., Sect. A: Found. Crystallogr., 2004, 60, 427. 17 A. J. Stone, Science, 2008, 321, 787.

Phys. Chem. Chem. Phys., 2013, 15, 18249--18261

18259

View Article Online

Published on 05 September 2013. Downloaded by The University of Manchester Library on 02/10/2013 08:15:41.

Paper

PCCP

