Building wave functions for large molecules from their fragments

PHYSICAL REVIEW A, VOLUME 64, 042501

R. Santamaria,* J. A. Mondragón-Sánchez, and M. A. Cunningham
Instituto de Fisica, UNAM, Apartado Postal 20-364, Distrito Federal 01000, Mexico
(Received 12 January 2001; published 10 September 2001)

The central premise of density functional methods is that functionals of the electron density describe the properties of many-electron systems, including quantum effects such as electron exchange and correlation. In practice, such methods have demonstrated accuracies comparable to those of more computationally intensive quantum methods; however, they are limited by their demand for computational power. In this report we describe an alternative approach to constructing the initial electron density that serves as the starting point for density functional methods. In our method, the total density is constructed piecewise from fragments. Assuming that fragment densities can be computed efficiently, this leads to significant improvements in the overall calculation effort. We report results of calculations in several model systems.

DOI: 10.1103/PhysRevA.64.042501

PACS number(s): 31.15.Ew, 31.15.Ar

I. INTRODUCTION

Quantum molecular simulations of chemical systems can provide detailed information that is often inaccessible to direct experimental measurement. However, there are a number of practical considerations that limit the applicability of numerical simulations to large systems [1]. In particular, the solution of complex equations and optimization of large numbers of variables scale poorly. In this regard, new implementations of quantum mechanics have made important advances. The Kohn-Sham (KS) version of density functional theory (DFT) [2] has increased the size of systems to a few tens of heavy atoms, larger than could previously be studied with traditional ab initio approaches while still maintaining chemical accuracy [3]. The key aspects introduced by KS-DFT refer to the use of a single determinant wave function and the representation of exchange and correlation energies as functional forms of the electron density [4]. In spite of the increase in the number of atoms, the size of systems that can be treated remains modest. Certainly, dealing with protein molecules or accounting for the immediate molecular environment lies outside the range of current computational capabilities. We are thus motivated to explore alternative approaches that might lead to a significant expansion in the size of molecular systems.

The mixing of quantum methods with model potentials (including force fields) that mimic quantum forces with simple expressions has permitted the study of large systems while taking into account the environment [5]. Nevertheless, other important problems have simultaneously arisen. For instance, model potentials are unable to estimate important quantum effects like the exchange and correlation among electrons. Also, it is not clear how to deal with overlap atoms that define the interface between the quantum and classical regions [6].
*Corresponding author. Electronic address: [email protected]
1050-2947/2001/64(4)/042501(8)/$20.00

Although the simulation of environmental forces with model potentials is an option, care should be exercised because deviations introduced by model interactions can overshadow the accuracy achieved in the quantum region. Shifts in free energy of a few kilocalories per mole result in orders-of-magnitude changes in reaction rates. Hence, theoretical methods must be accurate. Other quantum methods have avoided the use of lower levels of theory and have instead focused their efforts on appropriate partitions of the energy that may lead to additive interactions [7]. The total energy is then assumed to be a summation of additive energies with corrections coming from nonadditive terms. The results from these methodologies are marginally accepted and much work is still to be performed. Among other important techniques to speed up computations is the effective core potential approach [8], in which only the outermost (valence) electrons participate in bond formation, while the innermost (core) electrons of the atom are implicitly accounted for by introducing effective potentials. Also, the need for a fast and reliable evaluation of interaction integrals has led to multipolar expansions [9], linear scaling algorithms [10], and the introduction of auxiliary basis sets [11], in addition to the common orbital basis sets, to efficiently reduce computational times. The use of cutoffs for large-distance interactions and, whenever possible, the use of molecular symmetry are also techniques to reduce the number of integral evaluations [12]. Many of the above schemes favor the use of processors in parallel, and in combination with fast supercomputers have made possible the study of relatively big systems. In spite of such sophisticated techniques, the application of quantum theories to large systems continues to be a significant challenge, due to the great number of variables that have to be optimized. Therefore, better and more flexible approaches are required to more realistically simulate the processes that occur at the quantum level. It is our objective in this work to design an approach that allows us to extend the present Kohn-Sham DFT to large systems.
The approach should be kept within the ab initio framework so that DFT packages can implement it with minor changes to their codes. For this, our experience suggests that any attempt to apply quantum mechanics to large compounds should essentially start from smaller fragments [13], following nature's course in building large molecules from smaller ones.

We have organized this paper as follows. In Sec. II we briefly discuss the basic entities (molecular orbital coefficients) that determine the wave function and which we use later in Sec. III to propose models for the electron density and kinetic energy. In Sec. IV we build the matrix of molecular coefficients for the big molecule, based on the previous electron density and kinetic energy models. Section V is devoted to testing the fragmentation method on some DNA-related organic compounds and the C60 carbon cluster. Finally, Sec. VI discusses the implications and gives the conclusions of our fragmentation method.

©2001 The American Physical Society
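As a rough numerical illustration of the introduction's remark that free-energy shifts of a few kilocalories per mole change reaction rates by orders of magnitude, the sketch below evaluates the exponential factor exp(ΔG/RT). The exponential (transition-state-theory style) dependence and the temperature of 298.15 K are assumptions made here for illustration; they are not stated in the paper, and the function name is hypothetical.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol K)


def rate_ratio(delta_g_kcal, temperature=298.15):
    """Factor by which a rate constant changes when the activation free
    energy shifts by delta_g_kcal (assumed exponential dependence)."""
    return math.exp(delta_g_kcal / (R_KCAL * temperature))


if __name__ == "__main__":
    for shift in (1.0, 2.0, 3.0, 5.0):
        factor = rate_ratio(shift)
        print(f"{shift:.1f} kcal/mol -> rate changes by a factor of about {factor:.0f}")
```

Under these assumptions an error of 1 kcal/mol already changes a rate several-fold, and a few kcal/mol changes it by two or more orders of magnitude, which is why the comparison in Sec. V is made at the level of 1 to 2 kcal/mol.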

II. THE WAVE FUNCTION

We start by dividing a large molecule into $M$ fragments, say $F_1, F_2, \ldots, F_M$, for which we can perform ordinary self-consistent field (SCF) DFT quantum calculations. Each fragment is assumed to have a wave function $\Psi$ represented by a single determinant that satisfies the antisymmetry principle and makes electrons indistinguishable from each other. The determinant can be written in a compact Dirac ket form as

$$\Psi(\vec{x}_1,\vec{x}_2,\ldots,\vec{x}_n) = |\phi_1(\vec{x}_1)\,\phi_2(\vec{x}_2)\cdots\phi_n(\vec{x}_n)\rangle, \tag{1}$$

where only diagonal terms are shown. The elements of the determinant are molecular orbitals $\phi$ constructed from linear combinations of atomic orbitals (shells) $\chi$ [14],

$$\phi_k(\vec{x}) = \sum_{i=1}^{N}\sum_{j=1}^{N_i} a^{k}_{ij}\,\chi_{ij}(\vec{x}); \tag{2}$$

summations are taken over the total number $N$ of atoms in the fragment and the total number $N_i$ of shells that belong to the $i$th atom. Atomic shells are further expanded in terms of Gaussians to facilitate the solution of the integro-differential equations that appear in SCF quantum methods:

$$\chi(\vec{x}) = \sum_{k=1}^{L}\beta_k\, g_k(\alpha_k,\vec{x}-\vec{R}). \tag{3}$$

For simplicity, we work with minimal Gaussian sets ($L=3$), recognized in the literature as basis sets of the STO-3G type, which reproduce the basic features of atomic shells and are centered on the nuclei. Because all parameters ($\alpha$ exponents and $\beta$ coefficients) involved in the STO-3G Gaussians are fixed, the shape of the wave function is completely defined by the molecular orbital coefficients $a^{k}_{ij}$. This is the reason that efforts are concentrated on finding the best $a^{k}_{ij}$ coefficients, usually by a variational process that minimizes the DFT energy. Given the important role that molecular orbital coefficients play, they are conveniently accommodated in matrices with a simple form. For example, in the case of formaldehyde (CH2O) with 16 electrons located in eight doubly occupied molecular orbitals, we have the matrix shown in Table I. Even though formaldehyde is a small molecule, the number of variables to optimize is 96, equal to the number of atomic shells times the number of occupied orbitals. In the case of fullerene (C60) the number of variables becomes 54 000, although molecular symmetry can be used to reduce this number.

TABLE I. The molecular orbital coefficients, designated as $a^{k}_{ij}$ and illustrated for formaldehyde (CH2O), are the basic entities that determine the Kohn-Sham wave function once primitive Gaussians have been assigned to the atomic orbitals.

Matrix of molecular orbital coefficients

Atom     Atom   Atomic    Molecular orbitals
number   type   orbital   phi_1     ...   phi_k     ...   phi_8
1        C      1s        a^1_11    ...   a^k_11    ...   a^8_11
                2s        a^1_12    ...   a^k_12    ...   a^8_12
                2px       .         ...   .         ...   .
                2py       .         ...   .         ...   .
                2pz       a^1_15    ...   .         ...   .
2        O      1s        a^1_21    ...   a^k_21    ...   a^8_21
                2s        a^1_22    ...   a^k_22    ...   a^8_22
                2px       a^1_23    ...   .         ...   .
                2py       .         ...   .         ...   .
                2pz       .         ...   .         ...   .
3        H      1s        a^1_31    ...   a^k_31    ...   a^8_31
4        H      1s        a^1_41    ...   a^k_41    ...   a^8_41

In the general case of biomolecules, which commonly show no symmetry at all, the number of degrees of freedom prohibits any quantum analysis. Thus, if we want to study big molecules at the quantum level, a natural and convenient way to do it is by fragmenting the big molecule into small components for which it is possible to perform standard KS-DFT computations. With this procedure we do not try to reduce the large number of molecular orbital coefficients that are inherent in quantum calculations, but rather divide the configuration space in such a way that we can optimize small subsets of molecular coefficients. The first complication faced in this enterprise is how to link the molecular coefficients of fragments to form the matrix of molecular coefficients of the big molecule. Unfortunately, we do not know of any relation (other than the computationally expensive SCF procedure) that directly involves the molecular orbital coefficients and which could simplify the mapping process. Hence, we have recourse to the electron density as an indirect way to combine the molecular coefficients of fragments that will determine the wave function of the large molecule. Working with molecular coefficients does not alter the single-determinant representation of the wave function, assumed to be one of the key aspects of Kohn-Sham theory.
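The variable counts quoted in this section (96 for formaldehyde and 54 000 for C60) follow from multiplying the number of atomic shells by the number of doubly occupied orbitals. The short sketch below reproduces that arithmetic for a minimal (STO-3G-like) basis, assuming five shells (1s, 2s, 2px, 2py, 2pz) per C or O atom and one shell per H atom, as laid out in Table I. The function name is hypothetical.

```python
def n_coefficients(shells_per_atom, n_electrons):
    """Number of molecular orbital coefficients a^k_ij to optimize:
    (number of atomic shells) x (number of doubly occupied orbitals)."""
    n_shells = sum(shells_per_atom)
    n_occupied = n_electrons // 2  # closed-shell assumption
    return n_shells * n_occupied


# Formaldehyde, CH2O: C and O carry five shells each, each H carries one.
formaldehyde_shells = [5, 5, 1, 1]
print(n_coefficients(formaldehyde_shells, n_electrons=16))

# Fullerene, C60: sixty carbon atoms with five shells each, 360 electrons.
c60_shells = [5] * 60
print(n_coefficients(c60_shells, n_electrons=360))
```

The quadratic growth is the point: doubling the number of atoms roughly quadruples the number of coefficients, whereas optimizing fragments separately keeps each subset small.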

III. THE ELECTRON DENSITY

The central premise in DFT is that the system Hamiltonian can be expressed in terms of functionals of the electron density $\rho$. The electron density depends on the molecular coefficients through the $n$ lowest occupied molecular orbitals:

$$\rho(\vec{r}) = \sum_{k=1}^{n} |\phi_k(\vec{r})|^2. \tag{4}$$
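To make the chain from Eqs. (2) and (3) to Eq. (4) concrete, the following sketch evaluates a contracted STO-3G shell, builds one molecular orbital from two such shells, and evaluates the resulting density at a point. The example system (H2 at a bond length of 1.4 bohr with a symmetric bonding orbital) and the factor of 2 for double occupancy are assumptions chosen for illustration; the exponents and contraction coefficients are the standard STO-3G values for the hydrogen 1s shell. All function names are hypothetical.

```python
import math

# Standard STO-3G hydrogen 1s parameters: exponents alpha_k and coefficients beta_k.
ALPHA = (3.42525091, 0.62391373, 0.16885540)
BETA = (0.15432897, 0.53532814, 0.44463454)


def distance_squared(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))


def primitive(alpha, r2):
    """Normalized s-type Gaussian evaluated at squared distance r2."""
    return (2.0 * alpha / math.pi) ** 0.75 * math.exp(-alpha * r2)


def chi(point, center):
    """Contracted shell of Eq. (3): sum_k beta_k g_k(alpha_k, x - R)."""
    r2 = distance_squared(point, center)
    return sum(b * primitive(a, r2) for a, b in zip(ALPHA, BETA))


def shell_overlap(center_a, center_b):
    """Analytic overlap integral between two contracted s shells."""
    r2 = distance_squared(center_a, center_b)
    total = 0.0
    for a1, b1 in zip(ALPHA, BETA):
        for a2, b2 in zip(ALPHA, BETA):
            prefactor = (2.0 * math.sqrt(a1 * a2) / (a1 + a2)) ** 1.5
            total += b1 * b2 * prefactor * math.exp(-a1 * a2 / (a1 + a2) * r2)
    return total


def density(point, centers, orbitals, occupation=2):
    """Eq. (4) for a list of occupied orbitals. Each orbital is a list of
    coefficients, one per shell, as in Eq. (2). The occupation factor of 2
    (closed shell) is an assumption of this sketch."""
    rho = 0.0
    for coefficients in orbitals:
        phi = sum(a * chi(point, c) for a, c in zip(coefficients, centers))
        rho += occupation * phi * phi
    return rho


centers = [(0.0, 0.0, 0.0), (0.0, 0.0, 1.4)]  # H2, coordinates in bohr
s_ab = shell_overlap(centers[0], centers[1])
a_bond = 1.0 / math.sqrt(2.0 * (1.0 + s_ab))  # fixed by normalization of the orbital
print("overlap:", s_ab)
print("density at bond midpoint:", density((0.0, 0.0, 0.7), centers, [[a_bond, a_bond]]))
```

In a real calculation the coefficients come from the SCF procedure; here symmetry alone fixes them, which keeps the example free of any eigenvalue solver.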


If it is possible to propose an adequate electron density from a combination of the molecular coefficients of fragments, we should in turn expect adequate energy contributions. In other words, we construct an electron density that has virtually the same effect on the energy as the electron density obtained by means of traditional KS-DFT calculations. As remarked above, construction of the density via a single large SCF calculation is not a numerically efficient strategy and, consequently, we look for the best mapping

$$(\rho_1,\rho_2,\ldots,\rho_M) \rightarrow \rho^{\mathrm{virtual}} \approx \rho^{\mathrm{DFT}}, \tag{5}$$

where $\rho_i$ is the electron density of the $i$th fragment. Because $\rho$ is an extensive quantity, all fragment densities should contribute at a given point $\vec{r}_q$, so we propose the following density model:

$$\rho^{\mathrm{virtual}}(\vec{r}_q) = \rho_1(\vec{r}_q) + \rho_2(\vec{r}_q) + \cdots + \rho_M(\vec{r}_q). \tag{6}$$

The validity of this approximation depends on the connectivity among fragments. If fragments are completely unaffected by the presence of other fragments, $\rho^{\mathrm{virtual}}$ is not expected to properly reproduce $\rho^{\mathrm{DFT}}$; otherwise $\rho^{\mathrm{virtual}}$ should be a good approximation. In order to communicate between neighboring fragments, we suppose for simplicity that the fragment influence extends only up to the first atomic neighbors in the contiguous fragments (although second, third, and further neighbors can in principle be considered). Our goal is that fragments are affected by the presence of their partners solely through the use of atomic neighbors. For example, if we break the C-C bond of ethane, one of the simplest hydrocarbons with structure CH3-CH3, the fragments actually calculated could be CH3-C' and C'-CH3, where the neighbor C' atom is of precisely the same type as the one that was removed. In this form, it is possible to reproduce the polarization of the electronic charge of the fragment under consideration due to the immediate molecular environment.

In spite of this simplification two main problems now arise: (a) some fragments may not permit reaching convergence in the molecular orbitals necessary to obtain the KS-DFT wave function, and (b) the introduction of atomic neighbors involves overlap atoms among fragments, which implies atomic double counting. The first problem can be solved if we consider that there are many ways to perform the partition of a big molecule or, alternatively, if the fragmentation is performed with suitable care. We try to avoid, for instance, partitioning conjugated rings, severing strong bonds, or breaking a great number of bonds. It is possible to saturate broken bonds with special groups such as H, NH2, CH3, etc., to obtain molecules that are known to be stable in nature and for which convergence can be achieved. In the example of ethane given above, bond saturation consists in adding a hydrogen to the C' carbon neighbor to get a more stable molecule.

The second problem is more difficult and deserves special attention. The inclusion of atomic neighbors to simulate the immediate surroundings leads to an overestimation of the electron density, especially in the region where fragments join. In order to avoid an overcharged region, for simplicity we scale the molecular orbital coefficients of those atoms that overlap. In particular, we scale all coefficients of valence shells, assumed to take part in bonds, while we set to zero the coefficients of first-neighbor core shells. Following the example of ethane, charge overestimation is avoided by scaling the valence shells (P states) of the C and C' atoms and setting to zero the core shells (S states) of the neighbor C' atom. The scaling process is restricted to reproducing the local density distribution around overlapping atoms and/or preserving the number of electrons of the system. It is recommended to work out the proper density distribution at those points that are used as a grid for numerically integrating energy terms like exchange and correlation [15]. Briefly, the scaling process is done in an automatic and straightforward manner, essentially consumes no computation time, and solves the problem of overestimating the electron density.

Using the electron density model in conjunction with the scaling proposal permits the calculation of most energetic contributions to the total energy. However, the Kohn-Sham kinetic component is still an expression of the molecular orbitals, so we additionally require a mechanism to compute the kinetic energy. In the KS scheme the kinetic energy is

$$T = \sum_{i=1}^{n} \langle \phi_i | -\tfrac{1}{2}\nabla^2 | \phi_i \rangle. \tag{7}$$
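Before the kinetic energy is partitioned, the density model of Eq. (6) and the charge-preserving scaling of overlap atoms can be illustrated with a deliberately simple one-dimensional toy. Everything in it is an assumption made for illustration: atoms are represented by normalized 1D Gaussians carrying a population, a four-site chain is cut in the middle, each fragment includes its first neighbor from the other side, and a single scale factor is applied to the populations of all overlap sites. The paper scales valence-shell orbital coefficients and zeroes neighbor core shells; this sketch only shows why some scaling is needed and how a charge constraint fixes the factor.

```python
import math


def gaussian_1d(x, center, alpha):
    """Normalized 1D Gaussian (integrates to one over the real line)."""
    return math.sqrt(alpha / math.pi) * math.exp(-alpha * (x - center) ** 2)


def fragment_density(x, sites, lam=1.0):
    """Density of one fragment. Each site is (center, alpha, population,
    is_overlap); populations on overlap sites are multiplied by lam."""
    return sum((lam if overlap else 1.0) * pop * gaussian_1d(x, c, a)
               for c, a, pop, overlap in sites)


def virtual_density(x, fragments, lam=1.0):
    """Eq. (6): pointwise sum of the fragment densities."""
    return sum(fragment_density(x, sites, lam) for sites in fragments)


def charge_preserving_scale(fragments, n_electrons):
    """Single factor that makes the summed density integrate to n_electrons."""
    fixed = sum(pop for sites in fragments for _, _, pop, ov in sites if not ov)
    shared = sum(pop for sites in fragments for _, _, pop, ov in sites if ov)
    return (n_electrons - fixed) / shared


def integrate(f, lo=-15.0, hi=20.0, n=7000):
    """Composite trapezoid rule."""
    h = (hi - lo) / n
    return h * (0.5 * f(lo) + sum(f(lo + i * h) for i in range(1, n)) + 0.5 * f(hi))


# Hypothetical chain X1-X2-X3-X4 cut between X2 and X3, one electron per site.
frag_1 = [(0.0, 1.0, 1.0, False), (1.5, 1.0, 1.0, True), (3.0, 1.0, 1.0, True)]  # X1, X2, neighbor X3'
frag_2 = [(1.5, 1.0, 1.0, True), (3.0, 1.0, 1.0, True), (4.5, 1.0, 1.0, False)]  # neighbor X2', X3, X4
fragments = [frag_1, frag_2]

lam = charge_preserving_scale(fragments, n_electrons=4.0)
unscaled_charge = integrate(lambda x: virtual_density(x, fragments))
scaled_charge = integrate(lambda x: virtual_density(x, fragments, lam))
print(lam, unscaled_charge, scaled_charge)
```

Without scaling the summed density carries the electrons of the overlap atoms twice; with the factor fixed by the electron count the total charge of the full system is recovered.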

If it is assumed that the big molecule is partitioned into $M$ fragments, we can in principle group together the molecular orbitals that contribute most to the kinetic energy of each fragment by, for example, selecting the orbitals centered at precisely those nuclei that take part in the fragment, plus partial contributions of molecular orbitals located on neighboring nuclei:

$$T = \sum_{i=1}^{n_1} \langle \phi_i | -\tfrac{1}{2}\nabla^2 | \phi_i \rangle + \sum_{i=1}^{n_2} \langle \phi_i | -\tfrac{1}{2}\nabla^2 | \phi_i \rangle + \cdots + \sum_{i=1}^{n_M} \langle \phi_i | -\tfrac{1}{2}\nabla^2 | \phi_i \rangle, \tag{8}$$

$$n = n_1 + n_2 + \cdots + n_M. \tag{9}$$

Here $n_i$ is the number of orbitals considered in the $i$th fragment. With no loss of accuracy the kinetic energy, which involves all occupied molecular orbitals, can be written as

$$T = T_1 + T_2 + \cdots + T_M. \tag{10}$$
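The regrouping in Eqs. (8) to (10) is exact because it only reorders the terms of Eq. (7). A minimal sketch of that bookkeeping is given below. As a stand-in for real molecular orbitals it uses single normalized s-type Gaussians, for which the kinetic energy expectation value is 3α/2 in atomic units; the list of orbitals and their fragment labels are hypothetical.

```python
import math


def kinetic_s_gaussian(alpha):
    """<g| -1/2 nabla^2 |g> for a normalized s Gaussian exp(-alpha r^2), in a.u."""
    return 1.5 * alpha


# Hypothetical occupied orbitals: (fragment index, Gaussian exponent).
orbitals = [(0, 0.8), (0, 1.2), (1, 0.5), (1, 2.0), (2, 1.0)]

# Eq. (7): sum over all occupied orbitals.
t_total = sum(kinetic_s_gaussian(alpha) for _, alpha in orbitals)

# Eq. (8): the same terms grouped by fragment, giving T_1, T_2, ..., T_M.
t_fragments = {}
for fragment, alpha in orbitals:
    t_fragments[fragment] = t_fragments.get(fragment, 0.0) + kinetic_s_gaussian(alpha)

# Eq. (9): the orbital counts n_i add up to n.
n_per_fragment = {}
for fragment, _ in orbitals:
    n_per_fragment[fragment] = n_per_fragment.get(fragment, 0) + 1

print(t_total, t_fragments, n_per_fragment)
```

The approximation enters only afterwards, when each fragment term is computed from fragment orbitals rather than from the orbitals of the whole molecule.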

Our approximation for $T$ consists in separately computing the kinetic energy of each fragment, where atomic neighbors are considered as before to communicate fragments and with the same scaling method on the shells to preserve the total charge. In this form, we can suppose a similar mapping as for the electron density (see footnote 1):

$$T^{\mathrm{virtual}} = T_1 + T_2 + \cdots + T_M \approx T^{\mathrm{DFT}}. \tag{11}$$

We note that it is also possible to scale the molecular coefficients of overlapping atoms to reproduce local kinetic energies accurately instead of, or in addition to, local electron densities. The above expression, where all fragments are supposed to be referred to the same coordinate system, is a model for the kinetic energy of the big molecule in terms of the molecular orbitals of fragments, some of which were modified according to the scaling process already discussed. From now on, we refer to the fragmentation method as the virtual electron density approximation (VEDA).

IV. THE MATRIX OF MOLECULAR COEFFICIENTS

The models we have presented for the electron density and kinetic energy have identical forms to their respective KS expressions, with the exception that they are generated from different orbitals (those of fragments):

$$\rho^{\mathrm{virtual}}(\vec{r}) = \sum_{k=1}^{n_+} |\phi_k(\vec{r})|^2, \tag{12}$$

$$T^{\mathrm{virtual}} = \sum_{i=1}^{n_+} \langle \phi_i | -\tfrac{1}{2}\nabla^2 | \phi_i \rangle. \tag{13}$$

Here $n_+$ is the sum of all occupied orbitals of fragments. In principle, we are in a position to compute all energy contributions of the big molecule from the fragments. Nonetheless, from a practical point of view, the problem is not finished yet because we are required to build the matrix of molecular coefficients of the big molecule from the smaller matrices of fragments. The building of a big matrix of molecular coefficients facilitates implementation of the method in most DFT software codes, with a minimum number of changes, making it possible to exploit all the experience gathered on the algorithms that constitute the present ab initio machinery. We build the big molecular coefficient matrix by putting fragment matrices side by side and keeping the order of atoms inside the big matrix as in an ordinary DFT matrix (refer to Tables I and II as examples). Because many atoms are not in every fragment, their corresponding boxes in fragment matrices that lack such atoms are filled with zeros. Also, because the matrix of the big molecule must be square, ghost atoms and even additional molecular orbitals can be included, making their molecular coefficients zero. Their presence does not affect computations but is intended to yield a square matrix.

On the other hand, the introduction of overlap atoms to communicate between fragments increases the dimensions of the big matrix in comparison with an ordinary DFT matrix. Nevertheless, if we consider that the total number of electrons determines the total number of occupied molecular orbitals, we can force a DFT code to read the big matrix by simply declaring a fictitious number of additional electrons (a negative total charge). The big matrix produces virtually the same effects as the matrix of an ordinary DFT calculation, with the difference that it was not necessary to perform an SCF procedure to find the molecular orbitals of the big molecule. In other words, we have built a wave function for the big molecule; however, it was not obtained from an SCF procedure, but from collecting wave function elements from fragments. Once the matrix for the big molecule is constructed, the evaluation of the different energy terms can be done as usual since we have dealt with molecular orbital coefficients, which are considered the basic entities in a Gaussian approach. It is feasible to apply all those techniques (like linear scaling methods, multipolar expansions, and others) that effectively and rapidly compute integrals. In the computation of interaction integrals all electrons are involved, even those far apart and which belong to different fragments.

1 An alternative expression for the kinetic energy is given in [16]: $T = \int t(\vec{r})\,dr^{3}$, where $t(\vec{r}) = \sum_{i=1}^{n} |\tau_i(\vec{r})|^2$. This last identity is similar to $\rho(\vec{r}) = \sum_{i=1}^{n} |\phi_i(\vec{r})|^2$, with the difference that $\tau_i = \nabla\phi_i$. Therefore, we can assume that $t^{\mathrm{virtual}}(\vec{r}) = t_1(\vec{r}) + t_2(\vec{r}) + \cdots + t_M(\vec{r})$ in the same way as was done for $\rho$ in Eq. (6).

TABLE II. Matrix of molecular coefficients of a large molecule built from the matrices of fragments. The X, Y, and Z matrices, obtained after DFT-KS calculations on fragments 6, 7, and 8, respectively, are arranged in contiguous form. Some of their rows have been translated up or down to keep the required order of atomic shells imposed by the matrix of the large molecule. The shells of atoms that overlap fragments are scaled by λ_i to reproduce the correct local density distribution and maintain the total amount of charge. Additional rows (those with zeros) have been inserted for atoms of the big molecule that do not appear in a fragment.

Matrix of molecular orbital coefficients
(rows: atomic shells; columns: all occupied molecular orbitals)

        phi_14...phi_18   phi_19...phi_24   phi_25...phi_31
        (Fragment 6)      (Fragment 7)      (Fragment 8)
...     0000              0000              ZZZZ              ...
...     0000              0000              ZZZZ              ...
...     XXXX              0000              0000              ...
...     XXXX              0000              0000              ...
...     λ1 λ1 λ1          λ2 λ2 λ2          0000              ...
...     0000              YYYY              0000              ...
...     0000              YYYY              0000              ...
...     XXXX              0000              0000              ...
...     0000              0000              ZZZZ              ...
...     0000              YYYY              0000              ...
...     0000              YYYY              0000              ...
...     0000              YYYY              ZZZZ              ...
...     λ3 λ3 λ3          λ4 λ4 λ4          0000              ...
...     XXXX              0000              0000              ...
...     λ5 λ5 λ5          0000              λ6 λ6 λ6          ...
...     0000              0000              ZZZZ              ...
...     0000              0000              ZZZZ              ...
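The assembly described in this section and laid out schematically in Table II can be sketched in a few lines. The data layout below (dictionaries keyed by shell label), the function names, and the numerical values are all hypothetical and chosen only to show the three operations named in the text: placing fragment blocks side by side in the shell order of the big molecule, filling rows of absent atoms with zeros, and multiplying the rows of overlap shells by a scale factor. Padding with zero columns or rows stands in for the ghost atoms and additional orbitals used to make the matrix square.

```python
def assemble_big_matrix(shell_order, fragments):
    """Return one row per shell of the big molecule; the columns are the
    occupied orbitals of all fragments placed side by side.

    Each fragment is a dict with
      'coeffs': {shell label: [coefficient for each fragment orbital]}
      'scale':  {shell label: factor}  (optional; overlap shells only)
    """
    rows = [[] for _ in shell_order]
    for fragment in fragments:
        coeffs = fragment["coeffs"]
        scale = fragment.get("scale", {})
        n_orbitals = len(next(iter(coeffs.values())))
        for row, shell in zip(rows, shell_order):
            block = coeffs.get(shell, [0.0] * n_orbitals)  # absent atom: zeros
            factor = scale.get(shell, 1.0)                 # overlap shell: scaled
            row.extend(factor * c for c in block)
    return rows


def pad_to_square(matrix):
    """Append zero columns and/or zero rows until the matrix is square."""
    n_rows, n_cols = len(matrix), len(matrix[0])
    size = max(n_rows, n_cols)
    padded = [row + [0.0] * (size - n_cols) for row in matrix]
    padded += [[0.0] * size for _ in range(size - n_rows)]
    return padded


# Hypothetical four-shell molecule split into two fragments that share shell C2.
shell_order = ["C1:2s", "C2:2s", "C3:2s", "C4:2s"]
fragment_x = {"coeffs": {"C1:2s": [0.6, -0.2], "C2:2s": [0.4, 0.7]},
              "scale": {"C2:2s": 0.5}}
fragment_y = {"coeffs": {"C2:2s": [0.3], "C3:2s": [0.9], "C4:2s": [0.1]},
              "scale": {"C2:2s": 0.5}}

big = assemble_big_matrix(shell_order, [fragment_x, fragment_y])
square = pad_to_square(big)
for row in square:
    print(row)
```

A production implementation would also have to respect the orbital ordering and file format expected by the DFT package that reads the matrix, which this sketch does not attempt.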




FIG. 1. The nucleic acids (adenine, thymine, guanine, and cytosine) together with the type of fragmentation used to illustrate the VEDA approach. For the small pyrimidine molecules (thymine and cytosine) the fragmentation was done while trying to maintain the identities of their CH3 and NH2 chemical groups.

V. MOLECULAR CALCULATIONS

We apply the VEDA to some organic compounds of interest in molecular genetics, such as the nucleic acids adenine (A), thymine (T), guanine (G), cytosine (C), and hypoxanthine (H) (structures taken from Ref. [17]). We also test the VEDA for the C60 carbon cluster. The molecules and fragmentation types are sketched in Figs. 1 and 2. For instance, in the case of A the cuts were made across C-N bonds to separate the five-membered ring from the six-membered ring. As for the other molecules, only two scale factors were introduced in this case (as indicated in Fig. 3) to get the total charge of each fragment. Undoubtedly, details proliferate about different scaling possibilities; nonetheless, the one presented here is one of the simplest forms to scale the molecular orbitals with a minimum number of factors.

FIG. 2. The fullerene molecule (C60), together with the type of fragmentation that was used to illustrate the VEDA approach. Fullerene was bisected and each fragment contained 30 atoms. However, the VEDA demands the use of atomic neighbors to polarize those atoms that build the fragment under consideration. This is the reason that each fragment considerably increases its size (sketch at right). Although the VEDA gives good results for such a molecule, fragmentations of this type (where many bonds are cut) are certainly not recommended.

We had recourse to the NWCHEM [18] software package to implement our computations. Since KS and VEDA total energies do not necessarily have to be the same, we compare VEDA relative energies with KS relative energies. Relative energies are calculated for each compound and correspond to the energy difference between the ground state structure and a deformed conformation. Most deformations have been made along the partition region of the molecules, so the VEDA is tested under stringent conditions. For instance, in

the case of adenine the deformations consist in positioning N1 and N7 (see Ref. [17] for the nomenclature) above and below the molecular plane, respectively, by an amount of 0.24 Å, with similar deformations for the other model molecules studied here. Table III indicates that guanine shows the largest error, giving a deviation smaller than 2 kcal/mole. If we consider that the KS chemical accuracy lies within 1 kcal/mole, then all VEDA relative energies are sufficiently accurate because they are close enough to KS relative energies. Computational times have been included in Table III. For all nucleic acids the VEDA procedure is considerably faster. Computational times are not shown for fullerene because different numbers and types of processor were used to speed up the computations.

FIG. 3. The fragmented adenine molecule (also refer to Fig. 1) showing a particular scaling type and the factors (δ1 and δ2) used to scale the valence shells (in this case P molecular orbitals) of carbon and nitrogen atoms located in the cut region of equilibrated (Eq) and deformed (Def) adenine. The core shells (S molecular orbitals) of atoms with tags δi, i = 1,2, except those with tags 1 + δ1, were made zero. Each factor was constrained to reproduce the total charge of its fragment. Atoms in black and with labels H or C were considered as real atoms for the computation of the molecular orbital matrix of each fragment, but all their orbitals were made zero when building the matrix of the big molecule to avoid double atomic counting and to achieve a realistic density. Similar scaling types and factors were obtained for the other nucleic acid bases and fullerene.

TABLE III. Total and relative energies (in a.u.) of the normal and deformed nucleic acids adenine (A), thymine (T), guanine (G), cytosine (C), and hypoxanthine (H), as well as of fullerene. The energies were computed within the DFT Kohn-Sham scheme and the VEDA fragmentation method. Computational times (in sec) are also included.

Species    Normal molecule           Deformed molecule         Relative energies
           DFT-KS       VEDA         DFT-KS       VEDA         DFT-KS     VEDA
A          -461.1237    -461.0775    -461.1107    -461.0617    -0.0130    -0.0157
  Time     852          316          909          338
T          -447.9951    -447.9657    -447.9345    -447.9055    -0.0606    -0.0603
  Time     977          179          1662         209
G          -535.2867    -535.2657    -535.2744    -535.2507    -0.0123    -0.0150
  Time     1354         416          1594         448
C          -389.6179    -389.6797    -389.5970    -389.6572    -0.0209    -0.0225
  Time     692          243          750          226
H          -480.6841    -480.6367    -480.6734    -480.6234    -0.0107    -0.0133
  Time     1779         273          1076         286
C60        -2257.1534   -2257.1071   -2257.1380   -2257.0921   -0.0154    -0.0150

For larger molecules the VEDA method should eventually become even more competitive due to two main points: (a) the computational effort of present DFT methods is dominated by the SCF calculation and (b) the fragmentation method permits by its very nature calculations in parallel. For instance, different fragments of similar sizes may be assigned to different processors to keep a proper load balance. This parallelization greatly facilitates distributed data administration [19], since each fragment in the VEDA has been made independent of the others, thus avoiding constant communication and signal traffic bottlenecks among slave nodes. Of course, the evaluation of the final energy can be orchestrated by a master node by applying the parallel techniques discussed in the previous section for the evaluation of interaction integrals. In this regard, the VEDA is fully parallelizable.

The VEDA permits the calculation not only of relative energies, but also of components of the energy. Thus, in Table IV we show VEDA energy components for adenine and fullerene. They exhibit a similar behavior to the KS terms, but the one-electron energy is the component that deviates the most.

TABLE IV. One-electron (one), Coulombic (ee), exchange-correlation (xc), and nuclear repulsion (NN) contributions to the total energy (in a.u.) of normal and deformed adenine and fullerene (C60). Energies were computed within the DFT Kohn-Sham scheme and according to the VEDA fragmentation method.

Species    Normal molecule             Deformed molecule           Relative energies
           DFT-KS        VEDA          DFT-KS        VEDA          DFT-KS     VEDA
Adenine
  E_total  -461.1237     -461.0775     -461.1107     -461.0617     0.0130     0.0158
  E_one    -1583.9513    -1584.6474    -1581.5194    -1582.1862    2.4319     2.4612
  E_ee     703.9243      704.7001      702.7540      703.5021      1.1703     1.1980
  E_xc     -63.0292      -63.0627      -63.0076      -63.0400      0.0216     0.0227
  E_NN     481.9325      481.9325      480.6623      480.6623      1.2702     1.2702
C60
  E_total  -2257.1534    -2257.1071    -2257.1380    -2257.0921    0.0154     0.0150
  E_one    -19729.5676   -19732.1502   -19710.9238   -19713.2859   18.6438    18.8643
  E_ee     9434.2851     9437.1958     9424.9713     9427.6618     9.3138     9.5340
  E_xc     -324.5457     -324.8274     -324.4895     -324.7721     0.0562     0.0553
  E_NN     8362.6748     8362.6748     8353.3041     8353.3041     9.3707     9.3707

There exist several alternatives to improve the VEDA energies. Perhaps the most direct form is to orthogonalize all molecular orbitals of fragments among themselves to achieve an optimum set of molecular orbitals. However, in the case of a fragmentation method the orthogonalization task becomes difficult due to the different dimensions (sizes) of the molecular orbitals that belong to different fragments, so no further orthogonalization is recommended. In this regard, it is preferred to improve the electron density by introducing a better scaling (for example, by paying special attention to directional effects, i.e., those directions where polarization of the electronic charge is strongest) or by incorporating more atomic neighbors in the calculations. Also, the possibility of working with additional fragments that overlap the fragments of the big molecule should not be discarded as a way to compute better scaling factors. In Fig. 4 we compare a VEDA isoelectronic density surface with the corresponding KS-DFT one. Variations are observed there, which translate into different energies. These variations indicate room for improvement in the virtual electron density.

FIG. 4. For the case of adenine, VEDA isoelectronic density lines are compared with the corresponding KS-DFT lines. The lines are shown for isodensity values of 0.3, 0.2, 0.1, and 0.05 a.u.^-3. Small variations are observed with the VEDA at the center of the five-membered ring and in a localized region along the bonds that were cut (refer to Figs. 1 and 3). Such variations are under control in the VEDA and can be minimized by introducing a better scaling type or even more scaling factors.

In Table V we show energies for the Watson-Crick complexes AT and GC. We have performed partitions across hydrogen bridges in such a way that each nucleic acid was considered a single fragment (see Fig. 1). In this last case no scale factors were used at all since the polarization produced by H bridges is small. In the case of the KS-DFT energy no fragmentation was done at all. Relative energies are very close between the two methods for AT and GC. For instance, for the AT pair, while KS-DFT gives -0.1058 a.u. the VEDA procedure predicts -0.1079 a.u., resulting in a deviation of less than 2 kcal/mole. This result is encouraging given the large number of H bridges in organic compounds like proteins and DNA/RNA chains, because, in principle, we do not deal with scale factors for hydrogen bridges.

TABLE V. Total and relative energies (in a.u.) for the Watson-Crick complexes computed according to the self-consistent field DFT Kohn-Sham scheme and the VEDA fragmentation method. AT and GC denote adenine-thymine and guanine-cytosine, respectively, while dAT and dGC represent deformed AT and GC molecules.

Species    E(DFT-KS)    E(VEDA)
AT         -909.1449    -909.1402
dAT        -909.0391    -909.0323
E_rel      -0.1058      -0.1079
GC         -924.9134    -924.8908
dGC        -924.8930    -924.8711
E_rel      -0.0204      -0.0197

Although in this work VEDA has been illustrated with the use of STO-3G basis sets, it is possible to work with bigger sets, with the difference that a larger number of molecular orbital coefficients will be involved. We also note the possibility of combining small and big basis sets in one system. In Fig. 5 we simulate a large molecule partitioned into several fragments. The fragments are connected through overlap (connectivity) atoms. While for connectivity atoms we can use STO-3G basis sets in the way we have shown here, bigger basis sets (like 6-31G*, DZVP, or even TZVP) can be used for the other atoms. This procedure in principle permits one to analyze large organic compounds, or simulate environments, like water, where nearby molecules can be explicitly taken into consideration while the remaining bulk is implicitly simulated by electrostatic fields. In order to speed up computations it is still possible to freeze selected molecular coefficients, and just work with those that are of particular interest (remember that in the VEDA it is possible to identify the molecular coefficients assigned to a given fragment). Finally, possible extensions of the VEDA to other schemes, like Hartree-Fock based methods, are encouraged.

FIG. 5. The application of the VEDA method to a large molecule. We simulate a big molecule that has been partitioned into several fragments of about the same size. The fragments are affected by their partners through the use of overlap atoms indicated by connectivity symbols (hinges). The STO-3G basis sets are used for the overlap atoms, while bigger sets (like 3-21G) are used for the remaining atoms. The chemically important catalytic pocket of the molecule in this representation (white color) makes use, for instance, of 6-31G* basis sets.

VI. CONCLUSIONS

In this work we have proposed a method (VEDA) based on the fragmentation of a molecule and the construction of an electron density from fragments that essentially does the work of an SCF electron density. The VEDA method was used within the KS-DFT framework, showing the possibility of studying large compounds without the need to reconstruct every atomic orbital from scratch [20]. Instead, the method takes advantage of smaller calculations on fragments to propose a matrix of molecular coefficients and then proceeds to the evaluation of the energy as usual. In this context, molecular partitions are important and they can be saved (possibly in a data bank) for later use, which makes the method more efficient. In VEDA all atoms are treated at the quantum level; however, the method can also be combined with other schemes, like molecular mechanics, to build hybrid approaches. VEDA loses some accuracy in the fragmentation process, especially at the joins between fragments, but this loss is small and can be further reduced. To our understanding, this is the first time that DFT energies have been reproduced for the nucleic acids and fullerene within a small error margin, and without having recourse to the SCF process. It is our hope that this research will permit computations for large molecules beyond present methods. Finally, by proposing a method that is mainly based on the electron density, we try to go back to the fundamentals of DFT, where the electron density is supposed to play the central role in defining the molecular properties.
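As a rough check of the error margin for the Watson-Crick pairs, the relative energies of Table V can be recomputed from the tabulated total energies. The Python sketch below assumes that E_rel is the energy of the complex minus that of its deformed counterpart, which is what the tabulated numbers imply, and it uses the standard conversion 1 a.u. = 627.5095 kcal/mol, a value not quoted in the text. The function and variable names are illustrative only.

```python
# Standard hartree-to-kcal/mol conversion factor (assumption: not quoted in the paper).
HARTREE_TO_KCAL_PER_MOL = 627.5095

# Total energies in a.u., copied from Table V.
TABLE_V = {
    "DFT-KS": {"AT": -909.1449, "dAT": -909.0391, "GC": -924.9134, "dGC": -924.8930},
    "VEDA": {"AT": -909.1402, "dAT": -909.0323, "GC": -924.8908, "dGC": -924.8711},
}


def relative_energy(method, pair):
    """Return E_rel = E(pair) - E(deformed pair) in a.u. (definition inferred from Table V)."""
    energies = TABLE_V[method]
    return energies[pair] - energies["d" + pair]


def deviation_kcal_per_mol(pair):
    """Return the absolute VEDA vs KS-DFT difference in E_rel, in kcal/mol."""
    diff = relative_energy("VEDA", pair) - relative_energy("DFT-KS", pair)
    return abs(diff) * HARTREE_TO_KCAL_PER_MOL


if __name__ == "__main__":
    for pair in ("AT", "GC"):
        print(
            pair,
            round(relative_energy("DFT-KS", pair), 4),
            round(relative_energy("VEDA", pair), 4),
            round(deviation_kcal_per_mol(pair), 2),
        )
```

Under these assumptions the AT deviation comes out near 1.3 kcal/mol and the GC deviation near 0.4 kcal/mol, consistent with the bound of 2 kcal/mol quoted in the discussion of Table V.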

ACKNOWLEDGMENTS

The authors acknowledge the Pacific Northwest National Laboratory for use of NWCHEM, DGSCA for their supercomputing facilities, and the Mexican Institute of Petroleum for financial support.

[1] Modern Electronic Structure Theory, edited by D.R. Yarkony (World Scientific, London, 1995), Vols. 1 and 2.
[2] P. Hohenberg and W. Kohn, Phys. Rev. 136, B864 (1964); W. Kohn and L.J. Sham, Phys. Rev. 140, A1133 (1965).
[3] Density Functional Methods in Chemistry, edited by J.K. Labanowski and J.W. Andzelm (Springer, New York, 1991).
[4] R. Parr and W. Yang, Density Functional Theory of Atoms and Molecules, Vol. 16 of International Series of Monographs on Chemistry (Oxford University Press, Oxford, 1989).
[5] G. Monard and K.M. Merz, Jr., Acc. Chem. Res. 32, 904 (1999).
[6] K.P. Eurenius, D.C. Chatfield, B.R. Brooks, and M. Hodoscek, Int. J. Quantum Chem. 17, 87 (1996); X. Assfeld and J.L. Rivail, Chem. Phys. Lett. 263, 100 (1996); I. Antes and W. Thiel, J. Phys. Chem. A 103, 9290 (1999).
[7] I.G. Kaplan, Theory of Molecular Interactions (Elsevier, Amsterdam, 1986).
[8] M.M. Hurley, L. Fernandez-Pacios, P.A. Christiansen, R.B. Ross, and W.C. Ermler, J. Chem. Phys. 84, 6840 (1986).
[9] L. Greengard and V. Rokhlin, J. Comput. Phys. 60, 187 (1985); C.A. White, B.G. Johnson, P.M.W. Gill, and M. Head-Gordon, Chem. Phys. Lett. 230, 8 (1994); J.M. Pérez-Jordá and W. Yang, J. Chem. Phys. 107, 1218 (1997).
[10] R.E. Stratmann, G.E. Scuseria, and M.J. Frisch, Chem. Phys. Lett. 257, 213 (1996).
[11] E.J. Baerends, D.E. Ellis, and P. Ros, Chem. Phys. 2, 41 (1973); H. Sambe and R.H. Felton, J. Chem. Phys. 62, 1122 (1975).
[12] I.N. Levine, Quantum Chemistry, 4th ed. (Prentice-Hall, Englewood Cliffs, NJ, 1991).
[13] B.V. Cheney and R.E. Christoffersen, J. Chem. Phys. 56, 3503 (1972); W. Yang, Phys. Rev. A 44, 7823 (1991).
[14] W.G. Richards and D.L. Cooper, Ab Initio Molecular Orbital Calculations for Chemists, 2nd ed. (Clarendon Press, Oxford, 1985).
[15] C.W. Murray, N.C. Handy, and G.J. Laming, Mol. Phys. 78, 997 (1993).
[16] R.F.W. Bader and P.M. Beddall, J. Chem. Phys. 56, 3320 (1972).
[17] R. Santamaria, E. Charro, A. Zacarias, and M. Castro, J. Comput. Chem. 20, 511 (1999).
[18] J. Anchell, E. Apra, D. Bernholdt, P. Borowski, T. Clark, D. Clerc, H. Dachsel, M. Deegan, M. Dupuis, K. Dyall, G. Fann, H. Fruchtl, M. Gutowski, R. Harrison, A. Hess, J. Jaffe, R. Kendall, R. Kobayashi, R. Kutteh, Z. Lin, R. Littlefield, X. Long, B. Meng, J. Nichols, J. Nieplocha, A. Rendall, M. Stave, T. Straatsma, H. Taylor, G. Thomas, K. Wolinski, and A. Wong, computer code NWCHEM, Version 3.2.1, Pacific Northwest National Laboratory, Richland, WA 99352-0999, 1998.
[19] M.F. Guest, E. Apra, D.E. Bernholdt, H.A. Fruchtl, R.J. Harrison, R.A. Kendall, R.A. Kutteh, X. Long, J.B. Nicholas, J.A. Nichols, H.L. Taylor, A.T. Wong, G.I. Fann, R.J. Littlefield, and J. Nieplocha, in Applied Parallel Computing. Computations in Physics, Chemistry, and Engineering Science, edited by J. Wasnieski, J. Dongarra, and K. Madsen, Vol. 1041 of Lecture Notes in Computer Science (Springer-Verlag, Berlin, 1996).
[20] J.A. Mondragón-Sánchez, M.Sc. thesis, Instituto de Fisica, UNAM, 2001.
