Optimized density functionals from the extended G2 ...

2 downloads 0 Views 154KB Size Report
Jun 15, 1998 - Optimized density functionals from the extended G2 test set. Hartmut L. ..... Periodic Table as additional constraints. In general, heats of ...
JOURNAL OF CHEMICAL PHYSICS

VOLUME 108, NUMBER 23

15 JUNE 1998

Optimized density functionals from the extended G2 test set Hartmut L. Schmider and Axel D. Becke Department of Chemistry, Queen’s University, Kingston, Ontario, Canada K7L 3N6

~Received 21 January 1998; accepted 10 March 1998! A recently suggested procedure for the systematic optimization of gradient-corrected exchange-correlation functionals @A. D. Becke, J. Chem. Phys. 107, 8554 ~1997!# has been applied to the extended G2 test set @L. A. Curtiss et al., J. Chem. Phys. 106, 1063 ~1997!#, which consists of the standard heats of formation of 148 molecules. The limit of reproduction of the experimental data in this test set is found to be 1.78 kcal/mol mean absolute error, with a maximum of 8.89 kcal/mol error for the ozone molecule. This compares rather well with previous results for G2 theory itself ~1.58 and 8.2 kcal/mol, respectively!. We show that fair stability can be obtained by our optimization procedure. © 1998 American Institute of Physics. @S0021-9606~98!00623-0#

I. INTRODUCTION

E X5

In the course of the last decade, there has been substantial progress in the refinement of energy density functionals. As a consequence, density functional theory ~DFT! has become a viable alternative to conventional quantum mechanical techniques for the reproduction and prediction of spectroscopic and thermochemical data ~e.g. Refs. 1–6!. Early work in the field was dominated by the local ~spin! density approximation ~LSDA!; however, the severe overbinding of this method makes corrections depending on density derivatives absolutely necessary.2,3 If only the first-order gradient is employed, the resulting functionals are known as ‘‘generalized gradient approximations’’ ~GGAs!. A great variety of such functionals is available today, and their accuracy has been tested and proven time and again. However, the exact form of the exchange-correlation functional E XC@ r # remains elusive. Another step in the direction of greater accuracy was the introduction of hybrid methods,4,5 which use a small proportion of ‘‘exact exchange’’ on the basis of the so-called adiabatic connection formula. Functionals involving the Laplacian of the density have also entered the DFT instrumentarium ~e.g. Refs. 7, 8!, and techniques to obtain ‘‘exact’’ potentials from highly accurate molecular densities are now employed to gain further insights into the nature of E XC ~e.g. Ref. 9!. The search for better approximations continues, as density functional methods become—at least for some applications—compatible in accuracy with rather evolved post-Hartree-Fock methods. In a recent paper,6 the last of a series,2–6 a systematic procedure to optimize approximate exchange-correlation functionals at the ~hybrid-! GGA level was suggested and applied to benchmark systems. Here, we briefly review the method, and assess it further. It is based on a remapping of the dimensionless reduced gradient variable s s5

u “ r su

r s4/3

,

E Cab 5 E Css 5

e Xs ~ r s ! g Xs ~ s s2 ! dr,

~2!

e Cab ~ r a , r b ! g Cab ~ s 2av.! dr,

~3!

e Css ~ r s ! g Css ~ s s2 ! dr,

~4!

E C5E Cab 1

(s E Css .

~5!

The e( r ) in Eqs. ~2!–~4! are local volume energy densities of a uniform electron gas, and the g(s 2 ) denote ‘‘gradient correction factors’’ which are to be determined. The actual e( r ) take different forms for each of exchange, like-spin, and unlike-spin correlation ~see Ref. 6 for details!. We make the simplifying assumption that the unlike-spin correction factor for correlation, g Cab ~Eq. ~3!! depends only on an average gradient variable s 2av.5(s a2 1s b2 )/2. The total correlation energy is just the sum of the like- and unlike-spin contributions ~Eq. ~5!!. To determine each optimum correction factor g(s 2 ), it is expanded in a power series in the remapped variable u(s 2 ): m

g ~ s 2 ! 5g ~ u ! 5

( c iu i~ s 2 ! ,

i50

~6!

where we have omitted the subscripts for brevity. Each correction factor g(u) is treated separately, and each mapping of s 2 to u is specific: u Xs ~ s s2 ! 5

g Xs s s2 , 11 g Xs s s2

u Cab ~ s 2av.! 5

~1!

u Css ~ s s2 ! 5

where s denotes spin a or b, to a new finite variable Exchange ~subscript X! and correlation ~subscript C! functionals are written in the form: u(s s2 ).

0021-9606/98/108(23)/9624/8/$15.00

E E E

(s

~7!

g Cab s 2av. , 11 g Cab s 2av.

g Css s s2 . 11 g Css s s2

~8! ~9!

This functional form of the remapped finite variables u(s 2 ) is derived from simple approximate expressions for gradient 9624

© 1998 American Institute of Physics

Downloaded 17 Jan 2011 to 140.123.79.51. Redistribution subject to AIP license or copyright; see http://jcp.aip.org/about/rights_and_permissions

J. Chem. Phys., Vol. 108, No. 23, 15 June 1998

H. L. Schmider and A. D. Becke

9625

TABLE I. Least-squares residuals D, root-mean-squares deviations, and GoFs for fits to 233 thermochemical data. Order

No. par.

D (kcal2/mol2)

RMS ~kcal/mol!

GoF

0 1 2 3 4 5 6

4 7 10 13 16 19 22

21160 2313 1847 1820 1774 1723 1686

9.53 3.15 2.82 2.79 2.76 2.72 2.69

9.6 3.2 2.9 2.9 2.9 2.8 2.8

Tables II and III! obtained in the course of this work. From the obvious similarity of the overall shape of u(s) and g(s), we see how the mapping of s onto u justifies the power series Eq. ~6! of the functional relationship between g and the independent variable u. Obviously, it is beneficial to choose the value of g such that this shape similarity is observed. A linear exact-exchange mixing parameter c exact may also be introduced. The full energy expression for E XC is then

FIG. 1. Remapping of the reduced gradient s5 u “ r u / r 4/3 onto the finite independent variable u5 g s 2 /(11 g s 2 ). The graph shows both a typical exchange correction factor g(s) and the mapped variable u(s) as a function of s. The example chosen here is the exchange correction for fit 2c in Table II, where g 50.004.

E XC5E X1E C1c exactE exact , X

~10!

making this a ‘‘hybrid’’ scheme. E X and E C are given by Eqs. ~2! and ~5!, respectively, but E X in Eq. ~10! does not now represent the full exchange energy of the system. If the series expansions Eq. ~6! for all three contributions ~exchange, ab- and ss-correlation! are truncated uniformly at order m, we have p53(m11)11 linear parameters to be determined. Recently,11 Curtiss and co-workers have published a compilation of 148 experimentally determined standard heats of formation that were used to gauge the ab initio procedure known as G2 theory.12 These test data are the extension of an earlier compilation12,13 that was centered on zerotemperature atomization energies. The authors also compared commonly used DFT methods with their scheme. They concluded that the performance of standard DFT methods is

corrections.6,10 The parameters g need to be chosen for each contribution ~exchange or correlation, like or unlike spin! independently. Since the g’s are nonlinear parameters, and hence fitting to extensive molecular data is inconvenient, atom-optimized values are chosen instead and kept fixed ~see Ref. 6 for more details; the values used in this work are g Xs 50.004, g Cab 50.006 and g Css 50.2!. In contrast, the expansion coefficients c i are linear parameters and may be determined from experimental thermochemical data directly by means of straightforward linear least-squares fitting. To illustrate the mapping from the reduced gradient s to the finite variable u, we show u(s) for exchange ~Eq. ~7!! in Fig. 1 ~full curve!. The dotted curve shows a typical exchange correction factor g Xs (s s ) for one of the fits ~2c in

TABLE II. Residual mean absolute errors in several properties for a series of 10-parameter linear least-squares fitted functionals. The fit sets are: 1a556 HoFs ~298 K! ~original G2 set!; 1b51a142 ionization pot. 125 electron aff. 18 proton aff.; 1c51b110 tot. energies; 2a5148 HoFs ~298 K! ~extended G2 set!; 2b and 2c like 1b and 1c, respectively. The values in brackets are the largest absolute errors. Property

1a

56 HoFs ~298 K, G2-1! @kcal/mol# 148 HoFs ~298 K, G2-2! @kcal/mol# 42 Ionization Potentials @eV# 25 Electron Affinities @eV# 8 Proton Affinities @kcal/mol# 10 Total Energies @mhartree#

1.50 ~4.69! 2.15 ~12.85! 0.20 ~0.59! 0.24 ~0.58! 1.1 ~3.2! 49 ~180! 2a

56 HoFs ~298 K, G2-1! @kcal/mol# 148 HoFs ~298 K, G2-2! @kcal/mol# 42 Ionization Potentials @eV# 25 Electron Affinities @eV# 8 Proton Affinities @kcal/mol# 10 Total Energies @mhartree#

1.72 ~6.59! 1.78 ~8.89! 0.58 ~1.26! 0.49 ~0.79! 1.7 ~4.7! 90 ~135!

1b 1.77 ~7.10! 2.60 ~14.51! 0.10 ~0.54! 0.08 ~0.26! 1.1 ~4.3! 155 ~379! 2b 1.76 ~5.03! 1.88 ~8.35! 0.12 ~0.65! 0.09 ~0.33! 1.2 ~3.5! 71 ~163!

1c 1.70 ~5.68! 2.15 ~12.84! 0.12 ~0.59! 0.09 ~0.33! 1.2 ~4.9! 3 ~9! 2c 1.75 ~5.89! 1.89 ~9.11! 0.12 ~0.66! 0.09 ~0.34! 1.1 ~4.3! 4 ~7!

Downloaded 17 Jan 2011 to 140.123.79.51. Redistribution subject to AIP license or copyright; see http://jcp.aip.org/about/rights_and_permissions

9626

J. Chem. Phys., Vol. 108, No. 23, 15 June 1998

FIG. 2. Exchange correction factors g Xs (u Xs ) as a function of the mapped variable u Xs 50.004s 2s /(110.004s 2s ) ~also see Fig. 1!. The different curves correspond to the fits in Table II. Note that due to the small size of g, values of u Xs close to 1 are never reached.

considerably poorer than G2, showing wider distributions of error, and that hybrid schemes perform better than ‘‘pure’’ Kohn-Sham DFT methods. In this paper, we will use the new G2 test set to assess the optimized DFT of Ref. 6, and to determine the limit of precision possible in this GGA framework by fitting the gradient corrections directly to the full set of available thermochemical information. II. RESULTS AND DISCUSSION

We have used as many data points for the determination of an optimal gradient corrected density functional as possible. We have combined the 148 heats of formation ~HoFs, at standard temperature! given in Ref. 11 with previously published experimental values for 42 ionization potentials ~IPs!, 25 electron affinities ~EAs!,14 8 proton affinities ~PAs, see Ref. 13!, and 10 total atomic energies ~H to Ne, TEs!,6 arriving at a total of 233 thermochemical data as a test set.

FIG. 3. Unlike-spin correlation correction factors g Cab (u Cab ) as a function of the mapped variable u Cab 50.003(s 2a 1s 2b )/(110.003(s 2a 1s 2b )). The different curves correspond to the fits in Table II. Note that due to the small size of g, values of u Cab close to 1 are never reached.

H. L. Schmider and A. D. Becke

FIG. 4. Like-spin correlation correction factors g Css (u Css ) as a function of the mapped variable u Css 50.2s 2s /(110.2s 2s ). The different curves correspond to the fits in Table II.

The required electronic structure calculations were performed by the fully numerical DFT program NUMOL,15–18 employing grids with 40 radial points per principal atomic ‘‘shell’’ on each center, and 302 angular points at each radius ~194 for hydrogen and helium!. We believe this grid to be reasonably saturated, on the basis of comparisons with results obtained on the smaller grids of our previous work2–6 ~i.e., 20 radial points per atomic shell times 194, or 110 for H and He!. The overall statistics of our fits differ insignificantly between the two meshes. Computations were performed ‘‘post-LSDA,’’ namely, the densities computed are LSDA densities, and all gradient-corrected energies are derived from these. Atomic cores were frozen, with the exception of the alkali atoms when compared with their cations. The latter TABLE III. Linear expansion coefficients for the 10-parameter fits in Table II. The coefficients are grouped according to exchange, unlike-spin and like-spin correlation, and exact exchange parameter. Coefficient

1a

1b

1c

c X,0 c X,1 c X,2 c Cab ,0 c Cab ,1 c Cab ,2 c Css ,0 c Css ,1 c Css ,2 c exact

0.845975 0.228183 0.749949 0.975483 0.398379 23.73540 20.817637 20.054676 0.592163 0.229015

0.800103 20.084192 1.47742 0.977621 0.931199 24.76973 1.44946 22.37073 2.13564 0.199352

0.810936 0.496090 0.772385 0.939269 0.898121 24.91276 0.262077 2.12576 22.30465 0.192416

2a

2b

2c

0.749200 0.402322 0.620779 0.964641 0.050527 23.01966 1.26686 1.67146 21.22565 0.232055

0.770587 0.180767 0.955246 0.965362 0.863300 24.61778 0.170473 1.24051 20.862711 0.237978

0.790194 0.400271 0.832857 0.934715 1.14105 25.33398 20.120163 2.82332 22.59412 0.219847

c X,0 c X,1 c X,2 c Cab ,0 c Cab ,1 c Cab ,2 c Css ,0 c Css ,1 c Css ,2 c exact

Downloaded 17 Jan 2011 to 140.123.79.51. Redistribution subject to AIP license or copyright; see http://jcp.aip.org/about/rights_and_permissions

J. Chem. Phys., Vol. 108, No. 23, 15 June 1998

H. L. Schmider and A. D. Becke

TABLE IV. Heats of formation of the 148 molecules in the G2 extended test set ~Ref. 11! in kcal/mol. The column labeled ‘‘present work’’ gives the values obtained with the parameters derived from a fit to the full set of 233 thermochemical data ~fit 2c in Tables II and III!. Mol. No.

System

Present work

Expt.

Dev.

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56

H2 LiH BeH CH 3 CH2 1 CH2 CH3 CH4 3 NH NH2 NH3 OH H2O HF 1 SiH2 3 SiH2 SiH3 SiH4 PH2 PH3 H2S HCl Li2 LiF C2H2 C2H4 C2H6 CN HCN CO HCO H2CO CH3OH N2 N2H4 NO 3 O2 H2O2 F2 CO2 Na2 3 Si2 P2 3 S2 Cl2 NaCl SiO CS 3 SO ClO ClF Si2H6 CH3Cl CH3SH HOCl SO2

1.71 34.72 77.87 142.24 93.42 103.78 34.22 217.72 83.43 42.18 211.59 8.56 256.14 263.71 63.87 84.46 45.93 7.32 29.70 0.50 24.80 221.70 53.74 278.31 56.84 12.40 219.81 108.65 33.18 224.25 7.98 226.18 247.60 4.03 19.74 19.91 23.43 231.17 1.12 295.63 28.07 137.16 35.41 26.02 21.69 243.37 221.36 69.08 21.49 21.88 214.01 16.90 220.07 25.72 218.25 266.73

0.00 33.30 81.70 142.50 93.70 102.75 35.00 217.90 85.20 45.10 210.97 9.40 257.80 265.14 65.20 86.20 47.90 8.20 33.10 1.30 24.90 222.06 51.60 280.10 54.19 12.54 220.08 104.90 31.50 226.42 10.00 225.96 248.00 0.00 22.79 21.58 0.00 232.53 0.00 294.05 33.96 139.87 34.31 30.74 0.00 243.56 224.64 66.90 1.20 24.19 213.24 19.10 219.56 25.50 217.80 270.95

1.71 1.42 23.83 20.26 20.28 1.03 20.78 0.18 21.77 22.92 20.62 20.84 1.66 1.43 21.33 21.74 21.97 20.88 23.40 20.80 0.10 0.36 2.14 1.79 2.65 20.14 0.27 3.75 1.68 2.17 22.02 20.22 0.40 4.03 23.05 21.67 23.43 1.36 1.12 21.58 25.89 22.71 1.10 24.72 21.69 0.19 3.28 2.18 22.69 22.31 20.77 22.20 20.51 20.22 20.45 4.22

57 58 59 60 61 62

BF3 BCl3 AlF3 AlCl3 CF4 CCl4

2269.61 297.73 2282.03 2140.03 2223.81 220.85

2271.41 296.30 2289.03 2139.72 2223.04 222.94

1.80 21.43 7.00 20.31 20.77 2.09

9627

TABLE IV. ~Continued.! Mol. No.

System

Present work

63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127

OCS CS2 COF2 SiF4 SiCl4 N2O ClNO NF3 PF3 O3 F2O ClF3 C2F4 C2Cl4 CF3CN CH3CCH ~propyne! CH2CCH2 ~allene! C3H4 ~cyclopropene! CH3CHCH2 ~propylene! (CH2) 3 ~cyclopropane! C3H8 ~propane! CH2CHCHCH2 ~butadiene! CH3CCCH3 ~2-butyne! C4H6 ~methylene cyclopropane! C4H6 ~bicyclo-butane! C4H6 ~cyclo-butene! (CH2) 4 ~cyclo-butane! ~CH3!2CCH2 ~iso-butene! C4H10 ~trans-butane! (CH3) 3 CH ~iso-butane! ~C2H4!2C ~spiro-pentane! C6H6 ~benzene! CH2F2 CHF3 CH2Cl2 CHCl3 CH3NH2 CH3CN CH3NO2 CH3ONO CH3SiH3 HCOOH HCOOCH3 CH3CONH2 C2H4NH ~aziridine! NCCN ~CH3!2NH C2H5NH2 H2CCO C2H4O ~oxirane! CH3CHO (CHO) 2 ~glyoxal! C2H5OH CH3OCH3 C2H4S ~thioxirane! ~CH3!2SO C2H5SH ~CH3!2S CH2CHF C2H5Cl CH2CHCl CH2CHCN ~CH3!2CO CH3COOH CH3COF

236.95 23.25 2146.71 2376.87 2157.18 19.15 9.84 235.89 2226.41 41.38 4.98 243.18 2164.53 23.44 2116.80 45.64 42.91 68.95 5.46 14.01 224.30 25.80 36.19 46.20 55.75 39.72 7.25 21.79 229.16 228.12 46.17 16.67 2108.33 2166.51 222.93 223.32 26.52 18.60 218.92 216.90 26.42 290.94 285.00 259.28 29.50 75.52 24.34 212.90 214.84 212.09 239.90 250.97 254.43 242.44 18.93 234.12 211.01 29.61 235.25 227.30 4.46 44.99 252.21 2102.38 2105.42

Expt. 233.08 27.95 2152.70 2385.98 2158.40 19.61 12.36 231.57 2229.07 34.10 5.86 237.97 2157.40 22.97 2118.40 44.20 45.50 66.20 4.78 12.70 225.00 26.30 34.80 47.90 51.90 37.40 6.80 24.00 230.00 232.07 44.30 19.74 2107.71 2166.60 222.83 224.66 25.50 18.00 217.80 215.90 27.00 290.50 285.00 257.00 30.20 73.30 24.40 211.30 211.35 212.57 239.70 250.70 256.21 244.00 19.60 236.20 211.10 28.90 233.20 226.80 8.90 43.20 251.93 2103.40 2105.70

Dev. 23.87 24.70 5.99 9.11 1.22 20.46 22.52 24.32 2.66 7.28 20.88 25.21 27.13 20.47 1.60 1.44 22.59 2.75 0.68 1.31 0.70 20.50 1.39 21.70 3.85 2.32 0.45 2.21 0.84 3.95 1.87 23.07 20.62 0.09 20.10 1.34 21.02 0.60 21.12 21.00 0.58 20.44 20.00 22.28 20.70 2.22 0.06 21.60 23.49 0.48 20.20 20.27 1.78 1.56 20.67 2.08 0.09 20.71 22.05 20.50 24.44 1.79 20.28 1.02 0.28

Downloaded 17 Jan 2011 to 140.123.79.51. Redistribution subject to AIP license or copyright; see http://jcp.aip.org/about/rights_and_permissions

9628

J. Chem. Phys., Vol. 108, No. 23, 15 June 1998

H. L. Schmider and A. D. Becke

TABLE IV. ~Continued.! Mol. No. 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148

System

Present work

Expt.

Dev.

CH3COCl C3H7Cl ~CH3!2CHOH C2H5OCH3 ~CH3!3N C4H4O ~furane! C4H4S ~thiophene! C4H4NH ~pyrole! C5H5N ~pyridine! HS CCH CHCH2 CH3CO H2COH CH3O C2H5O CH3S C2H5 ~CH3!2CH ~CH3!3C NO2

258.97 230.37 261.25 252.38 24.87 28.06 26.02 23.44 27.52 33.11 139.23 69.30 24.87 25.67 2.81 20.10 27.28 27.83 21.03 11.03 3.55

258.00 231.52 265.20 251.70 25.70 28.30 27.50 25.90 33.60 34.18 135.10 71.60 22.40 24.08 4.10 23.70 29.80 28.90 21.50 12.30 7.91

20.97 1.15 3.95 20.68 0.83 0.24 21.48 22.46 26.08 21.07 4.13 22.30 22.47 21.59 21.29 3.60 22.52 21.07 20.47 21.27 24.36

cases also served as a test for the validity of the frozen-core approximation ~which they successfully passed!. We use the same MP2 molecular geometries as in G2 theory itself, and scaled Hartree-Fock vibrational frequencies for zero-point and finite-temperature corrections.11 In determining the coefficients of a truncated series expansion such as Eq. ~6!, we need to determine the point of ‘‘diminishing returns’’ for the optimum order m, beyond which the inclusion of further terms is no longer justified. The ‘‘goodness of fit’’ ~GoF, see e.g. Ref. 19!, is an index in the least-squares interpretation of experimental results that measures the quality of a given fit. It takes the number p of fit parameters, and the number n of data points into account, as well as the least-squares residual D: GoF5 AD/ ~ n2p ! , n

D5

~11!

( ~ x mod,i 2x obs,i ! 2 / s 2i .

i51

If the weights 1/s 2i are related to the actual experimental statistical error, the GoF should be close to 1 for an appropriate model. In our case, no such statement can be made since we use unit weights throughout. However, it still stands to reason that the optimum number of parameters has been reached when an increase does not yield a lower GoF. In Table I, the least-squares residuals, root-mean-square deviations and GoFs for fits of the power series ~Eq. ~6!! are listed for various truncation orders m. It may be seen at a glance that the first nonconstant term ~i.e. m51! yields a dramatic improvement in the overall fit, but that after the 2nd-order term (m.2), the value of the GoF levels out and changes insignificantly. This means that the lowering of the RMS deviation is offset by an increase in the number of parameters. We therefore deem that truncation at order m52 is optimal, leaving us with 10 parameters: the expansion co-

efficients c 0 , c 1 and c 2 for each of exchange, like-spin and unlike-spin correlation functionals, and the mixing parameter c exact for the exact exchange. This is consistent with our earlier experience,6 based on visual inspection of the correction functions g(u) for physical sensibility. We performed a variety of fits based on subsets of the available data, to test the effect of parameter determination from one type of information on the reproduction of others. Table II lists mean absolute deviations and maximum errors for six 10-parameter fits ~order m52 plus an exact-exchange term!. Fits of type 1 include the original G2 test set of 56 molecules20 in the form of heats of formation at standard temperature. Fits of type 2 use the extended set of HoFs. These sets were either used alone ~1a,2a!, or augmented by ionization potentials, electron and proton affinities ~1b,2b!. 1c and 2c use total energies for the first ten atoms of the Periodic Table as additional constraints. In general, heats of formation and proton affinities are reproduced rather well, the latter even if not included in the fitting set. On the other hand, electron affinities and ionization potentials are somewhat more problematic if not included in the fit set. It is encouraging that for each fit of the 1-series, the degradation of quality from the small G2 set ~which is used in the fit! to the extended set is not bad. For fit 1c, less than 0.5 kcal/mol deterioration is observed in the average, and the quality of that fit for the whole 148-HoF set is still of the order of 2 kcal/mol mean absolute error. Note that for the heats of formation, the smallest mean error in Table II ~1.78 kcal/mol for fit 2a! is only 0.2 kcal/ mol above that of G2 theory as reported in Ref. 11. It is better than the modified versions of G2 ~Refs. 21–23! tested in that work, and it is a great improvement over standard hybrid methods ~B3LYP, see Ref. 11!, which were found to yield above 3 kcal/mol error on average. Certainly, these are encouraging results. As can be seen from the last entry for each fit, the reproduction of total energies is generally poor if they are not included in the ‘‘training set.’’ If they are, however ~fits 1c and 2c!, their inclusion does not appreciably degrade the reproduction of the other data. In fact, some improve slightly ~HoFs, proton affinities!. We deem it beneficial to include total energies, since they effect a general stabilization of the fits, in the sense that the quality becomes more ‘‘even.’’ The question arises: how strongly do differences in the fit data affect the functions g(u) which are the central objects in these fits? To investigate this, we have plotted the g(u) for exchange ~Fig. 2!, unlike-spin correlation ~Fig. 3! and like-spin correlation ~Fig. 4! for all the fits listed in Table II. While for both the exchange and the unlike-spin ~ab! correlation, the overall shape remains stable for all the fits, the like-spin correlation quantity g Css is excessively sensitive towards the data that were used in the training set. This is also apparent from the values of the expansion coefficients c i themselves. We give them in Table III for the 6 fits listed in Table II. The zero-order value c 0 for each exchange and correlation correction determines the overall position of the g(u) curves, and also the value of the energy functional in the uniform electron gas limit. It can be seen from Table III that, throughout the fits, it varies only be-

Downloaded 17 Jan 2011 to 140.123.79.51. Redistribution subject to AIP license or copyright; see http://jcp.aip.org/about/rights_and_permissions

J. Chem. Phys., Vol. 108, No. 23, 15 June 1998

tween 0.75 and 0.85 for exchange, and between 0.93 and 0.98 for unlike-spin correlation. However, for the like-spin correlation, it varies considerably, in both magnitude and sign. We can only conclude that the like-spin correlation is not well determined by the present data. This is perhaps not surprising, since ss-correlation is by far the smallest part of the exchange-correlation energy. It is also the part most strongly affected by self-correlation error, making it the most difficult to describe appropriately. Note, however, that fits 1c and 2c ~finely dashed and dash-dotted curves, respectively! do show similar behavior, especially for values of u Css larger than 0.6. This argues strongly for the inclusion of total energies in the training set. We believe that the convex behavior of fits 1a and 1b ~full and dashed curves! is unphysical, and that therefore heats of formation alone do not suffice to determine like-spin correlation functionals. From Table III it can also be inferred that the values of the exact-exchange parameter c exact are quite stable ~about 0.19–0.24!, i.e. well-determined by the present scheme. Inclusion of total energies tends to lower them somewhat. Tables IV–VIII, and the bar graph Fig. 5, display deviations from experiment for fit 2c ~i.e. inclusion of all available data in the fit!. It is seen by inspection of Fig. 5 that the distribution of errors in the HoFs is clearly centered around zero, indicating that the mean error arises from a small number of large deviations, compensated by many cases of rather low error. This accumulation around a low mean value is almost as pronounced as in the corresponding G2 graph of Ref. 11. Of course, this is partly due to the least-squares procedure employed, which tends to create a Gaussian distribution of errors around a mean. It is further evidence that our analytical model is appropriate for the reproduction of the data. As may be seen in Table IV, the largest errors are observed for molecules containing several fluorine atoms ~such as AlF3 and SiF4!, and for ozone. Problems with the former were noted previously by Curtiss et al.11 for the G2 scheme. For the most serious cases, their deviations have the same sign as ours, but are usually somewhat smaller. They attempted to remedy the situation by including spin-orbit corrections, and found that while chlorine-containing molecules are improved considerably, fluorine-containing ones are practically unaffected. They concluded that ‘‘apparently, there is some inherent problem in G2 theory with some of the fluorine molecules other than the neglect of the spin-orbit effect.’’ Since we observe similar problems for many of the fluorine molecules, this problem might be not inherent in G2 theory, but of a more general nature. Note that the deviations are not systematic; while the heat of formation of COF2 is overestimated by 6 kcal/mol, it is more than 7 kcal/mol too low for the similar molecule C2F4. Furthermore, we do not encounter the problems with chlorine-containing systems that were observed in Ref. 11 and partly remedied by inclusion of spin-orbit corrections. The most problematic ionization potential ~Table V! is for the oxygen molecule, whose IP is strongly overestimated by 0.66 eV. We believe that better results may be obtained if different geometries for O2 and O1 2 are employed. The MP2optimized geometries used in this work are in particularly

H. L. Schmider and A. D. Becke

9629

TABLE V. Ionization potentials of 42 systems ~Ref. 13! in eV. The column labeled ‘‘present work’’ gives the values obtained with the parameters derived from a fit to the full set of 233 thermochemical data ~fit 2c in Tables II and III!. System 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42

1

H→H He→He1 Li→Li1 Be→Be1 B→B1 C→C1 N→N1 O→O1 F→F1 Ne→Ne1 CH4→CH1 4 NH3→NH1 3 1 OH→OH H2O→H2O1 HF→HF1 Na→Na1 Mg→Mg1 Al→Al1 Si→Si1 P→P1 S→S1 Cl→Cl1 Ar→Ar1 SiH4→SiH1 4 PH→PH1 PH2→PH1 2 PH3→PH1 3 SH→SH1 H2S→H2S1 ( 2 B 1 ) H2S→H2S1 ( 2 A 1 ) HCl→HCl1 C2H2→C2H1 2 C2H4→C2H1 4 1 CO→CO 2 N2→N1 2 ( S g) 2 ( P u) N2→N1 2 O2→O1 2 P2→P1 2 S2→S1 2 Cl2→Cl1 2 ClF→ClF1 CS→CS1

Present work

Expt.

Dev.

13.67 24.73 5.47 9.11 8.56 11.44 14.64 13.87 17.54 21.60 12.52 10.11 13.07 12.54 15.99 5.11 7.79 5.97 8.11 10.42 10.45 13.03 15.79 10.91 10.17 9.90 9.81 10.43 10.42 12.63 12.73 11.18 10.30 14.10 15.83 16.53 12.73 10.34 9.58 11.36 12.55 11.42

13.60 24.59 5.39 9.32 8.30 11.26 14.54 13.61 17.42 21.56 12.62 10.18 13.01 12.62 16.04 5.14 7.65 5.98 8.15 10.49 10.36 12.97 15.76 11.00 10.15 9.82 9.87 10.37 10.47 12.78 12.75 11.40 10.51 14.01 15.58 16.70 12.07 10.53 9.36 11.50 12.66 11.33

0.07 0.14 0.08 20.21 0.26 0.18 0.10 0.26 0.12 0.04 20.10 20.07 0.06 20.08 20.05 20.03 0.14 20.01 20.04 20.07 0.09 0.06 0.03 20.09 0.02 0.08 20.06 0.06 20.05 20.15 20.02 20.22 20.21 0.09 0.25 20.17 0.66 20.19 0.22 20.14 20.11 0.09

poor agreement with experiment for this system. Our result for the isovalent molecule S2 is much better. The most problematic cases for the electron affinities ~Table VI! are the CH3 and NH2 radicals, and the chlorine molecule. The radicals are isoelectronic, and the other members of the series ~OH and F! show increasingly better quality. This might be a systematic trend, and therefore indicative of a problem with the model. Proton affinities ~Table VII! and total energies ~Table VIII! are both rather well reproduced by this fit. The PA for acetylene is an exception, as previously observed.4,6 III. SUMMARY, CONCLUSIONS AND OUTLOOK

In this paper, we have demonstrated that a recently proposed method for the systematic optimization of energy den-

Downloaded 17 Jan 2011 to 140.123.79.51. Redistribution subject to AIP license or copyright; see http://jcp.aip.org/about/rights_and_permissions

9630

J. Chem. Phys., Vol. 108, No. 23, 15 June 1998

H. L. Schmider and A. D. Becke

TABLE VI. Electron affinities of 25 systems ~Ref. 13! in eV. The column labeled ‘‘present work’’ gives the values obtained with the parameters derived from a fit to the full set of 233 thermochemical data ~fit 2c in Tables II and III!.

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25

System

Present work

Expt.

Dev.

C←C2 CH←CH2 3 CH2←CH2 2 CH3←CH2 3 NH←NH2 NH2←NH2 2 O←O2 OH←OH2 F←F2 O2←O2 2 NO←NO2 CN←CN2 Si←Si2 P←P2 S←S2 Cl←Cl2 SiH←SiH2 1 SiH2←SiH2 2 SiH3←SiH2 3 PH←PH2 PH2←PH2 2 SH←SH2 PO←PO2 S2←S2 2 Cl2←Cl2 2

1.26 1.22 0.61 20.26 0.22 0.50 1.48 1.70 3.37 0.38 0.16 3.98 1.30 0.82 2.14 3.64 1.19 1.09 1.37 1.01 1.19 2.30 1.20 1.64 2.63

1.26 1.24 0.65 0.08 0.38 0.74 1.46 1.83 3.40 0.44 0.02 3.82 1.38 0.75 2.08 3.62 1.28 1.12 1.44 1.00 1.26 2.31 1.09 1.66 2.39

0.00 20.02 20.04 20.34 20.16 20.24 0.02 20.13 20.03 20.06 0.14 0.16 20.09 0.07 0.06 0.02 20.09 20.04 20.07 0.01 20.07 20.01 0.11 20.02 0.24

TABLE VIII. Total energies of 10 atoms in hartree. The column labeled ‘‘present work’’ gives the values obtained with the parameters derived from a fit to the full set of 233 thermochemical data ~fit 2c in Tables II and III!.

1 2 3 4 5 6 7 8 9 10

~c!

~d!

~e! sity functionals at the GGA level, may be applied with success to a wide variety of thermochemical data. Functionals were optimized using ionization potentials, electron and proton affinities, total energies, and a large set of recently published heats of formation. We have arrived at some basic conclusions: ~a!

The optimum expansion order in Eq. ~6! is m52, as determined by the goodness-of-fit indices given in Table I. Within the present framework, the best reachable reproduction of heats of formation is 1.78 kcal/mol average error ~as compared with 1.58 kcal/mol in G2 theory itself!. Inclusion of all other types of data, including

~b!

~f!

System

Present work

Expt.

Dev.

H He Li Be B C N O F Ne

20.502 22.909 27.485 214.664 224.649 237.840 254.583 275.068 299.738 2128.940

20.500 22.904 27.478 214.667 224.654 237.845 254.589 275.067 299.734 2128.938

20.002 20.005 20.007 0.003 0.005 0.005 0.006 20.001 20.004 20.002

total energies, only mildly increases this value to 1.89 kcal/mol. This implies that a satisfactory stability of the fit has been reached. The most problematic cases of HoFs occur for fluorinecontaining molecules, which is similar to G2 theory. The problems with chlorine compounds occurring in the latter are not observed here. Properties other than heats of formation ~such as ionization potentials!, need to be included in the fit set to obtain a good overall DFT. This is even more strongly indicated since there is no great deterioration of the HoF reproduction. The inclusion of total energies in the fit has a generally beneficial effect. Other properties are only slightly affected, whereas the total energies themselves improve dramatically. It also seems that total energies are required to stabilize the determination of the like-spin correlation part. The determination of exchange-correlation parameters directly from experimental thermochemical data can lead to functionals competitive with the G2 scheme. G2

TABLE VII. Proton affinities of 8 systems ~Ref. 13! in kcal/mol. The column labeled ‘‘present work’’ gives the values obtained with the parameters derived from a fit to the full set of 233 thermochemical data ~fit 2c in Tables II and III!.

1 2 3 4 5 6 7 8

System

Present work

Expt.

Dev.

H2 NH3 H2O C2H2 SiH4 PH3 H2S HCl

99.7 204.8 165.5 156.6 153.9 187.1 168.6 134.0

100.8 202.5 165.1 152.3 154.0 187.1 168.8 133.6

21.1 2.3 0.4 4.3 20.1 20.0 20.2 0.4

FIG. 5. Bar graph of residual deviations of heats of formation at 298 K. The underlying gradient-corrected functional is the one derived from a 10parameter fit to the full set of 233 thermochemical data ~see Table II, 2c!. The graph shows the frequency with which an error e 5HoFfit2HoFexp. occurs in a given interval. The bars are placed in the center of the interval they denote, i.e. the bar with the frequency of an error 22 kcal/mol,e

Suggest Documents