Predicting logP of pesticides using different software

11 downloads 0 Views 121KB Size Report
Dimethoate. 4. 0.76. 0.04. 99607-70-2. Cloquintocet mexyl. 2. 5.03. 0.00. 3060-89-7. Metobromuron. 4. 2.41. 0.04. 56-72-4. Coumaphos. 2. 4.13. 0.00. 298-04-4.
Chemosphere 53 (2003) 1155–1164 www.elsevier.com/locate/chemosphere

Predicting log P of pesticides using different software E. Benfenati

a,*

, G. Gini b, N. Piclin

a,c

, A. Roncaglioni a, M.R. Varı a

a

c

Department of Environmental Health Sciences, Istituto di Ricerche Farmacologiche Mario Negri, Laboratory of Environmental Chemistry and Toxicology, via Eritrea 62, 20157 Milan, Italy b Department of Electronics and Information, Politecnico di Milano, piazza Leonardo da Vinci 32, 20133 Milan, Italy Laboratory of Chemometrics and Bioinformatics, University of Orl eans, B.P. 6759, 45067 Orl eans Cedex 2, France Received 1 August 2002; received in revised form 22 May 2003; accepted 9 June 2003

Abstract We compared experimental and calculated log P values using a data set of 235 pesticides and experimental values from four different sources: The Pesticide Manual, Hansch Manual, ANPA and KowWin databases. Log P were calculated with four softwares: HyperChem, Pallas, KowWin and TOPKAT. Crossed comparison of the experimental and calculated values proved useful, especially for pesticides. These are harder to study than simpler organic compounds. Structurally they are complex, heterogeneous and similar to drugs from a chemical point of view. They offer an interesting way to verify the goodness of the different methods. Other studies compared several log P predictors using a single set of experimental values taken as a reference. Here we discuss the utility of the different log P predictors, with reference to experimental data found in different databases. This offers three advantages: (1) it avoids bias due to the assumption that one single data set is correct; (2) a given predictor can be developed on the same data set used for evaluation; (3) it takes account of experimental variability and can compare it with the predictorÕs variability. In our study Pallas and KowWin gave the best results for prediction, followed by TOPKAT. Ó 2003 Elsevier Ltd. All rights reserved. Keywords: Log P ; Octanol–water partition coefficient; Prediction software; Pesticides; Atom/fragment contributions; Database

1. Introduction Lipophilicity is a very important molecular descriptor that often correlates well with the bioactivity of chemicals (Leo and Hansch, 1999; Cronin et al., 2002; Sverdrup et al., 2002). In studies of the environmental fate of organic chemicals, log P , the logarithm of the partition coefficient between n-octanol and water, correlates with water solubility (Ran et al., 2002), soil/ sediment adsorption coefficients (Sabljic et al., 1995) and

*

Corresponding author. Tel.: +39-02-39014420; fax: +39-0239001916. E-mail address: [email protected] (E. Benfenati).

bioconcentration factors for aquatic organisms (Lyman et al., 1990). Lipophilicity can therefore be measured by log P , which reflects the equilibrium partitioning a molecule between an apolar and a polar (aqueous) phase. Partition coefficients can be measured experimentally by several methods, ranging from the simple ‘‘shake flask’’ technique (Lodge, 1999; Edelbach and Lodge, 2000) to popular chromatographic methods (Eadsforth and Moser, 1983; Finizio et al., 1997; Edelbach and Lodge, 2000). However, experimental determination of partition coefficients is time- and material-consuming, it can only be done if the compound is available, and in many cases only an estimate of the lipophilicity of the chemicals is required anyway. Thus the literature provides

0045-6535/$ - see front matter Ó 2003 Elsevier Ltd. All rights reserved. doi:10.1016/S0045-6535(03)00609-X

1156

E. Benfenati et al. / Chemosphere 53 (2003) 1155–1164

various figures for the same compound which can differ by more than one order of magnitude (Fielding et al., 1992). It is therefore much easier, faster and cheaper to predict log P using just the chemical structure. Not only are there different methods for experimental measurements of log P , but several approaches exist for its theoretical calculation too (Buchwald and Bodor, 1998). The most common ones are classified as ‘‘fragment constant’’ methods in which a structure is divided into fragments (atom or larger functional groups) and the values for each group are summed together (sometimes with structural correction factors) to yield the log P estimate. However, routine application of these approaches requires a continuous check of their validity by comparison with experimental data. In the present study we assessed four softwares for predicting log P . Similar studies have been done in the past, using a single list of values as a reference (Mannhold and Dross, 1996; Mannhold and van de Waterbeemd, 2001; Tetko et al., 2001). We believe that the source of data may be important parameter for this evaluation. If a single data set is used it must be assumed to be the ‘‘true’’ one. However, as we said, experimental determination of log P is complex and experimental values reported for the same compound differ widely. For this reason we evaluated the experimental data for the same compound in different databases. The aim was to avoid potential bias arising from considering one data set as true. We also avoided the risk of obtaining better results by using a predictor simply because it has been built on the basis of the same data set used for evaluation. Finally, we can take into account the variability of experimental data, and compare it with the predictor variability.

2. Methods We created a database of 235 different pesticides (Benfenati et al., 1999) using experimental data from various sources. Most values were from The Pesticide Manual (218 compounds) (Tomlin, 1997) and KowWin database (207 compounds) (http://esc.syrres.com/interkow/kowdemo.htm). Other values were collected from a compilation of the Hansch (Hansch et al., 1995) (144 compounds) and ANPA databases (139 compounds) (Finizio, 1999). Two separate regression analyses were done. The first related the experimental log P values from different databases. In the second, experimental log P from different databases were related to the calculated/predicted log P . Linear regression analyses were carried out using Excel. Calculated/predicted log P were obtained with four softwares, HyperChem (HyperChem 5.1, Hypercube Inc., Gainesville, Florida, USA), Pallas (Pallas 2.0, CompuDrug Chemistry Ltd), KowWin (http:// esc.syrres.com/interkow/kowdemo.htm) and TOPKAT (TOPKAT 5.0, Health Design, Inc., Rochester, New York, USA), using either atom-based or different fragments approaches.

3. Results and discussion We compared the availability of experimental data for the lipophilicity of our compounds in the four databases. At least two values were found only for 218 compounds. Regression analysis was applied for these correlations and the results are reported in Tables 1 and 2. Figs. 1 and 2 show the best and worst regres-

Table 1 R2 of correlation among experimental log P from different sources Square regression coefficient values R2 KowWin database a

0.86 (196)

a

Hansch Manual a

0.86 (134) 0.90 (139)a

ANPA database a

0.89 (137) 0.79 (131)a 0.85 (106)a

Average 0.87 0.85 0.87 0.84

The Pesticide Manual KowWin database Hansch Manual ANPA database

Number of compounds.

Table 2 Regression equations of experimental log P from different sources Regression equations KowWin database

Hansch Manual

ANPA database

y ¼ 0:8725x þ 0:5254

y ¼ 0:8439x þ 0:5704 y ¼ 0:9514x þ 0:0876

y ¼ 0:9692x þ 0:1006 y ¼ 0:9812x  0:0643 y ¼ 0:879x þ 0:4121

The Pesticide Manual KowWin database Hansch Manual

E. Benfenati et al. / Chemosphere 53 (2003) 1155–1164

1157

logP experimental (139 compounds) 8.00

Hansch manual

6.00

2

R = 0.8998

4.00 2.00 0.00 -2.00 -4.00 -6.00 -6.00

-4.00

-2.00

0.00

2.00

4.00

6.00

8.00

KowWin database

Fig. 1. Experimental log P values from the Hansch Manual and KowWin databases.

logP experimental (131 compounds) 7.00 6.00

2

R = 0.7856 5.00

ANPA database

4.00 3.00 2.00 1.00 0.00 -1.00 -2.00 -3.00 -2.00

-1.00

0.00

1.00

2.00

3.00

4.00

5.00

6.00

7.00

KowWin database

Fig. 2. Experimental log P values from the ANPA and KowWin databases.

sions. Some cases have a y intercept of more than 0.5 log-unit. For several compounds the same log P value was given in two different databases but this apparently high correlation could simply mean that the two databases refer to the same source. This may mean these sources are more reliable, but not necessarily. We

therefore did a second regression analysis after deleting all identical values. Tables 3 and 4 shows these results, reporting the number of compounds considered. The determination coefficient values (R2 ) are obviously lower in Table 3 than in Table 1, and the y intercept values are higher.

Table 3 R2 of correlation among experimental log P from different sources (without the common log P values) Square regression coefficient values R2 KowWin database

Hansch Manual

ANPA database

Average

0.79 (104)a

0.84 (112)a 0.81 (38)a

0.81 (83)a 0.64 (79)a 0.79 (75)a

0.82 0.75 0.81 0.75

a

Number of compounds.

The Pesticide Manual KowWin database Hansch Manual ANPA database

1158

E. Benfenati et al. / Chemosphere 53 (2003) 1155–1164

Table 4 Regression equations of experimental log P from different sources (without the common log P values) Regression equations KowWin database

Hansch Manual

ANPA database

y ¼ 0:8347x þ 0:6905

y ¼ 0:8333x þ 0:5924 y ¼ 0:8883x þ 0:0816

y ¼ 0:9479x þ 0:1577 y ¼ 0:9576x  0:0745 y ¼ 0:8231x þ 0:5956

The coefficient average is also listed in Tables 1 and 3. The column averages in Table 1 (all values) are similar for all databases. In Table 3, the averages for some databases are lower but this does not mean they are wrong. Table 5 sets out the mean and standard deviation calculated from experimental data for each pesticide. We

The Pesticide Manual KowWin database Hansch Manual

used values from the four databases, when available, but always used at least two. The experimental measurements can be affected by several factors like chemical purity, experimental protocol, ionizable compounds, poor solubility in water, and typing errors. In fact, in Table 5 13 compounds have a standard deviation of one

Table 5 Mean and standard deviation (SD) of experimental log P for the 216 pesticides for which the four databases provided more than one value CAS

n

Mean

SD

CAS

Compound

n

Mean

SD

Dichlorprop-P Metsulfuronmethyl Tribufos Bentazone Chlorsulfuron Thifensulfuronmethyl Rimsulfuron Glyphosate

3 4

2.20 )0.78

2.12 1.99

1582-09-8 74738-17-3

Trifluralin Fenpiclonil

4 3

5.13 4.01

0.26 0.25

2 4 4 4

4.47 1.36 0.19 0.15

1.75 1.49 1.46 1.36

16752-77-5 15972-60-8 17109-49-8 43121-43-3

Methomyl Alachlor Edifenphos Triadimefon

4 4 2 4

0.47 3.30 3.66 2.98

0.25 0.25 0.25 0.25

2 3

)0.59 )3.03

1.24 1.19

67747-09-5 71283-80-2

4 4

4.30 4.47

0.24 0.23

4 3 2 3

1.62 )0.21 1.10 1.84

1.18 1.15 1.07 1.06

62-73-7 467-69-6 133-06-2 34123-59-6

4 2 4 4

1.56 1.16 2.62 2.69

0.23 0.23 0.22 0.21

94593-91-6 111991-09-4 16672-87-0 23103-98-2 13071-79-9 55285-14-8 82558-50-7 59756-60-4 1689-83-4

Dicamba Acephate Imazaquin Bensulfuronmethyl Cinosulfuron Nicosulfuron Ethephon Pirimicarb Terbufos Carbosulfan Isoxaben Fluridone Ioxynil

Prochloraz Fenoxaprop-pethyl Dichlorvos Flurenol Captan Isoproturon

3 3 2 4 4 2 3 3 4

2.62 )0.68 )1.58 2.13 4.05 2.75 3.51 2.73 3.07

1.00 0.90 0.88 0.87 0.85 0.78 0.75 0.74 0.72

330-55-2 834-12-8 7287-19-6 24151-93-7 10071-13-3 950-37-8 298-02-2 2310-17-0 919-86-8

4 4 4 2 3 4 4 4 3

3.04 2.92 3.37 4.17 )0.77 2.35 3.71 4.27 1.12

0.21 0.19 0.19 0.18 0.18 0.18 0.18 0.18 0.17

41483-43-6 88283-41-4 2104-64-5 3383-96-8 14816-18-3 99-30-9 74115-24-5 42509-80-8 51235-04-2 17804-35-2

Bupirimate Pyrifenox EPN Temephos Phoxim Dicloran Clofentezine Isazofos Hexazinone Benomyl

4 3 3 4 4 4 4 3 2 4

3.25 3.20 4.55 5.44 3.89 2.55 3.38 3.54 1.53 2.01

0.64 0.62 0.62 0.61 0.58 0.50 0.49 0.48 0.46 0.45

1897-45-6 21087-64-9 330-54-1 51218-45-2 5915-41-3 54-11-5 41198-08-7 2642-71-9 759-94-4 23950-58-5

4 4 4 4 4 3 4 4 4 3

2.87 1.75 2.67 3.11 3.09 1.09 4.56 3.29 3.27 3.30

0.17 0.17 0.16 0.16 0.16 0.14 0.14 0.13 0.12 0.12

15165-67-0 74223-64-6 78-48-8 25057-89-0 64902-72-3 79277-27-3 122931-48-0 1071-83-6 1918-00-9 30560-19-1 81335-37-7 83055-99-6

Compound 

Linuron Ametryn Prometryn Piperophos Maleic hydrazide Methidathion Phorate Phosalone Demeton-Smethyl Chlorothalonil Metribuzin Diuron Metolachlor Terbuthylazine Nicotine Profenofos Azinphos-ethyl EPTC Propyzamide

E. Benfenati et al. / Chemosphere 53 (2003) 1155–1164

1159

Table 5 (continued) CAS

Compound

n

Mean

SD

CAS

Compound

n

Mean

SD

82-68-8 1563-66-2 63-25-2 115-32-2 55-38-9 66063-05-6 117-18-0 35400-43-2 112281-77-3 79983-71-4 2921-88-2 41394-05-2 333-41-5 29232-93-7

Quintozene Carbofuran Carbaryl Dicofol Fenthion Pencycuron Tecnazene Sulprofos Tetraconazole Hexaconazole Chlorpyrifos Metamitron Diazinon Pirimiphosmethyl Parathionmethyl Methiocarb Anthraquinone

3 4 4 4 4 4 2 3 2 4 4 3 4 4

4.65 2.12 2.17 4.47 4.30 4.60 4.14 5.29 3.33 3.75 5.05 0.99 3.59 4.07

0.44 0.40 0.39 0.37 0.36 0.35 0.35 0.33 0.33 0.30 0.27 0.27 0.26 0.26

86-50-0 72490-01-8 120928-09-8 13360-45-7 1134-23-2 51338-27-3 24017-47-8 122-14-5 1918-16-7 84332-86-5 23505-41-1 77732-09-3 732-11-6 15545-48-9

Azinphos-methyl Fenoxycarb Fenazaquin Chlorbromuron Cycloate Diclofop-methyl Triazophos Fenitrothion Propachlor Chlozolinate Pirimiphos-ethyl Oxadixyl Phosmet Chlorotoluron

4 4 3 3 4 4 4 4 4 3 3 3 4 4

2.79 4.24 5.57 3.03 3.96 4.65 3.39 3.37 2.26 3.20 4.90 0.75 2.83 2.41

0.12 0.11 0.11 0.11 0.11 0.11 0.10 0.10 0.09 0.09 0.09 0.09 0.08 0.08

4

2.93

0.08

56-38-2

Parathion-ethyl

4

3.83

0.01

4 3

2.96 3.43

0.08 0.08

1113-02-6 301-12-2

3 3

)0.75 )0.75

0.01 0.01

2 4 4 4 3 4

2.65 3.44 )0.77 3.42 )0.90 4.31

0.07 0.07 0.07 0.06 0.06 0.06

35367-38-5 76674-21-0 29091-05-2 84496-56-0 2439-01-2 13457-18-6

4 4 4 3 3 3

3.88 2.30 4.30 4.80 3.78 3.80

0.01 0.00 0.00 0.00 0.00 0.00

10605-21-7 36734-19-7

Benoxacor Benalaxyl Methamidophos Desmedipham Amitrole Chlorthaldimethyl Carbendazim Iprodione

Omethoate Oxydemetonmethyl Diflubenzuron Flutriafol Dinitramine Clomeprop Chinomethionat Pyrazophos

4 4

1.49 3.05

0.06 0.06

73250-68-7 18691-97-9

3 3

3.23 2.64

0.00 0.00

50471-44-8 65907-30-4 19666-30-9 60168-88-9 57646-30-7 1746-81-2 57837-19-1 25311-71-1 122-34-9 886-50-0 2303-17-5 22224-92-6 52-68-6 60-51-5

Vinclozolin Furathiocarb Oxadiazon Fenarimol Furalaxyl Monolinuron Metalaxyl Isofenphos Simazine Terbutryn Tri-allate Fenamiphos Trichlorfon Dimethoate

4 4 4 4 4 4 4 4 4 4 3 4 4 4

3.05 4.65 4.83 3.65 2.66 2.28 1.69 4.08 2.17 3.71 4.58 3.27 0.49 0.76

0.06 0.06 0.05 0.05 0.05 0.05 0.05 0.05 0.04 0.04 0.04 0.04 0.04 0.04

137-26-8 120923-37-7 33089-61-1 64249-01-0 35575-96-3 25059-80-7 1861-40-1 18181-80-1 23184-66-9 95465-99-9 54593-83-8 122453-73-0 81777-89-1 99607-70-2

3 2 4 3 2 2 4 3 2 2 2 2 2 2

1.73 1.63 5.50 3.81 1.05 2.50 5.29 5.40 4.50 3.90 4.59 4.83 2.50 5.03

0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00

3060-89-7 298-04-4 7085-19-0 33693-04-8 15299-99-7 5598-13-0

Metobromuron Disulfoton Mecoprop Terbumeton Napropamide Chlorpyryfosmethyl Metoxuron Thiodicarb Tolylfluanid Carboxin Propoxur Fonofos

4 4 4 4 4 4

2.41 4.00 3.15 3.07 3.33 4.29

0.04 0.04 0.04 0.03 0.03 0.03

56-72-4 57966-95-7 121552-61-2 66215-27-8 50-29-3 1014-69-3

Mefenacet Methabenzthiazuron Thiram Amidosulfuron Amitraz Anilofos Azamethiphos Benazolin-ethyl Benfluralin Bromopropylate Butachlor Cadusafos Chlorethoxyfos Chlorfenapyr Clomazone Cloquintocet mexyl Coumaphos Cymoxanil Cyprodinil Cyromazine DDT Desmetryn

2 2 2 2 2 3

4.13 0.67 4.00 )0.06 6.91 2.38

0.00 0.00 0.00 0.00 0.00 0.00

4 3 3 4 4 4

1.64 1.68 3.92 2.16 1.53 3.93

0.03 0.03 0.03 0.03 0.03 0.02

1085-98-9 83164-33-4 34205-21-5 50563-36-5 106325-08-0 55283-68-6

Dichlofluanid Diflufenican Dimefuron Dimethachlor Epoxiconazole Ethalfluralin

3 3 2 2 2 4

298-00-0 2032-65-7 84-65-1 98730-04-2 71626-11-4 10265-92-6 13684-56-5 61-82-5 2136-79-0

19937-59-8 59669-26-0 731-27-1 5234-68-4 114-26-1 944-22-9

3.70 0.00 4.90 0.00 2.51 0.00 2.17 0.00 3.44 0.00 5.11 0.00 (continued on next page)

1160

E. Benfenati et al. / Chemosphere 53 (2003) 1155–1164

Table 5 (continued) CAS

Compound

n

Mean

SD

CAS

Compound

n

Mean

SD

2164-17-2 92-52-4 1194-65-6 13593-03-8 23135-22-0 5259-88-1 87-86-5

Fluometuron Biphenyl Dichlobenil Quinalphos Oxamyl Oxycarboxin Pentachlorophenol Bifenox Bendiocarb Fludioxonil Flumetralin Flurprimidol Formothion Fosthiazate Haloxyfop etotyl Hexythiazox Imazalil Imibenconazole Imidacloprid Lenacil Mepronil Metazachlor Metosulam Myclobutanil Novaluron Ofurace Oryzalin

3 2 4 4 4 4 3

2.41 4.00 2.73 4.45 )0.47 0.75 5.11

0.02 0.02 0.02 0.02 0.02 0.02 0.01

29973-13-5 563-12-2 26225-79-6 13194-48-4 3740-92-9 120068-37-3 63729-98-6

3 2 3 4 3 2 3

2.04 5.07 2.70 3.59 4.17 4.00 3.00

0.00 0.00 0.00 0.00 0.00 0.00 0.00

4 4 2 2 2 2 2 3 2 4 2 2 2 3 3 2 4 2 2 2

4.49 1.71 4.12 5.45 3.34 1.48 1.68 4.33 2.53 3.82 4.94 0.57 2.31 3.66 2.13 3.08 2.94 5.27 1.39 3.73

0.01 0.01 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00

79622-59-6 86811-58-7 51218-49-6 1610-18-0 139-40-2 86763-47-5 52888-80-9 34643-46-4 84087-01-4 3689-24-5 162320-67-4 107534-96-3 112410-23-8 34014-18-1 116-29-0 640-15-3 112143-82-5 101200-48-0 64628-44-0

Ethiofencarb Ethion Ethofumesate Ethoprophos Fenclorim Fipronil Flamprop-Mmethyl Fluazinam Fluazuron Pretilachlor Prometon Propazine Propisochlor Prosulfocarb Prothiofos Quinclorac Sulfotep SZI-121 Tebuconazole Tebufenozide Tebuthiuron Tetradifon Thiometon Triazamate Tribenuron methyl Triflumuron

2 2 4 2 2 2 2 2 2 3 2 4 2 2 2 2 2 2 3

3.56 5.10 4.08 2.99 2.93 3.50 4.65 5.67 )1.15 3.99 3.30 3.70 4.25 1.79 4.61 3.15 2.69 )0.44 4.91

0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00

42576-02-3 22781-23-3 131341-86-1 62924-70-3 56425-91-3 2540-82-1 98886-44-3 87237-48-7 78587-05-0 35554-44-0 86598-92-7 138261-41-3 2164-08-1 55814-41-0 67129-08-2 139528-85-1 88671-89-0 116714-46-6 58810-48-3 19044-88-3 

Compounds with a SD greater than one.

log P unit or more. These compounds show acid–basic properties, so the experimental measurements may not have been made under the right conditions to ensure neutrality. Moreover, our aim was not to establish the ‘‘right’’ log P for each compound, but to highlight the variability of experimental measurements even in very common, ‘‘official’’ sources. For several compounds, the values we found were identical in the different databases, because they all refer to the same source. In this case SD ¼ 0.0. In summary, the quality of experimental data is not optimal, and several problems arise.

Tables 6 and 7 show the results of regression analysis for the experimental versus predicted values. The different programs to predict log P use calculation methods including atom-based and fragmental approaches. The structure of the molecule is broken down into fragments of one or two atoms and these are checked one by one. The software does this breakdown in various ways, producing all possible combinations in which the molecule can be assembled on the basis of the fragment database in use, and applying correction rules coupled with the molecular connectivity. In contrast, atom-based

Table 6 Square regression coefficient values R2 of correlation among predicted and experimental log P and root mean squared error (RMS)

HyperChem KowWin software Pallas TOPKAT Average R2 a

The Pesticide Manual (218)a

KowWin database (207)a

Hansch Manual (144)a

ANPA database (139)a

R2

RMS

R2

RMS

R2

RMS

R2

RMS

0.50 0.66 0.69 0.64 0.62

1.32 1.05 1.03 1.03

0.63 0.90 0.82 0.84 0.80

0.99 0.52 0.75 0.63

0.57 0.79 0.81 0.78 0.74

1.06 0.74 0.72 0.72

0.50 0.65 0.74 0.56 0.61

1.24 1.00 0.88 1.08

Number of compounds.

Average R2 0.55 0.75 0.77 0.71

E. Benfenati et al. / Chemosphere 53 (2003) 1155–1164

1161

Table 7 Regression equations between predicted and experimental log P values Regression equations The Pesticide Manual HyperChem KowWin software Pallas TOPKAT

y y y y

KowWin database

¼ 0:5995x þ 1:6543 ¼ 0:7947x þ 0:8193 ¼ 0:8716x þ 0:5124 ¼ 0:6675x þ 1:0754

y y y y

Hansch Manual

¼ 0:7367x þ 1:0858 ¼ 1:0332x  0:1496 ¼ 1:0512x  0:215 ¼ 0:8495x þ 0:3535

y y y y

ANPA database

¼ 0:6942x þ 1:1864 ¼ 0:9346x þ 0:2566 ¼ 0:9563x þ 0:17 ¼ 0:7687x þ 0:674

y y y y

¼ 0:5893x þ 1:5514 ¼ 0:7323x þ 0:9239 ¼ 0:8747x þ 0:4028 ¼ 0:5813x þ 1:2199

of complex structures and the inability to estimate log P for uncorrelated or unknown fragments. Log P estimates based on atom values alone needed to be improved by inclusion of substructures layer or more complex than ‘‘atom’’. Moreover log P estimates based

procedures avoid correction factors and define huge numbers of atom-types; lipophilicity is quantified simply by summing atom-type values. Both procedures show various inefficiencies, like oversimplification of steric and conformational effects

logP 8.00

R2 = 0.8081

6.00

PALLAS

4.00

2.00

0.00

-2.00

-4.00 -3.00

-2.00

-1.00

0.00

1.00

2.00

3.00

4.00

5.00

6.00

7.00

8.00

Hansch manual

Fig. 3. Predicted (Pallas) and experimental (Hansch Manual) log P values.

logP 8.00

6.00

KowWin predicted by calculation

2

R = 0.9034 4.00

2.00

0.00

-2.00

-4.00

-6.00 -6.00

-4.00

-2.00

0.00

2.00

4.00

6.00

KowWin database

Fig. 4. Predicted (KowWin) and experimental (KowWin database) log P values.

8.00

1162

E. Benfenati et al. / Chemosphere 53 (2003) 1155–1164

and Enslein, 1996). TOPKAT also provides a warning about the reliability of the prediction but this was reported for only 17 compounds in our data set. HyperChem calculates the log P from atomic contributions (Ghose et al., 1988). Pallas, similar to KowWin and TOPKAT, calculates log P values based on atomic and fragmental contributions but uses a weight for the

on atom/fragment values, can usually be improved by inclusion of correction factors able to take into account steric interaction, ionization form, hydrogen bondings, effects from polar functional substructures, and so on. The KowWin and TOPKAT programs calculate log P values based on different atom/fragment contribution approaches (Meylan and Howard, 1995; Gombar

Table 8 Overestimated, underestimated and well-predicted log P values HyperChem d

The Pesticide Manual (218) KowWin database (207)d Hansch Manual (144)d ANPA database (139)d

Pallas

KowWin

TOPKAT

"a

#b

¼c

"

#

¼

"

#

¼

"

#

¼

58 50 53.5 54

32 36 38.2 33

10 14 8.3 13

45 37 37.5 42

38 40 39.6 37

17 23 22.9 21

42 30 33.3 39

33 34 35.4 34

25 36 31.3 27

36 25 27 31.7

41 46 44 41.7

23 29 29 26.6

a

Percentage of overestimated predicted values. Percentage of underestimated predicted values. c Percentage of similar predicted values (within 0.15 log-unit). d Number of compounds. b

Table 9 R2 values of correlation between predicted and experimental log P values for anilines, carbamates, ureas, organophosphorus, heterocyclics and halogenated aromatics compounds Square regression coefficient values R2 The Pesticide Manual

KowWin database

Chemical class Hansch Manual

ANPA database

HyperChem KowWin software Pallas TOPKAT

0.64 0.86 0.72 0.71

(36)a

0.68 0.88 0.74 0.75

(35)a

0.71 0.88 0.78 0.81

(22)a

0.66 0.88 0.86 0.86

(20)a

Anilines (39)a

HyperChem KowWin software Pallas TOPKAT

0.66 0.90 0.85 0.78

(23)a

0.69 0.93 0.90 0.81

(24)a

0.54 0.73 0.55 0.55

(20)a

0.65 0.73 0.61 0.60

(23)a

Carbamates (26)a

HyperChem KowWin software Pallas TOPKAT

0.42 0.39 0.66 0.54

(31)a

0.54 0.84 0.73 0.81

(26)a

0.50 0.57 0.82 0.67

(15)a

0.42 0.47 0.74 0.40

(22)a

Ureas (31)a

HyperChem KowWin software Pallas TOPKAT

0.67 0.88 0.87 0.82

(54)a

0.80 0.96 0.93 0.91

(53)a

0.82 0.91 0.93 0.92

(39)a

0.72 0.91 0.87 0.91

(34)a

Organophosphorus (59)a

HyperChem KowWin software Pallas TOPKAT

0.35 0.51 0.58 0.50

(113)a

0.45 0.80 0.73 0.77

(100)a

0.44 0.63 0.69 0.64

(60)a

0.34 0.51 0.63 0.33

(67)a

Heterocyclics (119)a

HyperChem KowWin software Pallas TOPKAT

0.30 0.49 0.46 0.51

(78)a

0.37 0.80 0.74 0.81

(74)a

0.45 0.67 0.74 0.73

(50)a

0.35 0.42 0.57 0.50

(46)a

Halogenated aromatics (83)a

a

Number of compounds.

E. Benfenati et al. / Chemosphere 53 (2003) 1155–1164

contributions of atom-based and fragmental parameters (Ghose and Crippen, 1986; Rekker, 1977). The weight used was: log Pcomb ¼ 0:733 log Patomics þ 0:267 log Pcdr , where Pcomb stands for combined value (atomic and fragment based), Patomics uses atomic fragments and Pcdr is based on RekkerÕs collection of hydrophobic fragmental constant (Rekker and de Kort, 1979). Figs. 3 and 4 show two examples of correlations between experimental and predicted data. Table 6 shows R2 and root mean squared error (RMS). The RMS values are quite high. Table 6 indicates a superiority of the fragmental over atom-based methods (HyperChem), in agreement with a previous study (Mannhold and Dross, 1996). Further inspection was done on the basis of the log P values. The fragmental method reportedly tends to overestimate log P and atom-based approaches underestimate it. This underestimation was significantly more pronounced for log P > 4 (Mannhold and Dross, 1996). Our results are shown in Table 8. All software gave overestimates and underestimates. HyperChem, an atom-based method, overestimated the log P values for each database, whatever the value, whereas TOPKAT, an atom/fragment method, tended to underestimate them. For KowWin and Pallas, the situation was variable. Finally, to extract more information from our data, we divided the 235 pesticides into chemical classes: anilines, carbamates, ureas, organophosphorus, heterocyclics and halogenated aromatics. The similarities between several compounds allows some comparative considerations on how the different calculations process these structures (Table 9). The best correlations (R2 ) for anilines and carbamates were given with the KowWin software. Experimental data for ureas and heterocyclics, for three of the four databases, were closely correlated with predicted values using Pallas software. For organophosphorus, the experimental data of three of the databases highly correlated with the values predicted by the KowWin software. For halogenated aromatics, the situation was unclear. In conclusion, reliable log P values for pesticides are still limited and experimental values show a certain degree of variability. It is therefore questionable to rely on a single value. Predictors can be an alternative but, as shown in Table 6, some are better then others. In any case, it is necessary to use values predicted with the same software to have consistent results, because of the predictor variability. Moreover, the different performances of the four softwares depend on the chemical class. Anilines, carbamates and organophosphorus seem to be predicted better using KowWin, whereas ureas and heterocyclics can be predicted better using Pallas. So the quality of results tends to differ depending on the software used and chemical classes of the compounds investigated.

1163

Acknowledgements We acknowledge financial support from The European Commission: project IMAGETOX, HPRN-CT1999-00015. NP was recipient of a fellowship from IMAGETOX.

References Benfenati, E., Pelagatti, S., Grasso, P., Gini, G., 1999. COMET: the approach of a project in evaluating toxicity. In: Gini, G.C., Katritzky, A.R. (Eds.), Predictive Toxicology of Chemicals: Experiences and Impact of AI Tools. AAAI 1999 Spring Symposium Series. AAAI Press, Menlo Park, CA, pp. 40–43. Buchwald, P., Bodor, N., 1998. Octanol–water partition: searching for predictive models. Curr. Med. Chem. 5, 353–380. Cronin, M.T.D., Dearden, J.C., Duffy, J.C., Edwards, R., Manga, N., Worth, A.P., Worgan, A.D.P., 2002. The importance of hydrophobicity and electrophilicity descriptors in mechanistically-based QSARs for toxicological endpoints. SAR QSAR Environ. Res. 13, 167–176. Eadsforth, C.V., Moser, P., 1983. Assessment of reverse-phase chromatographic methods for determining partition coefficients. Chemosphere 12, 1459–1475. Edelbach, D.J., Lodge, K.B., 2000. Insights into the measurement of the octanol–water partition coefficient from experiments with acrylate esters. Phys. Chem. Chem. Phys. 2, 1763–1771. Fielding, M., Barcel o, D., Helweg, A., Galassi, S., Torstensson, L., Van Zoonen, P., Wolter, R., Angeletti, G., 1992. Pesticides in ground and drinking water (Water pollution research reports, No. 27), Commission of the European Communities, Brussels. Finizio, A., 1999. Valutazione del rischio per gli organismi non bersaglio: ANPA DATABASE: LÕ IMPATTO AMBIENTALE DEI PRODOTTI FITOSANITARI. Finizio, A., Vighi, M., Sandroni, D., 1997. Determination of N -octanol/water partition coefficient (kow ) of pesticide critical review and comparison of methods. Chemosphere 34, 131–161. Ghose, A.K., Crippen, G.M., 1986. Atomic physicochemical parameters for three-dimensional structure-directed quantitative structure–activity relationships. I. Partition coefficients as a measure of hydrophobicity. J. Computat. Chem. 7, 565–577. Ghose, A.K., Prichett, A., Crippen, G.M., 1988. Atomic physicochemical parameters for three dimensional structures directed quantitative structure–activity relationships. III. Modeling hydrophobic interactions. J. Comput. Chem. 9, 80–90. Gombar, V.K., Enslein, K., 1996. Assessment of n-octanol/ water partition coefficient: when is the assessment reliable? J. Chem. Inf. Comput. Sci. 36, 1127–1134. Hansch, C., Leo, A., Hoekman, D., 1995. Exploring QSAR Hydrophobic, Electronic and Steric Constants. ACS, Washington, DC. Leo, A., Hansch, C., 1999. Role of hydrophobic effects in mechanistic QSAR. Perspect. Drug Discov. 17, 1–25.

1164

E. Benfenati et al. / Chemosphere 53 (2003) 1155–1164

Lodge, K.B., 1999. Octanol–water partition coefficients of cyclic C-7 hydrocarbons and selected derivatives. J. Chem. Eng. Data 44, 1321–1324. Lyman, W.J., Reehl, W.F., Rosenblatt, D.H., 1990. Handbook of Chemical Property Estimation Methods. ACS, Washington, DC. Mannhold, R., Dross, K., 1996. Calculation procedures for molecular lipophilicity: A comparative study. Quant. Struct. –Act. Relat. 15, 403–409. Mannhold, R., van de Waterbeemd, H., 2001. Substructure and whole molecule approaches for calculating log P . J. Comp. Aided-Mol. Des. 15, 337–354. Meylan, W.M., Howard, P.H., 1995. Atom/fragment contribution method for estimating octanol–water partition coefficients. J. Pharm. Sci. 84, 83–92. Ran, Y., He, Y., Yang, G., Johnson, J.L.H., Yalkowsky, S.H., 2002. Estimation of aqueous solubility of organic compounds by using the general solubility equation. Chemosphere 48, 487–509.

Rekker, R.F., 1977. The hydrophobic fragmental constant. In: Pharmacochemistry Library. Elsevier Amsterdam. Rekker, R.F., de Kort, H.M., 1979. The hydrophobic fragmental constant; an extension to a 1000 data point set. Eur. J. Med. Chem. 14, 479–488. Sabljic, A., G€ usten, H., Verhaar, H., Hermens, J., 1995. Qsar modelling of soil sorption. Improvements and systematics of log KOC vs. log KOW correlations. Chemosphere 31, 4489– 4514. Sverdrup, L.E., Nielsen, T., Krogh, P.H., 2002. Soil ecotoxicity of polycyclic aromatic hydrocarbons in relation to soil sorption, lipophilicity and water solubility. Environ. Sci. Tecnol. 36, 2429–2435. Tetko, I.V., Tanchuk, V.Y., Villa, A.E., 2001. Prediction of noctanol/water partition coefficients from PHYSPROP database using artificial neural networks and E-state indices. J. Chem. Inf. Comput. Sci. 41, 1407–1421. Tomlin, C., 1997. The Pesticide Manual, eleventh ed. British Crop Protection Council, Farnham, UK.

Suggest Documents