Computer Simulation Models of the Primary Electron Transfer in Photosynthetic Reaction Centers

by Raymond Yee

B. A. Sc. (University of Toronto) 1990

A dissertation submitted in partial satisfaction of the requirements for the degree of Doctor of Philosophy in Biophysics in the GRADUATE DIVISION of the UNIVERSITY OF CALIFORNIA, BERKELEY

Committee in charge:
Professor David Chandler, Chair
Professor Robert M. Glaeser
Professor Kenneth Sauer

1997

The dissertation of Raymond Yee is approved:

Chair

Date

Date

Date

University of California, Berkeley

1997

Computer Simulation Models of the Primary Electron Transfer in Photosynthetic Reaction Centers

Copyright 1997 by Raymond Yee

Abstract

Computer Simulation Models of the Primary Electron Transfer in Photosynthetic Reaction Centers

by Raymond Yee

Doctor of Philosophy in Biophysics
University of California, Berkeley
Professor David Chandler, Chair

We calculated diabatic free energy surfaces for primary electron transfer (et) in the photosynthetic reaction center (prc) of Rps. viridis through molecular dynamics (md) and continuum electrostatic calculations. Specifically, we studied the sensitivity of free energies of et to the charge states of ionizable amino groups and the distribution of environmental dielectric response. Nuclear polarizability of the complex is modeled through md, while the electronic and remaining nuclear polarizability of the membrane and aqueous surroundings are accounted for through continuum electrostatic calculations.

Some trends hold for every model studied. Primary et along the l branch is favored over that along the m branch. The free energy of P* → P⁺B_L⁻ is large and positive in every model, partially because of the orientation of TyrM208. Models in which ionizable amino groups are set to their pH = 7 value or in which the membrane–embedded ionizable amino groups are neutralized are in greater agreement with experiment than the model in which all residues are neutralized.

We applied these models to compare calculated effective dielectric constants with those derived by Steffen et al. (Science 264, pp. 810–816 (1994)) and to simulate the mutant prc systems of Heller et al. (Science 269, pp. 940–945 (1995)). We do not find the strong differentiation between the l and m dielectric constants described by Steffen et al. The diabatic free energy surfaces calculated for the Heller mutants parallel qualitatively some, but not all, of the observed behavior of the mutants.

Finally, we examined simplified and statistical models for the prc. We studied the distributions of electrostatic contributors, grouped by amino acid residues, to free energies as statistical distributions. These distributions contain large numbers of statistical outliers and are not normally distributed. For simplified models, low order multipole expansions of charges in a residue are accurate for calculating the electric potential at all chromophores except B_L, which depends strongly on the detailed distribution of charge in TyrM208. We probed for statistical correlations between the degree of l–m electrostatic symmetry–breaking for homologous l–m pairs and sequence conservation in the residues of the pairs, but did not find such correlations.

Professor David Chandler
Dissertation Committee Chair


To Mom and Dad, whom I love more than words can say.

Contents

List of Figures
List of Tables

1 Introduction
  1.1 The photosynthetic reaction center
  1.2 Primary electron transfer
    1.2.1 Is the primary transfer a superexchange or a two–step process?
    1.2.2 Why does the transfer proceed down the L branch and not the M branch?
  1.3 Overall issues, questions, and methodology
    1.3.1 The basic physical picture: Marcus theory of electron transfer
    1.3.2 Molecular dynamics
    1.3.3 Continuum electrostatics calculations
  1.4 Overview of the dissertation
  1.A Appendix: The force field in our molecular simulation

2 Computer Models of the Wild Type Reaction Center
  2.1 Introduction
  2.2 Theory and methodology
    2.2.1 Physical model
    2.2.2 Calculating the model parameters
    2.2.3 Dielectric and charge models
  2.3 Results and discussion
    2.3.1 Calculated diabatic free energy surfaces
    2.3.2 Large scale trends in the data and qualitative picture
    2.3.3 Explaining the trends: The various charge and dielectric models
    2.3.4 Large contributors and site–directed mutagenesis
    2.3.5 The energetics of P⁺B_L⁻ and the role of TyrM208
    2.3.6 The free energy of primary transfer: ∆G_13
    2.3.7 L–M asymmetry
    2.3.8 Comments on the validity of the partially neutralized and fully neutralized charge model
  2.4 Conclusions
  2.A Appendix: Error analysis
    2.A.1 Electrostatic interaction with the protein complex E^(CO)
    2.A.2 Reaction field E^(RF)
    2.A.3 Reorganization energy λ
    2.A.4 Vacuum gaps ∆E^(0)
    2.A.5 Overall errors in ∆G

3 Steffen/Boxer Experiments
  3.1 Introduction
  3.2 Theory and methodology
  3.3 Results and discussion
  3.4 Conclusions

4 Mutation Experiments
  4.1 Introduction
  4.2 The Heller double mutant and allied systems
  4.3 Computer simulation of the mutants
    4.3.1 Simulating mutations
    4.3.2 Calculating the diabatic surfaces for mutants
  4.4 Results and discussion
  4.5 Conclusions

5 Statistical Nature and Structural Biology of the Photosynthetic Reaction Center
  5.1 Introduction
  5.2 Statistical model for the residual contributors
  5.3 A multipolar analysis of the reaction center
    5.3.1 General methodology
    5.3.2 Multipolar analysis as sensitivity analysis
    5.3.3 Conclusions for the multipolar analyses
  5.4 Electrostatic correlates to L–M homology
    5.4.1 Motivation
    5.4.2 Theory and method
    5.4.3 Results and discussion
    5.4.4 Discussion and conclusion

6 Concluding Comments
  6.1 Detailed summary of our findings
  6.2 Overall reflections on our work
  6.3 A few concrete suggestions for future work

Bibliography

A A Critical Summary of Some Papers by Warshel and Parson
  A.1 Methodology, results, and consistent themes
  A.2 Interpreting the calculations of Warshel, Parson, et al.
    A.2.1 ∆E_gas and ∆E^(0)
    A.2.2 TyrM208 and the size of ∆G_32
    A.2.3 Modeling of the dielectric environment

List of Figures

1.1 A schematic diagram of the photosynthetic apparatus in Rps. viridis
1.2 The structure of the prc of Rps. viridis
1.3 Central cofactors of the prc of Rps. viridis
1.4 The energetics of charge transfers in the prc of Rps. viridis
1.5 Marcus parabolas
1.6 The pb methodology in overview

2.1 Position of the membrane and the various dielectric boundary conditions
2.2 Diabatic free energy surfaces for the S2:2:80 model
2.3 Diabatic free energy surfaces for the P2:2:80 model
2.4 Diabatic free energy surfaces for the N2:2:80 model
2.5 Green's functions for dielectric models
2.6 Individual contributors to ∆E^(CO)_13 for the S2:2 dielectric models
2.7 Individual contributors to ∆E^(CO)_13 for the S2:2:80 dielectric models
2.8 Individual contributors to ∆E^(CO)_13 for the S2:80 dielectric models
2.9 Cumulative contributions to ∆E^(CO)_13 for the various dielectric boundary conditions (standard charge model)
2.10 Individual contributors to ∆G_13 for the S charge model for the 2:2:80 boundary condition
2.11 Individual contributors to ∆G_13 for the P charge model for the 2:2:80 boundary condition
2.12 Individual contributors to ∆G_13 for the N charge model for the 2:2:80 boundary condition
2.13 Cumulative contributions to ∆E^(CO)_13 for the various charge models (2:2:80 dielectric boundary condition)
2.14 Cumulative sum of ∆E^(CO)_13 in the S2:2:80 model sorted by magnitude of individual contributions
2.15 Histogram of contributors to ∆E^(CO)_13 in the standard charge model (2:2:80 dielectric boundary condition)
2.16 Individual contributors to ∆G_12 for the S model for the 2:2:80 boundary condition
2.17 Individual contributors to ∆G_12 for the P model for the 2:2:80 boundary condition
2.18 Individual contributors to ∆G_12 for the N model for the 2:2:80 boundary condition
2.19 Cumulative contributions to ∆E^(CO)_12 for the various charge models (2:2:80 dielectric boundary condition)
2.20 Definition of Φ
2.21 TyrM208 with Φ = 0° and Φ = 180°
2.22 Φ(t) for the S model
2.23 Φ(t) for the P model
2.24 Φ(t) for the N model
2.25 Electrostatic potential on TyrM208 HH
2.26 Electrostatic potential on TyrM208 HH (2:2:80)
2.27 Electrostatic potential on TyrM208 HH (2:2:80) excluding chromophores
2.28 Glassiness in the convergence of ∆E^(CO)_13 in S2:2
2.29 Estimating errors in E^(RF)
2.30 The convergence of λ_13

3.1 Convergence of ε_eff for the probe chromophores

4.1 A schematic diagram of the energies of interest
4.2 The convergence of λ_13
4.3 The convergence of λ_13

5.1 Normal qnorm plot for contributors to ∆E^(CO)_13 in the S2:2:80 model
5.2 Normal qnorm plot for a truncated distribution of contributors to ∆E^(CO)_13 in the S2:2:80 model
5.3 Normal qnorm plot for contributors to ∆E^(CO)_13 in the N2:2:80 model
5.4 Exponential qnorm plot for contributors to ∆E^(CO)_13 in the S2:2:80 model
5.5 Exponential qnorm plot for contributors to ∆E^(CO)_13 in the N2:2:80 model
5.6 The difference between an exact and monopole expansion at B_L
5.7 The difference between an exact and dipole expansion at B_L
5.8 The difference between an exact and monopole expansion at P_L
5.9 The difference between an exact and dipole expansion at P_L
5.10 The difference between an exact and monopolar expansion at B_M
5.11 Sequence alignment for l and m
5.12 Cumulative distribution for C_i
5.13 Cumulative distribution for C_i (without unpaired residues)
5.14 C_i sorted by pair category

A.1 The dielectric regions for Alden (1995)
A.2 ε_eff for the 2:2:80 boundary condition
A.3 Individual contributors to ∆E^(CO)_32 in the S (hdb) model
A.4 ε_eff for the hdb boundary condition

List of Tables

1.1 Nomenclature for charge states under consideration
1.2 Bendings added for ligand interactions
1.3 Improper torsions added for ligand interactions

2.1 Amino acid residues neutralized in the P models
2.2 Calculated values for ∆E^(CO), E^(RF), and λ
2.3 Free energies of transfer
2.4 Comparison between ∆E^(0)_1i and the Thompson–Zerner estimates
2.5 Rough correspondences to our models
2.6 Residual contribution of TyrM208 to ∆E^(CO)_12 and ∆E^(CO)_13 vs Φ
2.7 Possible scenarios for primary transfer rates in the Y(M208)F mutant
2.8 ∆G_12 with opposite Tyr orientation
2.9 Largest contributors to ∆E^(CO)_13
2.10 Major residual contributors to ∆E^(CO)_3 3
2.11 Our error estimates

3.1 Local dielectric constants from calculations on slb
3.2 Refined calculation of ε_eff for S2:2:80

4.1 λ_13 for the four systems
4.2 λ_13 for the four systems
4.3 ∆∆E^(CO) for β mutant → double mutants
4.4 Calculated ∆∆G and ∆λ for the systems
4.5 Calculated ∆G and λ for the systems

5.1 Statistical features of residual contributors to ∆E^(CO)_13 in S2:2:80
5.2 Alignment of l and m in Rps. viridis
5.3 Categorization of residue pairs
5.4 Alignment pair types
5.5 C_i calculated for pair types

A.1 Free energies of transfer
A.2 Comparing effective dielectric constants for Alden (1995) to those for the 2:2:80 models from ∆G_32
A.3 ε_eff for various dielectric models

Acknowledgments

I owe a debt of gratitude to many people during my years in Berkeley. First of all, I would like to acknowledge the supervision of my thesis advisor, Professor David Chandler, from whom I have learned a great deal about careful, rigorous scientific exploration. He constantly impresses me with his passion and dedication to the scientific craft. I thank Professor Bob Glaeser for his contribution to my graduate career at every step, from preliminary exams to the final steps of dissertation writing. Without his kind and consistent encouragement and advice, especially during the writing phase, I am not sure that I would have been able to finish. Likewise, encouragement from Professor Ken Sauer, my third committee member, has helped me to keep working on my dissertation.

I have been blessed by the social and scientific companionship of the many other members of the Chandler group over the years. Without the tutelage and foundational work of Massimo Marchi, I would not have even been able to start this work. Massimo taught me how to do molecular dynamics and kindly allowed me to use orac in my research. I thank Michael New for his scientific partnership and mentorship as we have worked together on this research, especially for his interpretation of the Steffen–Boxer experiments. John Gehlen, through his consistent, kind encouragement and cheerful disposition, as well as his solid scientific judgment, has been a great blessing to me during my time here. Félix Csajka too has been a great lab buddy to me; I have certainly enjoyed numerous stimulating conversations over lunches and coffees with Félix. Thanks to Alenka Luzar, who has given me astute counsel over the many years. I would also like to thank Zoran Kurtović for the conversations, scientific or otherwise, that have sharpened my understanding of science (and politics). Finally, thanks to other members of the Chandler group who have taught and encouraged me in big and little ways: Kevin Leung, Jordi Marti, Joel Bader, Paul Rejto, David Wu, Mike Deem, Ka Lum, Hyung–June Woo, Carlo Carraro, Lothar Mühlbacher, and all the new members whom I have just started to get to know.

A number of staff people in the College of Chemistry have been instrumental in my graduate work. Tim Robinson taught me how to use avs and much of what I know about three–dimensional computer visualization. More importantly, he was impressively knowledgeable, courteous, and helpful in dealing with a great many of my computing problems. I am thankful for Mary Hammond, who cheerfully helped me through the bureaucracy and hoops of most of my graduate career. I would like to thank Maureen Kamiya for how she has stepped in to make life more efficient and cheerful in the office.

I have been grateful for the flexibility afforded by the Biophysics Graduate Group. Val Adams has helped me to steer my way about this university as much as anyone else has. Professor Lecar has been an encouragement to me also, during both my qualifying exam and the time I was applying for postdoctoral positions.

I would like to acknowledge the stimulating conversations with other workers in the field: Drs. Marilyn Gunner, Barry Honig, and Bill Parson.

Finally, I come to the most daunting task, that of acknowledging everyone else who has been a blessing, in small or large ways, to me in my time at Berkeley. I used to fantasize about writing this section, about thanking each person in a unique and particular way for what he or she has meant to me. Now, I realize that I cannot even begin to do this task properly. I am a vastly different person today than when I first got here. A big reason has been the many people who have loved and cared for me enough to bring about these changes. I can certainly identify broad constituencies of folks involved: my parents and sisters, old and dear friends from Canada, people from the Graduate Christian Fellowship (gcf), fellow residents from the International House, my housemates of the last two years, members and friends of First Presbyterian Church of Berkeley, and co–workers in atdp (the Academic Talent Development Program). May these people (who know who they are, I believe) forgive me for not naming each of them individually.

I have tried to live a life of thankfulness and gratitude. May I be given the wisdom, words, and opportunities over the years to come to continue to individually thank every one of the many tens (even hundreds) of men and women to whom I owe more than they even know.

Chapter 1

Introduction

In 1984, the X–ray crystal structure of the photosynthetic reaction center (prc) of the purple bacterium Rhodopseudomonas viridis (Rps. viridis) was solved by J. Deisenhofer, H. Michel and R. Huber. The prc, a protein complex involved in the first steps of photosynthesis for the organism, is a prototypical system in various contexts. Elucidating the function of this prc has been helpful in understanding photosynthesis in other organisms, including higher plants. Because the prc was the first and remains one of the few membrane protein structures known to atomic level resolution, it is at the heart of the current understanding of membrane proteins in general. Finally, our growing insight into the prc is shedding light on biological electron transfer in general.

Solving the crystal structure was by no means the first study of the photosynthetic reaction center. Much concerning the structure–function relationship had already been surmised, often through rather ingenious experimentation and reasoning. However, atomic level structural information enabled many of these theories to be tested. Perhaps more importantly, the X–ray structure opened whole new vistas of experiments and theories that had previously been inconceivable.

This dissertation is concerned specifically with the primary electron transfer (et), the first of several charge transfers in bacterial photosynthesis. We first construct models for the primary et based on the X–ray crystal structure for the prc and a specific physical theory for et. With these models, we do the following: 1) calculate relevant physical quantities of the models through various computational techniques (primarily, molecular dynamics and continuum electrostatics calculations) and 2) study the fundamental physics of the models through examining the models' statistical and collective properties. Ideally, such an approach would provide physical, computational explanations that would not only be consistent with experimental findings but also explain their physical basis. Hopefully, the work would also stimulate new theoretical and experimental investigations into the primary et.

This introductory chapter provides an overview of and relevant background to the dissertation. The first section is a description of the overall photosynthetic process and the prc specifically. The second section describes the primary electron transfer, its kinetics and the relevant energetics. We first review some of the major questions under study concerning the prc. After providing the relevant experimental background, we then sketch the theoretical and computational techniques used in the dissertation: a brief review of Marcus theory, which underlies our computations, followed by an outline of molecular dynamics and continuum electrostatics calculations. The chapter closes with an outline of the rest of the dissertation.

1.1 The photosynthetic reaction center

Occurring in organisms ranging from bacteria, cyanobacteria to higher plants, photosynthesis is the process by which light energy is converted into biologically useful chemical energy. Light energy fuels electron transfer among light energy– transducing pigments, resulting in the establishment of proton gradients (through the coupling of et with proton transfer.) The energy of these gradients is ultimately translated into the generation of energy rich compounds (such as atp) and high– energy reductants (such as nadp). Although there are myriad variations in the exact mechanisms employed among organisms, two large–scale distinctions can be drawn. Photosynthesis in procaryotes occurs within the membrane of the organism, whereas eucaryotic photosynthesis takes place in specialized organelles. The second distinction is between anoxygenic photosynthesis occurring in photosynthetic bacteria (organisms which cannot oxidize water) vs the oxygenic photosynthesis of cyanobacteria and

CHAPTER 1. INTRODUCTION

3

higher plants, which oxidize water and release oxygen [1]. We focus on photosynthesis in the purple bacterium Rhodopseudomonas viridis, but examine photosynthesis also in two other bacteria, Rhodobacter capsulatus and Rhodobacter sphaeroides. Photosynthesis in purple bacteria is carried out by a series of related transmembrane proteins. (See Figure 1.1) The antenna complex focuses and funnels light energy to the prc. In the prc, this light energy powers the movement of electrons through the complex, from the periplasmic to the cytoplasmic side of the membrane. When two electrons reach QB , the doubly reduced QB picks up two protons, leaving the prc as dihydroquinone, QH2 . The protons are transported across the membrane with the help of the cytochrome bc1 complex to establish a charge gradient that is used to fuel the production of high energy chemical compounds. The prc of Rps. viridis has been determined to 2.3 ˚ A resolution by X–ray crystallography [3]. In this work, we also refer to two other species of purple bacteria. Published structures for the prc of the wild-type strain of Rb. sphaeroides are at 2.65 ˚ resolution [4, 5].1 (Other strains of Rb. sphaeroides have also been crystallized: RA 26 (to 2.8 ˚ A) or 3.2 ˚ A and strain Y (to 3 ˚ A) [7].) Many studies (especially site-directed mutagenesis) have been conducted on Rb. capsulatus although the structure for the prc of this species has yet to be solved. Our models of the prc for Rps. viridis include approximately 12000 atoms, although only 10288 of these atoms are resolved by X-ray crystallography. The prc ˚ in the diis a membrane bound protein complex, measuring approximately 130 A rection perpendicular to the membrane and 70 ˚ A in girth. It comprises four protein subunits (denoted as l, m, h, c) and fourteen cofactors (See Figure 1.2). 
The cofactors are four bacteriochlorophyll b molecules, two bacteriopheophytin b molecules, one menaquinone 9, one ubiquinone 9, one ferrous iron ion, a carotenoid 1,2–dihydroneurosporene, and four heme groups covalently linked to the cytochrome (c). Two of the bacteriochlorophyll b molecules are closely associated as a dimer, called the special pair (SP). The two other bacteriochlorophyll b (BL and BM ) and the two bacteriopheophytin b molecules (HL and HM ), along with two of the encapsulating proteins (l 1

Douglas Rees, in collaboration with the research group of George Feher, has been studying higher resolution structures [6]. [Private communication, 1997]

CHAPTER 1. INTRODUCTION

4

Periplasm cyt b-c1 complex reaction center H+ e-

H+ light ATPase

SP BM

Antenna BL

HM QB QH 2

HL 2e-

QA

2H+

ATP, etc. Cytoplasm Figure 1.1: A schematic diagram of the photosynthetic apparatus in Rps. viridis This diagram shows the relationship among the various proteins involved in the light and dark cycles of Rps. viridis. The antenna complex focuses and funnels light energy to the prc. In the photosynthetic reaction center (prc), this light energy powers the movement of electrons through the complex, from the periplasmic to the cytoplasmic side of the membrane. Within the prc, two of the bacteriochlorophyll b molecules are closely associated as a dimer, called the special pair (SP). The two other bacteriochlorophyll b (BL and BM ) and the two bacteriopheophytin b molecules (HL and HM ) form two near–C2 symmetric branches. Also shown are the two quinones (a menaquinone, QA , and a ubiquinone, QB ) When two electrons reach QB the doubly reduced QB picks up two protons, leaving the prc as dihydroquinone, QH2 . The protons are transported across the membrane with the help of the cytochrome bc1 complex to establish a proton gradient that is used to fuel the production of atp. Recent studies indicate that the antenna surrounds the prc in the membrane [2].

CHAPTER 1. INTRODUCTION

5

and m) form two near–C2 symmetric branches. The two symmetry–related quinones (QA and QB ) are also part of these branches. Figure 1.3 is a detailed display of the bacteriochlorophyll, bacteriopheophytin, quinone molecules, and non–heme iron ion. In addition, there are 201 crystallographically resolved water molecules in the Rps. viridis structure. We do not include in our models, any water molecules that are not resolved in the X–ray crystal structure. The h–subunit extends from the cytoplasmic membrane surface, while the c–subunit sits on the periplasmic membrane side [8]. In addition to the interior of the prc, we need to model the surroundings of the complex. Functional studies have provided evidence for the close proximity of the prc and lhi (light harvesting complex I) in purple bacterial photosynthetic membranes. Some workers have argued that a single prc unit is surrounded by a ring of lhi units, while others contend that a single prc fits inside the closed ring of a single lhi unit [2]. Because of the computational techniques used to calculate the electrostatics of the prc (see Section 1.3.3), we are limited to modeling the regions surrounding the prc in terms of two homogeneous dielectric regions. In other words, we do not attempt to model the exterior in atomic level detail. Specifically, we approximate the prc as being embedded in a homogeneous lipid bilayer membrane and surrounded by an aqueous environment. Therefore, relevant parameters to the computational modeling are the size and location of the membrane. We turn to two studies for help. Through low-resolution neutron diffraction on the Rps. viridis prc crystals, Roth et al. studied the detergent involved in the packing of the crystals. They found that the detergent was concentrated around the membrane–spanning α helices of the l, m, and h proteins subunits. Roth et al. 
argued that the width of the contact surface between the helices and detergent (25–30 Å) is a good estimate for the thickness of the membrane in which the prc is embedded. In contrast, Yeates et al. estimated the thickness and location of the membrane surrounding the Rb. sphaeroides prc by calculating the membrane position that minimizes the energetic interaction between the prc and a model lipid bilayer; they found a width of 40–45 Å [9].

Figure 1.2: The structure of the prc of Rps. viridis. The four protein strands of the prc (h, c, l, and m) are displayed as ribbons. The bacteriochlorophyll, bacteriopheophytin, and quinones, as well as four hemes embedded in c, are drawn in space–filling fashion. Notice the near–C2 symmetry of the central core, whose axis runs from the special pair to the non–heme iron (represented as a sphere and labelled Fe).

Figure 1.3: Central cofactors of the prc of Rps. viridis. The central cofactors include four bacteriochlorophyll b, two bacteriopheophytin b, and two quinone molecules. Two of the bacteriochlorophyll b molecules are closely associated as a dimer, called the special pair (SP). The two other bacteriochlorophyll b molecules (B_L and B_M) and the two bacteriopheophytin b molecules (H_L and H_M), along with two of the encapsulating proteins (l and m), form two near–C2 symmetric branches. Also shown are the non–heme iron and the two quinones (a menaquinone, Q_A, and a ubiquinone, Q_B).

A striking feature of the prc is its near–C2 structural symmetry. There is approximate local two–fold symmetry, whose axis is perpendicular to the membrane. (This symmetry is reviewed by Bixon et al. and Deisenhofer [7, 8].) First, the 216 α–carbons of m can be roughly superimposed onto the corresponding α–carbons on the l side by a near 180° rotation with an rms deviation of 1.22 Å [10, 11, 12]. Second, there is a similar, but not identical, symmetry involving the cofactors. According to Allen et al. [13] and Komiya et al. [10], the cofactor rings of one branch can be mapped to the other within approximately 1 Å by a two–fold rotation. However, as pointed out by Deisenhofer et al. [8], the exact rotation operations that map a given cofactor to the corresponding symmetry–related chromophore differ slightly from one another. Finally, there exists within the prc of Rps. viridis a local near–C2 symmetry unrelated to the symmetry within the core region of the prc [8]. The observed C2 symmetry is generally thought to be a reflection of the homology between the l and m protein strands, derived from a common evolutionary ancestry [14]. (Refer to Chapter 5 for a detailed account of homology as applied to the prc.)

The structure of the prc for Rps. viridis has much in common with that of Rb. sphaeroides. Both possess a similar near–C2 symmetry (as described above) and the same basic arrangement of chromophores and protein strands. The major differences are that the Rps. viridis prc has the fixed cytochrome unit (with the four associated heme groups), that its bacteriochlorophylls are of type b rather than type a as in Rb. sphaeroides, and that Q_A in Rps. viridis is a menaquinone instead of a ubiquinone [7].

The overall function of the prc is to facilitate a series of electron transfers that ultimately fuel a transmembrane proton gradient. The energy required to create this gradient is absorbed from the antenna complex and powers a series of charge transfers that form various charge states of the prc. In this dissertation, we make use of the nomenclature outlined in Table 1.1 to denote the various charge states of interest. We will also refer to the free energy of electron transfer (∆G).
The free energy of et from state i to state j is denoted ∆Gij = Gj − Gi, where Gi and Gj are the free energies of states i and j, respectively. For example, ∆G13 is the free energy difference between P⁺H_L⁻ and P*. We use a similar convention for other energy terms mentioned in subsequent chapters.

Table 1.1: Nomenclature for charge states under consideration

State number    Charge state
0               P
1               P*
2               P⁺B_L⁻
3               P⁺H_L⁻
2′              P⁺B_M⁻
3′              P⁺H_M⁻
4               P⁺Q_A⁻
5               P⁺Q_B⁻

P denotes the ground state of the special pair, while P* denotes the photoexcited state of the special pair. The numbering system (1 to 5, 2′ to 3′) is used to refer to the different charge states. P⁺B_L⁻ denotes the charge–separated state with the electron on the accessory bacteriochlorophyll on the l branch. P⁺H_L⁻ denotes the charge–separated state with the electron on the bacteriopheophytin on the l branch. P⁺B_M⁻ and P⁺H_M⁻ are the corresponding charge states for the m branch chromophores. P⁺Q_A⁻ represents the charge–separated state with the electron on Q_A, while P⁺Q_B⁻ represents the charge–separated state with the electron on Q_B.

Upon photoexcitation, the special pair (SP) changes from the ground electronic state (P) to the first excited state, the lowest excited singlet state (P*). On a roughly 3.5 ps timescale, an electron is transferred from SP to H_L to form P⁺H_L⁻ (the primary electron transfer). In successive steps, the electron moves from H_L to Q_A to form P⁺Q_A⁻ on a 200 ps timescale and then to Q_B, forming P⁺Q_B⁻ on a 200 ns timescale. When a second electron reaches Q_B in the same series of transfers starting from SP, the two electrons, together with two protons drawn from the cytoplasmic side of the membrane, are shuttled through the membrane to the periplasmic side. This series of charge transfers is illustrated schematically in Figure 1.4.

Figure 1.4: The energetics of charge transfers in the prc of Rps. viridis. Various electron transfer states of interest in the primary transfer (P, P*, P⁺H_L⁻, and P⁺Q_A⁻) are plotted versus their free energy, ∆G (kcal/mol); the steps in the plot are labelled 3 ps and 200 ps. (The free energy of P is taken to be the zero of energy.) P* is approximately 32 kcal/mol above P [15]. The value of ∆G13 is discussed in Section 1.2. Here, it is displayed as -6 kcal/mol. Estimates for ∆G14 include -18.4 kcal/mol [16], -19.8 kcal/mol [17] and -20.1 kcal/mol [15].
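
As a rough illustration of the convention ∆Gij = Gj − Gi, the level positions plotted in Figure 1.4 follow from adding each ∆G to the free energy of P*. The sketch below uses only the values quoted in the caption; the variable and function names are hypothetical, chosen for illustration.

```python
# Free energies relative to the ground state P, in kcal/mol.
# All numerical values are those quoted in the caption of Figure 1.4;
# the names below are illustrative only.
G_P_STAR = 32.0  # P* lies approximately 32 kcal/mol above P [15]


def level_from_p_star(delta_g_1j):
    """Return G_j from Delta G_1j = G_j - G_1, where state 1 is P*."""
    return G_P_STAR + delta_g_1j


dg13 = -6.0                          # value displayed in Figure 1.4
dg14_estimates = [-18.4, -19.8, -20.1]  # refs [16], [17], [15]

g_p_h = level_from_p_star(dg13)
g_p_qa = [level_from_p_star(dg) for dg in dg14_estimates]

print(f"P+ H_L-: {g_p_h:.1f} kcal/mol above P")
for dg, g in zip(dg14_estimates, g_p_qa):
    print(f"P+ Q_A-: {g:.1f} kcal/mol above P (Delta G14 = {dg})")
```

With these inputs, P⁺H_L⁻ sits 26 kcal/mol above P, and the three estimates of ∆G14 place P⁺Q_A⁻ between roughly 12 and 14 kcal/mol above P.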

1.2 Primary electron transfer

Although much is known about the primary transfer, a number of important and as yet unsettled questions remain. In this section, we first review what is known (and considered noncontroversial) about the primary transfer and then review the debate over the key questions surrounding it.

Early experiments on the primary transfer [18], using picosecond pump–probe absorption spectroscopy, supplied evidence that the primary et occurs on a 3 ps exponential timescale (that is, the electronic population of P* decays exponentially with a time constant of 3 ps as the electron is transferred to H_L). Later experiments [19] led researchers to consider the transfer kinetics to be non-exponential. Roughly speaking, there are at least two relevant timescales, approximately 1 and 10 ps.

Figure 1.4 shows the energetics of various states. For only one of the charge states possibly associated with the primary transfer is the free energy known with some accuracy: ∆G13, the free energy of P* → P⁺H_L⁻. However, controversy surrounds the measurement of even this number. Estimates for ∆G13 have varied from -2.8 kcal/mol [20] to -4 kcal/mol [21, 16] to -6 kcal/mol [22]. Woodbury et al. measured the delayed fluorescence amplitude associated with the activated reaction P⁺H_L⁻ to P* [21, 23], while Goldstein et al. measured the free energy change of the back reaction (³P to ³P⁺H_L⁻), combined with a knowledge of the energy of P directly to obtain ∆G13 [22].

A possible reason for the discrepancy among the various measurements is the timescale associated with the particular spectroscopic technique. Indeed, Peloquin et al. [20] observe time–dependent behavior associated with their measurement of ∆G13: the magnitude of ∆G13 grows until it reaches 70% of the -6 kcal/mol measured by Goldstein et al. Boxer argued that this figure must represent a maximal value for the driving force [15]. There are also some hints of temperature dependence in ∆G13 [20, 15]. (A review of the measurement of ∆G13 is given by Peloquin et al. [20].)

Knowing the energetics of other charge states is key to understanding the primary transfer. There are no experimentally based estimates of the free energies of transfer down the m branch. Estimates of ∆G12 are hotly contested because they are intimately tied to the related issue concerning the role of the accessory bacteriochlorophyll. (See Section 1.2.1 for a full discussion.) There is, however, consensus that ∆G12 > ∆G13. Estimates for ∆G12, based on various assumptions, include -1.4 kcal/mol [24], 0.2 ± 0.6 kcal/mol [25], 19 kcal/mol [26], 2–9 kcal/mol [27], and roughly -3 kcal/mol [28, 29, 30].

The rate of the primary et is observed to increase as the temperature decreases. At 10 K, the initial decay rate of P* is three to four times faster than it is at room temperature [18]. (As we elaborate in Section 1.3.1, this temperature–dependent behavior implies that the primary transfer is an almost activationless process.) Moreover, within experimental uncertainty, the quantum yield of the primary transfer is unity (at least, more than 98%) [17].² That is, the photoexcited state P* always results in electron transfer (instead of decay back to the ground state). Finally, the primary transfer occurs down the l branch chromophores (to form P⁺H_L⁻) rather than down the m branch (to form P⁺H_M⁻) with a ratio of at least 200 : 1 [31].

Despite the wealth of experimental knowledge concerning the primary electron transfer, a number of important unresolved questions remain. We enumerate them here, describing the specific controversy and the current state of the debate surrounding each question.

² Note that this result applies only to the case of low light intensity, in which reaction centers are mostly “open.” In high light intensity, there can be saturation in the l states, which prevents et from proceeding normally along them.

1.2.1 Is the primary transfer a superexchange or a two–step process?

In the primary transfer, the electron is clearly transferred from SP to H_L. A natural question is whether this transfer involves any intermediates. A calculation of the expected rate of direct in vacuo electron transfer occurring over a distance of 17 Å (the center–to–center distance between SP and H_L) shows that such a transfer would be several orders of magnitude slower than the observed experimental kinetics [32, 33]. It is believed that the primary et must therefore involve some intermediary. Various intermediates have been conjectured, including TyrM208 (situated very close to all three of the l side chromophores, P_L, B_L, and H_L). However, it is thought that the reduction of TyrM208 would be too energetically costly for the residue to act as the intermediate [34, 35]. B_L is now generally thought to be the intermediary in the primary et. The present debate has thus shifted away from the identity of the intermediary to its exact role in the primary transfer.

Two dominant competing hypotheses concerning the role of B_L have been proposed. The first is the so-called two–step mechanism, in which the primary transfer is postulated to involve two explicit charge transfer steps: the electron hops to B_L, reducing the chromophore and forming a real intermediate electronic state (P⁺B_L⁻), before ultimately arriving at H_L. The alternative scenario involves the superexchange mechanism. (A more quantitative explanation of superexchange is given in Section 1.3.1.) In the superexchange picture, P⁺B_L⁻ is a virtual state: “The accessory Bchl couples to the initial (P*) and final states (P⁺H_L⁻) with some of its higher energy orbitals to mediate the electron transfer.” In other words, “P⁺B_L⁻ is not a true chemical intermediate.” [36]

The debate between the two–step and the superexchange camps continues to be one of the most heated debates concerning the primary transfer. No definitive conclusion has been reached. Knowing ∆G12 (the thermodynamic driving force between P* and P⁺B_L⁻) would be helpful in resolving the debate between the two–step and superexchange hypotheses. In order for an explicit intermediate to be formed, the free energy of P⁺B_L⁻ must be no more than several kcal/mol above that of P* (it can be below that of P*). In contrast, superexchange is possible even when the energy of P⁺B_L⁻ is much higher than that of P*. As mentioned above, there is no general agreement concerning the free energy of P⁺B_L⁻. Computational and theoretical efforts have been directed at calculating ∆G12, but conclusions have been contradictory. The calculations of Warshel, Parson, et al. [28, 29, 30] place ∆G12 at roughly -3 kcal/mol, Marchi et al. [26] calculate it to be +19 kcal/mol, and Gunner et al. estimate ∆G12 to be about 2 to 9 kcal/mol [27]. Warshel, Parson, et al. contend that a two–step mechanism is most likely (although they do not rule out a superexchange mechanism), while the results of Marchi et al. and Gunner et al. definitely point to a superexchange picture.

Proponents of the superexchange picture argue that the explicit intermediate state
P⁺B_L⁻ has never been directly observed spectroscopically (that is, no spectrally distinct states between hundreds of fs and 15 ps have been observed at low temperature) [37]. However, others disagree with this assessment (see the discussion of the two–step mechanism below). The vibrational coherence of P* decays at the same rate as that at which P⁺H_L⁻ is formed [38, 39]. In the DLL mutant in Rb. capsulatus, there is no functional H_L, yet there is still no sign of transient reduction of B_L [38]. Woodbury et al. have devised a method for changing the number of hydrogen bonds to SP, thereby shifting the energy of P* in Rb. sphaeroides. They constructed a “double mutant” [40] in which the energy of P* is estimated to be lowered by 3.2 kcal/mol. They conclude that a sequential mechanism is inconsistent with the pattern of complex decay of P* seen at low temperature for the mutant and with the multitude of timescales in fluorescence and absorption changes. Furthermore, Woodbury et al. constructed a triple mutant (with the addition of 3 hydrogen bonds), which presumably involves a shift of 6 kcal/mol [37]. Such a shift would be expected to make P* isoenergetic with P⁺H_L⁻ and to place P⁺B_L⁻ significantly above P*. They argue that a sequential mechanism
could not be at work in this mutant and that this mechanism would probably not be operational in wt. Finally, Boxer and coworkers measured the angle between the fluorescence of P* and the absorption transition moment to identify the state against which P* is competing. The observed angle matches that of P⁺H_L⁻ more closely than that of P⁺B_L⁻; hence, they concluded that their experiments pointed to the superexchange mechanism [15, 41, 42].

Supporters of the two–step mechanism for the primary transfer counter that recent femtosecond transient absorption experiments clearly reveal transient reduction of the intermediate state P⁺B_L⁻ [43, 44, 24, 45]. Moreover, Kirmaier et al. claim that recent experiments on the β mutant lend support to the two–step mechanism. They argue that P⁺I⁻ in the β mutant, a purportedly quantum mixed state of both P⁺B_L⁻ and P⁺β⁻ (the analog to P⁺H_L⁻ in the wt systems), is 1.7 ± 0.8 kcal/mol below P*. Hence, P⁺B_L⁻ in the wt system must be close in energy to P⁺H_L⁻, making the two–step model a strong possibility [46, 47, 48]. Arlt et al. created the Rps. viridis mutation (H (L168) F),³ resulting in a loss of a hydrogen bond to B_L that they claim decreases ∆G12 by 1.8 kcal/mol. They report a speed-up in the primary et (from 3.5 ps to 1.1 ps), strong bleaching, and clear spectroscopic signs of P⁺B_L⁻ [25]. Finally, Moser et al. applied external electric fields expected to shift ∆G13 by up to 2.3 kcal/mol. Because the change in the primary transfer they observed was much smaller than what could be expected for a one–step superexchange mechanism, they argue that their results would be more consistent with a two–step mechanism [49].

³ The notation used to describe site–directed mutagenesis in this dissertation is of the form: original amino acid (residue location) new amino acid. In this specific case, residue #168 of l is mutated from a histidine (H) to a phenylalanine (F).

1.2.2 Why does the transfer proceed down the L branch and not the M branch?

As described in Section 1.1, the prc is characterized by a high degree of near–C2 rotational structural symmetry. At the same time, the primary et is an example of marked C2 asymmetry; the electronic population traveling down the l side is at least 200 times greater than that going down the m side [31]. Moreover, removal of H_M does not significantly change the rates or the yield of P⁺H_L⁻ [18]. Explaining how such extreme functional asymmetry can exist despite strong structural symmetries is certainly one of the outstanding questions concerning the prc. Many hypotheses purport to explain the l–m functional asymmetry.

Some workers have hypothesized that subtle structural asymmetries in the chromophores (rather than the protein environment) are the key cause of the preference for et down the l branch over the m branch. Plato et al. [50] calculated the electronic coupling between adjacent chromophores. With these couplings and estimates for the reorganization energies and free energies of transfer (assuming a superexchange mechanism), they calculated that et down the l branch is favored over that down the m branch in Rps. viridis by a factor of 33. However, this type of calculation could not account for the asymmetry of Rb. sphaeroides. DiMagno et al. [36] conclude that it is unlikely that differences in electronic coupling could be the sole or even dominant mechanism for l–m asymmetry.

Perhaps it is small structural differences between l branch and m branch chromophores that are key to breaking functional symmetry. An obvious place to begin looking is the special pair. The van der Waals overlap [51] between P_M and B_L is larger by a factor of 1.5 than the overlap between the symmetry–related pair [7]. Moreover, based on their INDO/S calculations, Plato et al. [50] argue that the electron is mostly localized on P_M in P*. The picture that forward et is the ejection of an electron from P_M to the l side chromophores is not contradicted by these two observations. This asymmetry was also calculated to be strongly dependent on surrounding amino acids. One would then expect to test the model by altering the amino acid environment of SP. The heterodimer mutants are one such example. The mutant H (M202) L causes P_M to be replaced by a bacteriopheophytin, while the complementary mutant H (L173) L causes P_L to be so substituted. Since Bchl is easier to oxidize than BPhe [7], one would expect the electron to leave mostly from P_L in the H (L173) L mutant and from P_M in the H (M202) L mutant. Because of the asymmetry in the overlap, one would then expect a profound difference in the rate of et between the mutants. However, the rates differ by a factor of only roughly 3, reflecting the robustness of the system to changes [7]. McDowell et al. also conclude, therefore, that the differential overlap in the chromophores is not the primary mechanism for l–m asymmetry [52].

Others [26, 28, 27] have focused on the protein environment surrounding the chromophores, suggesting that it shifts any intrinsic energetics of et to favor transfer down the l side. Boxer et al. have applied external electric fields to determine whether such energy differences could be at work. Since the dipole moments of P⁺B_L⁻ and P⁺B_M⁻
are approximately anti-parallel,⁴ electric fields that raise the energy of one state are expected to lower the energy of the other state in any given reaction center. The isotropic samples used by Lockart et al. would yield a distribution in shifts for the energy of P⁺B_M⁻ with respect to P⁺B_L⁻. Specifically, with the applied external field of 10⁶ V/cm = 10⁻² V/Å (equivalent to 4.6 kcal/mol for a unit charge over a 20 Å distance), et to P⁺H_M⁻ would be expected in some number of appropriately oriented reaction centers, if there were only small differences in free energy between the two states. Since P⁺H_M⁻ has not been observed, not even reaction centers with the greatest possible lowering of the energy of P⁺B_M⁻ with respect to P⁺B_L⁻ (due to the external field) demonstrate m branch et. The possibility that there is a large energetic difference between P⁺B_L⁻ and P⁺B_M⁻ is left open by these experiments [15, 53, 16].

⁴ The angle between them is estimated to be about 155° [53].

By comparing electric fields at the chromophores in the P* state with fields existing in the charge–separated P⁺Q_A⁻ state, Steffen et al. [54] estimated an effective dielectric constant at the chromophores. They found a stronger dielectric screening at the l branch chromophores than at the m branch chromophores and concluded that differential dielectric strength is a possible mechanism for the asymmetry. (Refer to Chapter 3 for a detailed discussion of this study.)

Various experiments have been directed at locating particular amino acids that either individually or collectively are responsible for the l–m asymmetry. Other investigators have argued, on the contrary, that the l–m asymmetry is not due to the behavior of a small number of amino acids, but rather is the result of large–scale collective effects [7].

A fair amount of work has gone into examining the hypothesis that the protein environment is responsible for the asymmetry in et. Site–directed mutagenesis has been used to perform both small–scale and large–scale alterations. The l and m proteins are considered to be homologous proteins. Sequence alignments have been performed to match residues on one protein with the corresponding residues on the other protein. Pairs of matched residues involving different amino acids are of interest to experimentalists. These amino acid pairs might be the ones that break the symmetry of the protein environment in a way that leads to asymmetric et between the l and m sides of the prc. The mutation Glu (L104) L [55] removes a hydrogen bond to the l side H_L, which is not present on the m side. It is estimated that this mutation would make H_L 50 meV (approximately 1 kcal/mol) harder to reduce [16]. Nevertheless, et continues down the l branch.

Perhaps the greatest amount of work aimed at elucidating the role of a single amino acid is that involving TyrM208 (in Rb. capsulatus and in Rps. viridis) and TyrM210 (in Rb. sphaeroides).
We use the M208 numbering since our studies focus on the Rps. viridis prc. TyrM208 has been of particular interest for two reasons. First, the residue is situated close to all three of the l side chromophores, the sites of the primary et. Second, the difference between the residue and its l side symmetry–related residue, Phe L181, was conjectured to be a key source of the functional l–m asymmetry of the prc. Moreover, TyrM208 was identified by calculations of Warshel, Parson, et al. to be significant in lowering the energy of P⁺B_L⁻ with respect to P* [28].

Consequently, to examine the importance of TyrM208 (and Phe L181) to the primary et, various mutations on either or both residues were made. The tyrosine was changed to phenylalanine [56, 57, 58, 59, 60], tryptophan [61, 62, 56], isoleucine [56, 60], and histidine [57]. Moreover, the symmetry–related pair was swapped [57]. The kinetics of the primary et were changed, but the primary et still proceeded down the l side. Hence, it was demonstrated that these two residues are not solely responsible for the l–m asymmetry in et. (See Section 2.3.5 for further discussion of mutations related to TyrM208.)

So many small–scale alterations had been performed without any major changes in the primary kinetics that workers began to lean towards the hypothesis that the l–m asymmetry is a result of global structural features of the prc. A possible exception to that generalization is a set of mutants by Heller, Holten, and Kirmaier [63]. A detailed description of this system is given in Chapter 4. The Heller double mutant involves two amino acid changes: L (M212) H and G (M201) D. Up to 15% of the electronic population of P* is thought to decay to P⁺H_M⁻.

To explore the hypothesis that the functional asymmetry of the prc is the result of the interactions of many (rather than few) amino acids, a number of studies have been aimed at replacing entire α–helices to increase the degree of symmetry between the l and m proteins. The largest changes reported thus far are a set of nine large–scale symmetry mutants constructed in Rb. capsulatus by replacing m subunit genes with homologous parts of l [64, 65]. The amino acid residues in question comprise about 80% of the residues that come into close contact with the central cofactors. The primary kinetics are not strongly changed by the mutations around the cofactors. However, the organism does lose photosynthetic viability in mutations of some amino acids around the quinones.
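
Several energies in Section 1.2.2 are quoted in mixed units: the external field of 10⁶ V/cm acting over 20 Å, and the 50 meV shift attributed to the Glu (L104) L mutation. The following minimal sketch checks both conversions, assuming a unit charge and the standard factor of about 23.06 kcal/mol per eV; the function names are hypothetical and are not taken from the cited experimental work.

```python
# Unit conversions for energies quoted in Section 1.2.2.
# Assumption: 1 eV per particle corresponds to about 23.06 kcal/mol.
KCAL_PER_MOL_PER_EV = 23.0605
ANGSTROM_PER_CM = 1.0e8


def field_energy_kcal_per_mol(field_v_per_cm, distance_angstrom, charge_e=1.0):
    """Energy change of a charge moved a given distance along a uniform field."""
    field_v_per_angstrom = field_v_per_cm / ANGSTROM_PER_CM
    potential_drop_v = field_v_per_angstrom * distance_angstrom
    # charge (in units of e) times volts gives eV; convert to kcal/mol
    return charge_e * potential_drop_v * KCAL_PER_MOL_PER_EV


def mev_to_kcal_per_mol(energy_mev):
    """Convert an energy in meV per particle to kcal/mol."""
    return energy_mev / 1000.0 * KCAL_PER_MOL_PER_EV


print(f"{field_energy_kcal_per_mol(1.0e6, 20.0):.2f} kcal/mol")  # field over 20 Å
print(f"{mev_to_kcal_per_mol(50.0):.2f} kcal/mol")               # 50 meV shift
```

The first value reproduces the 4.6 kcal/mol quoted for the applied field, and the second shows that 50 meV is slightly more than 1 kcal/mol.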

1.3 Overall issues, questions, and methodology

In the previous sections, we describe the structure and function of the prc after first placing the prc in the context of the entire bacterial photosynthetic process. After discussing the well-quantified experimental knowns of the system, we focus on the debates that surround the major open questions. In the next sections, we turn to describing the approaches we take in the dissertation in tackling these questions. In Section 1.3.1, we describe the Marcus theory of electron transfer, the physical theory which provides the conceptual framework for our entire study. The application of Marcus theory to the prc points to the importance of a number of physical quantities to the et process—in particular, the electrostatic potentials and fields within the prc. Calculating these electrostatic properties requires two computational techniques (molecular dynamics and continuum electrostatics calculations), which we describe in Sections 1.3.2 and 1.3.3, respectively. With this physical picture and these computational tools in hand, we hope then to calculate functional properties of the prc which would not only agree with relevant experimental findings but supplement experiments by opening new avenues of exploration.

1.3.1 The basic physical picture: Marcus theory of electron transfer

Electron transfer (et) is one of the most important classes of elementary chemical and biochemical processes. It encompasses a vast range of phenomena, from basic redox reactions (such as rusting and the working of batteries) to the coordinated movement of electrons in exquisitely structured biomolecular complexes. Biological electron transfer itself, of which electron transfer in the prc is a famous example, spans many systems, including cytochrome c oxidase (involved in aerobic respiration) and flavocytochrome b2 (which catalyzes the transfer of electrons from L-lactate to cytochrome c) [66]. Because et encompasses such a wide range of phenomena, explaining et in one coherent theoretical framework has been a deep challenge. The single seminal advance has probably been the development of Marcus theory [32], which has revolutionized our understanding of electron transfer. The discussion that follows includes aspects of the theory relevant to et in the prc. (Some of the following argument is close in spirit to that found in John N. Gehlen’s dissertation [67].)

In the following discussion, we consider a physical system in which an electron can be in one of two positions: on a “donor” site or on an “acceptor” site. We can, therefore, define two states for the system: the donor (D) state, in which the electron is on the donor site, and the acceptor (A) state, in which the electron is on the acceptor site. If the donor and acceptor molecules are in vacuum, the physics of the system can be elucidated by solving the quantum mechanics of a two–state Hamiltonian system. The result is well known: the electronic density resonates between the two states. In other words, the electronic population is delocalized over the two states.

Electron transfer between two redox sites embedded in a polar solvent behaves differently. The polar solvent reorients in response to a charge state, solvating the charge complex and thus lowering the energy of that complex. As the magnitude of this solvation energy increases, the electron becomes increasingly localized. In the strongly solvated regime, a well-defined forward rate constant for et can be defined. Hence, electron transfer occurs along a continuum of conditions, from strong charge delocalization to strong charge localization.

Consider the energy of the electron, due to interactions with the rest of the system, in the donor state ($E_D$) and acceptor state ($E_A$). As the spatial configuration of the solvent fluctuates, $E_D$ and $E_A$ vary in time. Marcus postulated that in the strongly solvated regime, only when the two charge states are isoenergetic (i.e., $E_D = E_A$) is there a significant probability for an electron to transfer from one site to the other. The polar solvent shifts the intrinsic in vacuo energy levels for the charge states.
When the electron is on the donor, the solvent is polarized to be in an energetically favorable equilibrium configuration with respect to the electron-donor complex. However, as the solvent fluctuates (through nuclear motion of the solvent), it occasionally brings the two charge states into degeneracy, at which time the electron has an opportunity to hop (with some probability) to the acceptor. If electron transfer does occur, the solvent then re-equilibrates to the newly formed electron-acceptor complex. Hence, the kinetics of et depends on two factors: 1) the rate (or probability) at which
the difference between the energy levels of the two states (the so-called energy gap) is zero and 2) the probability factor (p) of actually hopping once the energy gap is zero. Implicit in the first factor is a free energy barrier between D and A, which helps determine the probability of the energy gap’s being zero. The second factor (p) depends on the electronic coupling (K) between the two states.

We can identify two different sub–regimes of electron transfer: the so-called diabatic and adiabatic regimes. The diabatic regime is the regime of small K. That is, the typical zero-crossing of the energy gap does not involve an electron transfer because p ≪ 1. In the diabatic regime, the natural basis set of charge transfer states is the charge–localized set. In contrast, the adiabatic regime is that of large K. Below, we concentrate exclusively on the diabatic regime because we believe that the diabatic regime adequately describes the et of the primary transfer [67].

As mentioned above, the rate constant of electron transfer is related to the rate of occurrence of zero crossings of the energy gap between the donor and acceptor states and to the hopping probability p. The energy gap is a time–dependent quantity, fluctuating about its mean value with some probability distribution. The heart of Marcus theory is the quantitative statement that this distribution is a gaussian (or normal) distribution. (See Equation 1.2 below.) Amazingly enough, this simple assumption implies a formula for the rate of et that has been successfully applied over a wide range of et reactions. (A rough intuitive justification can be given for this gaussian hypothesis. For instance, because the energy gap is the sum of numerous individual contributions, one might expect the energy gap to approach a gaussian distribution because of the central limit theorem.⁵)

With a probability distribution for the collective coordinate (the energy gap), one can derive a free energy (F) for this coordinate. We will do so here.
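
The central-limit argument for the gaussian hypothesis can be illustrated numerically. The sketch below is a toy model, not a simulation of the prc: each "contribution" to the energy gap is assumed to be an independent uniform random number, and the excess kurtosis of the summed gap (zero for a gaussian) is used as a simple measure of how gaussian the distribution has become. All names and parameter choices are hypothetical.

```python
import random
import statistics


def sample_energy_gap(n_contributions, rng):
    """One toy 'energy gap': a sum of independent, non-gaussian contributions."""
    # Each contribution is uniform on [-1, 1] (an arbitrary illustrative choice).
    return sum(rng.uniform(-1.0, 1.0) for _ in range(n_contributions))


def excess_kurtosis(samples):
    """Fourth standardized moment minus 3; this is 0 for a gaussian."""
    mean = statistics.fmean(samples)
    variance = statistics.pvariance(samples, mu=mean)
    fourth_moment = statistics.fmean([(x - mean) ** 4 for x in samples])
    return fourth_moment / variance**2 - 3.0


rng = random.Random(0)  # fixed seed so that the run is repeatable
for n in (1, 4, 64):
    gaps = [sample_energy_gap(n, rng) for _ in range(10000)]
    print(n, round(statistics.pvariance(gaps), 3), round(excess_kurtosis(gaps), 3))
```

For a single uniform contribution the excess kurtosis is close to -1.2; as the number of contributions grows it approaches zero, which is the behavior the central limit theorem predicts for independent contributions.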
Footnote 5: Let $x_1, \ldots, x_n$ be n independent random variables having a common expectation $\mu$ and finite variance $\sigma^2$. Form the random variable $X_n = \sum_{i=1}^{n} x_i / n$. Roughly speaking, the central limit theorem states that the distribution of $(X_n - \mu)/(\sigma/\sqrt{n})$ approaches a normal distribution of mean 0 and variance 1 as $n \to \infty$ regardless of the distribution of the $x_i$ [68].

We will do so here. Recall once again the donor and acceptor states that we have mentioned above. The energy gap ($\Delta E_{DA}$) can be written as $\Delta E_{DA} = E_A - E_D$. When the prc is in the donor state, $\Delta E_{DA}$ has some mean value, $\langle \Delta E_{DA} \rangle_D$, which we call $\Delta$. (We use the notation $\langle \ldots \rangle_D$ to mean the average of the quantity ... in the donor state.) Instead of using the energy gap, $\Delta E_{DA}$, as the collective coordinate in a Marcus description, we use

$$E = \langle \Delta E_{DA} \rangle_D - \Delta E_{DA} = \Delta - \Delta E_{DA}, \tag{1.1}$$

the fluctuation of the energy gap from its ensemble average. A free energy for E can be derived through the relationship $F(E) = -\beta^{-1} \ln p(E)$, where $p(E)$ is the probability distribution for E [26] and $\beta = 1/k_B T$ ($k_B$ is the Boltzmann constant; T is the temperature). E obeys a gaussian distribution since it differs from the energy gap only by a constant shift and a change of sign, and the energy gap is postulated to be a gaussian variable by Marcus theory. In other words,

$$p(E) = \frac{\exp\left(-E^2/(2\langle E^2 \rangle_D)\right)}{\sqrt{2\pi \langle E^2 \rangle_D}}, \tag{1.2}$$

where $\langle E^2 \rangle_D$ is the variance of the energy gap in the donor state. Therefore, $F_D$ (the free energy for E when the prc is in the donor state) is, to within an additive constant,

$$F_D(E) = \frac{E^2}{2\alpha} = \frac{E^2}{2\beta \langle E^2 \rangle_D}, \tag{1.3}$$

where $\alpha = \beta \langle E^2 \rangle_D$. By analogy, the equation for $F_A(E)$ (the free energy for E when the prc is in the acceptor state) can be derived if 1) we note that E also obeys a gaussian distribution in this state and 2) we assume that the size of the fluctuations remains the same, $\langle E^2 \rangle_D = \langle E^2 \rangle_A$. Here, we write the equation for $F_A(E)$,

$$F_A(E) = \frac{E^2}{2\alpha} - E + \Delta, \tag{1.4}$$

and show that the equations for $F_D$ and $F_A$ have the right properties. We can also derive expressions for various key energetic terms. Refer to Figure 1.5, which includes a plot of $F_D(E)$ and $F_A(E)$. First, it is easy to see that for $E = \Delta$, the two curves intersect ($F_D(\Delta) = F_A(\Delta)$) and, consistently, the energy gap vanishes ($\Delta E_{DA} = 0$). In other words, the two states become energetically degenerate at this point. Second, we note that the definition of $F_D$ and $F_A$ yields a consistent expression for $\langle \Delta E_{DA} \rangle_D$. The average value of E in the donor state is zero ($\langle E \rangle_D = 0$), and the difference between the free energies for this value of E gives the appropriate value: $F_A(0) - F_D(0) = \Delta$ (which is $\langle \Delta E_{DA} \rangle_D$).

Hence, because a gaussian probability distribution corresponds to a parabolic free energy surface, Marcus' assumption of gaussian fluctuations for the energy gap leads to conceptualizing this collective coordinate as fluctuating on a parabolic free energy surface. Moreover, the electron has an opportunity to hop between states (e.g., from the donor to the acceptor state) at the crossing of the curves. A plot of the free energy vs. the energy gap for a charge localized state is known as the Marcus diabatic free energy surface for that state.

Figure 1.5 shows several energies of interest in et theory. The first, the free energy of transfer between the donor and acceptor states ($\Delta G_{DA}$), can be calculated as the minimum of $F_A$ minus the minimum of $F_D$:

$$\Delta G_{DA} = F_A(\alpha) - F_D(0) = -\alpha/2 + \Delta. \tag{1.5}$$

Note that the free energy of transfer is not simply the difference in energy between the donor and acceptor states ($\Delta$) when the prc is in the equilibrated donor state. Upon et, the environment equilibrates to the acceptor state by reorganizing its arrangement of charge, thereby lowering the energy of the newly formed acceptor state. Within our formalism, we can calculate the reorganization energy of transfer from donor to acceptor, $\lambda_{DA}$, the amount by which the energy of the acceptor state is reduced by nuclear reorganization. Quantitatively, $\lambda_{DA}$ is the change in free energy of state A from the donor equilibrated nuclear configuration ($E = \langle E \rangle_D = 0$) to the acceptor equilibrated configuration ($E = \langle E \rangle_A = \alpha$). Therefore,

$$\lambda_{DA} = F_A(0) - F_A(\alpha) = \alpha/2, \tag{1.6}$$

and hence,

$$\Delta G_{DA} = \langle \Delta E_{DA} \rangle_D - \lambda_{DA}. \tag{1.7}$$

Finally, we can calculate the activation free energy of et, $F^*_{DA}$, needed for the system in state D to reach a transition state from which a transfer to state A can happen. The activation energy is the difference between the free energy at the transition state ($F_D(\Delta)$) and the free energy corresponding to the average value of E in the donor state:

$$F^*_{DA} = F_D(\Delta) - F_D(0) = \frac{\Delta^2}{2\alpha} = \frac{(\Delta G_{DA} + \lambda_{DA})^2}{4\lambda_{DA}}. \tag{1.8}$$

[Figure: free energy (vertical axis) plotted against the collective coordinate (horizontal axis) for the donor (D) and acceptor (A) parabolas, with the energies (a) through (f) marked.]

Figure 1.5: Marcus parabolas. The Marcus parabolas corresponding to the donor state (D) and acceptor state (A) are displayed. The horizontal axis is the collective coordinate ($E = \langle \Delta E_{DA} \rangle_D - \Delta E_{DA}$), the spontaneous fluctuation from the ensemble average of the energy gap between the two states. (See Equation 1.1.) The vertical axis is the free energy. Various energetic terms are displayed: (a) is $\langle \Delta E_{DA} \rangle_D$, the average energy gap between the donor and acceptor states in the donor equilibrated state. (b) is the activation energy to zero–crossing of the energy gap ($F^*_{DA}$). (c) is the free energy difference between the donor and acceptor states ($\Delta G_{DA}$). (d) is the reorganization energy ($\lambda_{DA}$). (e) represents the difference between the equilibrium value of E in the donor state ($\langle E \rangle_D$) and the value of E at the transition state. (f) is the difference between the equilibrium value of E in the donor state ($\langle E \rangle_D$) and the equilibrium value of E in the acceptor state ($\langle E \rangle_A$).
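Equations 1.3 through 1.8 can be checked numerically. The sketch below is an illustration only: the function names and the values chosen for $\alpha$ and $\Delta$ are assumptions, in arbitrary energy units. It evaluates the two parabolas and derives $\Delta G_{DA}$, $\lambda_{DA}$, and $F^*_{DA}$ directly from their definitions.

```python
def f_donor(e, alpha):
    """Donor-state free energy surface, Equation 1.3."""
    return e * e / (2.0 * alpha)


def f_acceptor(e, alpha, delta):
    """Acceptor-state free energy surface, Equation 1.4."""
    return e * e / (2.0 * alpha) - e + delta


def marcus_energies(alpha, delta):
    """Return (dG, lambda, F*) from the definitions in Equations 1.5, 1.6, 1.8."""
    # Free energy of transfer: minimum of F_A (at E = alpha) minus minimum of F_D.
    dg = f_acceptor(alpha, alpha, delta) - f_donor(0.0, alpha)
    # Reorganization energy: relaxation of state A from E = 0 to E = alpha.
    lam = f_acceptor(0.0, alpha, delta) - f_acceptor(alpha, alpha, delta)
    # Activation free energy: climb on F_D from E = 0 to the crossing at E = delta.
    f_act = f_donor(delta, alpha) - f_donor(0.0, alpha)
    return dg, lam, f_act


alpha, delta = 8.0, 3.0  # illustrative values, not taken from the simulations
dg, lam, f_act = marcus_energies(alpha, delta)
print(dg, lam, f_act)
```

With these inputs the relations $\lambda_{DA} = \alpha/2$, $\Delta G_{DA} = \Delta - \lambda_{DA}$, and $F^*_{DA} = (\Delta G_{DA} + \lambda_{DA})^2 / (4\lambda_{DA})$ can be confirmed by direct substitution.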

In the classical, diabatic regime of et, the rate constant of et can be calculated by perturbation theory, combined with the expressions for $p(E)$ (Equation 1.2) and $\alpha$ and Equation 1.8, to give the golden rule expression:

$$k_{DA} = \frac{2\pi}{\hbar} K^2 \langle \delta(\Delta E_{DA}) \rangle_D = \frac{2\pi K^2}{\hbar}\, p(\Delta) = \frac{2\pi K^2}{\hbar}\, \frac{\exp(-\beta F^*_{DA})}{\sqrt{4\pi \beta^{-1} \lambda_{DA}}}, \tag{1.9}$$

where $\langle \delta(\Delta E_{DA}) \rangle_D$ is the probability density for $\Delta E_{DA} = 0$ in the donor state, K is the electronic coupling between the donor and acceptor sites, and $\beta = 1/(k_B T)$. This expression will be the basis of most of our kinetic analysis of the reaction center, even though the actual kinetics of primary transfer turns out to be more complex than implied by the simple expression [67]. As we will make clear in appropriate places throughout this dissertation, this discrepancy does not matter for our purposes. Essentially, the focus of this dissertation is not on the fine details of the kinetics but on more basic energetic constraints and simple kinetic concerns (such as branching ratios).

Note various aspects of the golden rule expression for $k_{DA}$. First, it is the product of an exponential term containing the activation energy and a prefactor proportional to $K^2$, the square of the electronic coupling between the two states. The exponential term can be thought of as a transition state theory estimate for the probability to arrive at the transition state (a zero-crossing for the energy gap). The prefactor then reflects the probability of hopping at the transition state. Another striking characteristic of this rate expression is the non-intuitive dependence of the rate on the driving force. A plot of ln k vs. ∆G displays three regimes. When ∆G > −λ, et is in the normal regime. As ∆G decreases (and transfer becomes more thermodynamically favorable), the rate increases until it reaches a maximum with an activationless transfer (when ∆G = −λ). What was unanticipated was that the rate would start decreasing as ∆G decreases further and the transfer becomes still more exothermic (∆G < −λ). This regime is called the inverted regime. A final thing to note is that the golden rule rate expression is a classical formulation that breaks down under certain conditions (such as in the highly inverted regime or at very low temperatures) [26].

In Section 1.2.1, we discussed the issue of superexchange vs. a two–step mechanism. In superexchange, virtual intermediates are used. It can be shown that the golden rule expression for et from P (state 1) to $P^+H_L^-$ (state 3) through the superexchange intermediary state $P^+B_L^-$ (state 2) can be used to describe et in the superexchange regime provided that the prefactor is changed in the following manner: the term $K_{13}$ is replaced with $K_{12} K_{23} / |\Delta G_{12}|$ [32].

In summary, to determine the kinetics of the primary transfer, we want to calculate the Marcus diabatic free energy surfaces for all the transfer states relevant to the primary transfer. To do so, we need to calculate such terms as ∆G, $\langle \Delta E_{DA} \rangle_D$, and λ. The details of how to do this are given in the next two sections.
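The dependence of the golden rule rate on the driving force, and the superexchange replacement of the coupling, can be sketched as follows. This is an illustration under stated assumptions: energies are expressed in eV, and the values of λ, K, and the couplings are hypothetical placeholders rather than results of this work. Only the functional forms come from Equation 1.9 and the superexchange prescription above.

```python
import math

HBAR_EV_S = 6.582119569e-16  # reduced Planck constant in eV s
KB_EV_K = 8.617333262e-5     # Boltzmann constant in eV / K


def marcus_rate(dg, lam, coupling, temperature=300.0):
    """Golden rule rate (Equation 1.9) in 1/s; dg, lam, coupling in eV."""
    beta = 1.0 / (KB_EV_K * temperature)
    f_act = (dg + lam) ** 2 / (4.0 * lam)  # activation free energy, Eq. 1.8
    prefactor = 2.0 * math.pi * coupling ** 2 / HBAR_EV_S
    return prefactor * math.exp(-beta * f_act) / math.sqrt(4.0 * math.pi * lam / beta)


def superexchange_coupling(k12, k23, dg12):
    """Effective coupling K12 * K23 / |dG12| that replaces K13."""
    return k12 * k23 / abs(dg12)


lam = 0.2         # hypothetical reorganization energy (eV)
coupling = 0.002  # hypothetical electronic coupling (eV)
k_normal = marcus_rate(-0.5 * lam, lam, coupling)    # normal regime
k_top = marcus_rate(-lam, lam, coupling)             # activationless maximum
k_inverted = marcus_rate(-1.5 * lam, lam, coupling)  # inverted regime
print(k_normal, k_top, k_inverted)
```

Because the activation free energy depends on ∆G only through $(\Delta G + \lambda)^2$, the rate in this classical expression is symmetric about ∆G = −λ, which is why the normal and inverted examples above give the same value.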

1.3.2 Molecular dynamics

Molecular dynamics is the computational technique of simulating the dynamical motion of atoms in molecular complexes [69, 70]. Both microscopic properties and macroscopic thermodynamic properties of the molecules can be derived from calculated trajectories of the atoms. Molecular dynamics simulations were first performed on simple liquids to obtain a deeper understanding of both static structure and dynamical properties [71]. Gradually, more complex molecules, including biologically significant molecules, have been simulated by applying and generalizing the liquid state ideas and techniques. Biological md has been used to refine predicted protein structures derived from X-ray or nmr structure determination efforts; to calculate dynamical pathways (e.g., for oxygen diffusion in myoglobin); to calculate spectra from a variety of techniques (e.g., infrared spectroscopy); and to calculate free energy changes in various conformational changes [72].

Molecular dynamics simulations involve various simplifying approximations. Although biological processes are ultimately quantum mechanical in nature, many of them are well described classically. This classical simplification is fortunate since full quantum mechanical simulations of even the simplest of biomolecules are prohibitively expensive. For the remainder of the dissertation, we describe only classical molecular dynamics simulations. (No quantum molecular dynamics was performed.)

A classical molecular dynamics simulation of N atoms (which are idealized as N point particles) is specified (like all classical models) by the initial positions and momenta of the atoms and the classical potential function that governs the interactions and thus the further motion of the atoms. The forces acting on the atoms are calculated as the negative gradient of the potential function, and the equation of motion is integrated to calculate the dynamics.

The atomic positions and identities are taken from the solved X-ray crystal structure for Rps. viridis, of which 10288 atoms are specified. Hydrogen atoms, which are not resolved by X-ray crystallography, are inserted and allowed to equilibrate [26]. The initial momenta are set in a random way consistent with the desired temperature for the simulation.

Besides the atoms for which there is explicit accounting, consideration must be made of other material that might have an effect on the energetics and dynamics of the protein complex. The challenge here is to account accurately for the environment without excessive computational demands. The prc is embedded in a membrane and surrounded by water, ions, and other solvated molecules. In addition to the nuclear motion represented by this model, there must also be some accounting of the electronic response. To deal with the electrostatic response of the nuclei and the electrons, we take a dual approach that will be more fully explained in Chapter 2.
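The basic loop of a classical simulation (forces from the gradient of the potential, followed by numerical integration of the equation of motion) can be sketched for a single particle. This is not the integrator or force field used in this work (those are described in [26]); it is a generic velocity Verlet step applied to a one-dimensional harmonic potential, with all names and parameter values chosen for illustration in reduced units.

```python
import math
import random


def force(x, k_spring):
    """Force as the negative gradient of U(x) = 0.5 * k_spring * x**2."""
    return -k_spring * x


def velocity_verlet(x, v, mass, k_spring, dt, n_steps):
    """Integrate Newton's equation of motion; returns a list of (x, v) pairs."""
    f = force(x, k_spring)
    trajectory = [(x, v)]
    for _ in range(n_steps):
        v_half = v + 0.5 * dt * f / mass   # half-step velocity update
        x = x + dt * v_half                # full-step position update
        f = force(x, k_spring)             # force at the new position
        v = v_half + 0.5 * dt * f / mass   # complete the velocity update
        trajectory.append((x, v))
    return trajectory


def total_energy(x, v, mass, k_spring):
    """Kinetic plus potential energy of the oscillator."""
    return 0.5 * mass * v * v + 0.5 * k_spring * x * x


def thermal_velocity(mass, kT, rng):
    """Draw one velocity component from the Maxwell-Boltzmann distribution."""
    return rng.gauss(0.0, math.sqrt(kT / mass))


rng = random.Random(1)
v0 = thermal_velocity(mass=1.0, kT=1.0, rng=rng)  # random, temperature-consistent
traj = velocity_verlet(x=1.0, v=v0, mass=1.0, k_spring=1.0, dt=0.01, n_steps=1000)
e_start = total_energy(*traj[0], 1.0, 1.0)
e_end = total_energy(*traj[-1], 1.0, 1.0)
print(e_start, e_end)
```

A useful sanity check on any such integrator is that the total energy stays nearly constant over the trajectory when the time step is small compared with the oscillation period.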
For the molecular dynamics part of our simulations, we partition the roughly 12600 atoms into primary and secondary regions. The primary region comprises all the atoms located within spheres of 23 Å radius centered about the midpoints of the four bacteriochlorophylls and two bacteriopheophytins. These atoms are permitted to move under the molecular dynamics algorithm. The other atoms, which comprise the secondary region, are fixed to the positions specified by the X-ray crystal structure.
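The partition into primary and secondary regions amounts to a distance test against a set of sphere centers. A minimal sketch follows; the function name, the toy coordinates, and the use of only two centers are assumptions made for illustration (the actual partition uses six chromophore midpoints from the crystal structure).

```python
import math


def partition_atoms(positions, centers, radius=23.0):
    """Split atom indices into mobile (primary) and fixed (secondary) sets.

    An atom is primary if it lies within `radius` (in angstroms) of at
    least one center; otherwise it is secondary.
    """
    primary, secondary = [], []
    for index, pos in enumerate(positions):
        if any(math.dist(pos, center) <= radius for center in centers):
            primary.append(index)
        else:
            secondary.append(index)
    return primary, secondary


# Toy coordinates in angstroms (hypothetical, not from the X-ray structure).
centers = [(0.0, 0.0, 0.0), (40.0, 0.0, 0.0)]
positions = [
    (0.0, 0.0, 0.0),
    (22.9, 0.0, 0.0),
    (0.0, 80.0, 0.0),
    (45.0, 0.0, 0.0),
    (20.0, 20.0, 0.0),
]
primary, secondary = partition_atoms(positions, centers)
print(primary, secondary)
```

In a simulation, only the atoms whose indices appear in the primary list would be propagated by the integrator, while the secondary atoms would keep their crystal-structure coordinates.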


Many classical potential energy functions have been proposed for use in molecular dynamics simulations. For the md simulations in this study, we have used the force field described by Model II of Marchi et al. [26]. (More details for this force field are given in Section 1.A.) Of the various energetic terms, we will be paying the closest attention to the coulombic term. By Marcus theory, it is the fluctuations of the energetic terms that couple to the charge states (such as the coulombic term) that are important in the et process. Hence, uncertainties or controversies in how the coulombic terms are treated are the most likely to have significant effects on conclusions drawn about et.

Specifically, two types of uncertainties will be shown to be important in this study. First, the charge states of possibly ionizable amino groups are not known with any experimental or theoretical certainty. Second, the calculation of the coulombic energy term turns out to be more complex than summing the classic coulombic expression. In vacuum, the coulombic interaction between two charges is described by the standard coulombic expression. However, the classical model defined above does not explicitly include charges that actually can have a profound effect on electrostatic interactions. Electronic polarization and nuclear polarization of atoms outside the protein complex need to be taken into account. These two issues are complicated ones to settle and are the subject of most of Chapter 2.

What are the limitations of the classical molecular dynamics used here? One would expect that this md would fail to describe accurately processes involving non-classical physics. For instance, using classical md alone, one would not be able to study directly the process of electron transfer, an inherently quantum process. However, through Marcus theory, we study the mostly classical nuclear motions that mediate et.
Furthermore, classical md does not give access to quantum energies (such as the gas–phase ionization potential/electron affinity, ipea). However, with classical simulation techniques, one can still calculate the energetics that are classically well-described. We will see in the next section (and in Chapter 2) how classical calculations and input quantum parameters can be combined to calculate the energetics of an et process.

Recall that we are interested in calculating the Marcus diabatic free energy surfaces relevant to the primary transfer. The parameters that specify these curves include the thermodynamic driving forces (∆G) and terms proportional to the reorganization energy (λ). The calculation of ∆G requires the calculation of various terms, including electrostatic interactions. As was mentioned above, electrostatic interactions become rather involved because of inhomogeneous dielectric response. The md technique used here handles the other terms quite well but not electrostatics. Hence, we make use of a better approach to calculate electrostatics and combine it with md (which is good for getting at timescales and fluctuation terms, such as λ). The topic of the next section is this continuum electrostatic methodology.
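The way fluctuation terms such as λ follow from a trajectory can be sketched using the relations of Section 1.3.1: given samples of the energy gap in the donor state, the mean gives Δ, the variance gives α = β⟨E²⟩, and then λ = α/2 and ∆G = Δ − λ. The code below is an illustration only. The energy gap samples are synthetic gaussian numbers standing in for values that would come from an md trajectory, and the function name and target values are assumptions.

```python
import random
import statistics

KB_KCAL = 0.0019872041  # Boltzmann constant in kcal/(mol K)


def marcus_parameters(gap_samples, temperature=300.0):
    """Estimate Delta, lambda, and dG (kcal/mol) from donor-state gap samples."""
    beta = 1.0 / (KB_KCAL * temperature)
    delta = statistics.fmean(gap_samples)                   # mean energy gap
    variance = statistics.pvariance(gap_samples, mu=delta)  # same as <E^2>_D
    alpha = beta * variance
    lam = alpha / 2.0       # Equation 1.6
    dg = delta - lam        # Equation 1.7
    return {"delta": delta, "lambda": lam, "dG": dg}


# Synthetic stand-in for a trajectory: gaussian gaps with known parameters.
rng = random.Random(42)
kT = KB_KCAL * 300.0
true_delta, true_lambda = 5.0, 3.0            # hypothetical values, kcal/mol
sigma = (2.0 * true_lambda * kT) ** 0.5       # variance = alpha / beta
gaps = [rng.gauss(true_delta, sigma) for _ in range(20000)]
params = marcus_parameters(gaps, temperature=300.0)
print(params)
```

Samples from a real trajectory are correlated in time, so the statistical uncertainty of such estimates would be larger than for the independent samples used here.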

1.3.3 Continuum electrostatics calculations

In this section, we describe the calculation of electric fields and electric potentials in molecules, specifically the solution of the Poisson-Boltzmann (pb) equation. The problem under consideration is calculating the electric potential and electric field at a given point inside (or outside) a protein complex due to charges inside the protein complex combined with dielectric response both inside and outside the protein. There is much demand for realistic modeling of electric fields because they are thought to be very important in many biological contexts [73, 74, 75]. Perhaps the greatest of the many challenges concerning the calculation of electric fields in proteins is the potentially high computational cost of realistic modeling.

Many approaches have already been taken to solving this problem. The Tanford-Kirkwood (tk) approach involves approximating the protein as a sphere. The dielectric response of the protein is represented by a dielectric constant for the interior, while the response of the exterior is represented by a second dielectric constant [76]. Although the tk method is conceptually and computationally simple (because the formulation is analytically tractable), the method is insufficiently accurate for general applications. Instead of representing the dielectric response as continua, the Protein Dipole-Langevin Dipole (pdld) method uses polarizable dipoles located on the atoms (the protein dipoles) and on all the points of a truncated grid to represent the interior and exterior dielectric responses, respectively [77]. A number of other methods are available for electrostatic calculations, of varying accuracy and computational complexity [70].


The specific technique we have chosen is the finite–difference solution of the Poisson-Boltzmann (pb) equation. This technique has a long history of success at reliably estimating the electric potential in proteins [74, 78, 79, 80]. Moreover, it not only possesses well-defined and simple assumptions, but also meshes well with the md methodology used in these studies. Basically, the pb method can be conceptualized as a refined tk method. Both approximate the interior and exterior of the protein as continua whose dielectric response is characterized by their respective dielectric constants. The key difference is that the pb methodology involves a detailed construction of the dividing surface between the two regions, while the tk method assumes a spherical surface for the protein. With these boundary conditions in place, the electrostatic potential is then the solution to the Poisson-Boltzmann equation:

$$\nabla \cdot \left( \epsilon(\mathbf{r}) \nabla (\beta q \phi(\mathbf{r})) \right) + \beta q \rho(\mathbf{r}) - 2 \beta q^2 I \sinh(\beta q \phi(\mathbf{r})) = 0, \tag{1.10}$$

where $\phi(\mathbf{r})$ is the electrostatic potential, $\rho(\mathbf{r})$ is the fixed charge density, $\epsilon(\mathbf{r})$ is the spatially–dependent dielectric constant, I is the ionic strength of the solution, q is the proton charge, and $\beta = 1/kT$ (k is the Boltzmann constant). Note that the pb equation accounts for the presence of free ions through the term proportional to I. In many biochemical circumstances, these ions make a significant difference in the electrostatic potential. In this thesis, we generally ignore this term because the relevant ion concentration is small. (A comparison of calculations with and without accounting for the ionic term shows very little difference.)

The pb approach has a number of key advantages. Because of the continuum approximation, we can deal with dielectric response with one number for a region and therefore do not need substantial molecular details. The pb equation, combined with the boundary conditions, is a well-defined boundary value problem. Hence, it can be analyzed through a number of methods, including finite element and finite difference methodologies. We take a finite difference approach in this thesis. That is, we solve the pb equation by discretizing space (splitting it into a grid) and solving the resulting discrete pb equation. With small enough grid spacing and a method called focusing, accurate computation of the electric potential can be achieved. (Focusing is a method that combines results from a hierarchy of solutions to the pb equation at different resolutions to allow for a combination of accurate solution about points of interest with reduced computational effort for points of less interest.)

To define the boundary value problem of the pb equation, the following elements must be specified: the positions and partial charges of the atoms, their atomic radii, and finally, the exact nature of the boundary conditions (the dielectric constants of the interior and exterior regions and whether there is a membrane involved). The boundary surface separating the interior and exterior dielectric regions is constructed based on the positions and radii of the atoms. The location and magnitude of the charges are, of course, important in determining the quantities that we want to calculate, namely, the electric potentials and electric fields at various points of interest. (Figure 1.6 illustrates these concepts.) (Specific details about how these calculations are carried out are given in the relevant chapters of the dissertation.)
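The finite difference idea can be illustrated in one dimension. The sketch below is a deliberately reduced model and not the three-dimensional solver used in this work: it drops the ionic term (as is generally done in this thesis), omits focusing, works in dimensionless units, and solves $d/dx(\epsilon\, d\phi/dx) = -\rho$ on a uniform grid with fixed potentials at both ends. The function name, the grid, and the dielectric values 2 and 80 are illustrative assumptions.

```python
def solve_poisson_1d(eps_faces, rho, h, phi_left, phi_right):
    """Solve d/dx(eps dphi/dx) = -rho on a uniform grid (dimensionless units).

    eps_faces[j] is the dielectric constant between nodes j and j + 1,
    rho[i] is the charge density at node i, and h is the grid spacing.
    The resulting tridiagonal system is solved with the Thomas algorithm.
    """
    n = len(eps_faces) - 1                      # number of interior unknowns
    lower = [eps_faces[i] for i in range(n)]    # coupling to the left neighbor
    upper = [eps_faces[i + 1] for i in range(n)]  # coupling to the right neighbor
    diag = [-(lower[i] + upper[i]) for i in range(n)]
    rhs = [-rho[i + 1] * h * h for i in range(n)]
    rhs[0] -= lower[0] * phi_left               # fold in the boundary values
    rhs[-1] -= upper[-1] * phi_right
    for i in range(1, n):                       # forward elimination
        w = lower[i] / diag[i - 1]
        diag[i] -= w * upper[i - 1]
        rhs[i] -= w * rhs[i - 1]
    phi = [0.0] * n
    phi[-1] = rhs[-1] / diag[-1]
    for i in range(n - 2, -1, -1):              # back substitution
        phi[i] = (rhs[i] - upper[i] * phi[i + 1]) / diag[i]
    return [phi_left] + phi + [phi_right]


# Two-region slab: low-dielectric "interior" next to high-dielectric "exterior".
n_nodes = 11
eps_faces = [2.0] * 5 + [80.0] * 5
rho = [0.0] * n_nodes
phi = solve_poisson_1d(eps_faces, rho, h=1.0, phi_left=0.0, phi_right=1.0)
print(phi)
```

In this charge-free example almost all of the potential drop occurs across the low-dielectric region, and the quantity $\epsilon\, d\phi/dx$ is continuous across the interface, which is the one-dimensional analogue of the boundary condition at the dividing surface.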

1.4 Overview of the dissertation

Recall the two overarching themes of the dissertation: first, the calculation of experimental observables for our models of the prc and, second, the explication of the fundamental physics of the models. Both themes are explored in Chapter 2, in which we describe the details of the md and pb calculations and the theoretical framework upon which the entire dissertation rests. We apply this methodology to the wild-type (wt) photosynthetic reaction center (prc). The primary question under examination is the effect of differing treatments of the dielectric response and of the charge states of amino acids on the calculations. Specifically, we construct the diabatic free energy surfaces for the primary et for a range of models with varying treatments of these variables. We examine trends occurring within this space of models, looking for various regularities. Finally, we look at the detailed properties of the most relevant models.

After examining these models, we turn to studying two experiments in great detail, one of which involves manipulating the wt system and the other, two different mutations. In Chapter 3, we examine an experiment by Steffen, Lao, and Boxer (slb) [54] in which they measure Stark shifts due to internal charge shifts in the prc to probe changes in internal electric fields. Effective dielectric constants at the various chromophores are derived from these measurements. From these data, the authors hypothesize that a stronger effective dielectric response on the l branch than on the m branch is a possible explanation for the l–m functional asymmetry of the prc. We simulate the structural responses of the prc to these changes in internal charge states to calculate the internal electric fields. We calculate effective dielectric constants and compare them with those of slb.

In Chapter 4, we compare the function of the wt system with that of two related mutants studied by Heller, Holten, and Kirmaier [63]. These particular mutations are of tremendous interest because they are relatively simple mutations in which there appears to be some electron population transfer down the m branch chromophores. We simulate the wt system and these two mutants and calculate the diabatic free energy surfaces (with an approximation we discuss in that chapter). We compare the energetics and kinetics from our simulations with the experimental results.

In Chapter 5, we revisit the results from the previous chapters with an eye to developing simpler, reduced descriptions of the prc, essentially to probe for the underlying physics of the models. Specifically, we take various approaches to identifying the essential details of the prc. Moreover, we attempt to relate different levels or types of descriptions for the reaction center. We examine a statistical model for the residual contributions to various energy gaps, examine the possibilities for an accurate multipolar simplification of the residues of the reaction center, and study whether the electrostatic nature of the prc correlates with l–m homology. We describe possibilities for future calculations with statistical models for the physical distribution of charge in the reaction center.

In Chapter 6, we review what we have learned in the previous chapters, draw the major conclusions from these studies, and outline the natural next steps to this work. Finally, Appendix A is a critical analysis of the work of Warshel, Parson, et al. concerning the prc, including a detailed comparison to our calculations.

[Figure: panel (a) shows a collection of positive and negative point charges; panel (b) shows the system partitioned into interior and exterior dielectric regions.]

Figure 1.6: The pb methodology in overview. This diagram provides a schematic for the process by which pb calculations are performed. (a) A physical model in which atoms with fixed spatial locations and charges are specified. (b) The physical system is partitioned into interior and exterior regions, whose dielectric response is specified by different values of a dielectric constant. The electrostatic potential $\phi(\mathbf{r})$ is solved on a grid by a finite–difference solution of the Poisson–Boltzmann (pb) equation.

1.A Appendix: The force field in our molecular simulation

This appendix describes in detail the force field used in the molecular dynamics (md) simulations performed for this dissertation. As mentioned in Section 1.3.2, the methodology used is based on the work of Marchi et al. [26], hereafter referred to as mgcn. Here, we summarize the description given by mgcn and expand on some details that were omitted from their paper.

Our molecular dynamics simulation is based on Model II in mgcn. For the protein subunits, the parameters for the potential function and the definition of the topology are drawn from CHARMm (version 20) [81]. However, the stretch terms are not included; instead, the bonds are kept rigid with the shake algorithm. Moreover, non-polar hydrogens are not explicitly included in the model. Further details (concerning the integration algorithm and the cutoff used) are given in mgcn.

Also described in mgcn is the treatment of the prosthetic groups. Marchi et al. used a parameterization for bacteriochlorophyll, bacteriopheophytin, and the quinones calculated by Treutlein et al. [82, 83] and made available with xplor. However, it was not mentioned specifically by Marchi et al. that the parameters are found in the files param19.rcv and toph19.rcv of xplor on the Cray–YMP at the San Diego Supercomputing Center (sdsc) (see footnote 6). Moreover, Marchi et al. added some molecular interactions to connect the bacteriochlorophylls to their imidazole ligands and the non-heme iron to its ligands. Specifically, bending energies, described by

$$E_{\mathrm{bending}} = K_t (\theta - \theta_0)^2 \tag{1.11}$$

and listed in Table 1.2, are added. Improper torsional energies, described by

$$E_{\mathrm{torsion}} = |K_d| - K_d \cos(n\phi) \tag{1.12}$$

and listed in Table 1.3, are also added.

Footnote 6: The two files param19.rcv and toph19.rcv are located on the C90 in the directory /usr/local/apps/xplor/data as of April 11, 1997. [Private communication from Jerry Greenburg ([email protected]), 1997.]

Table 1.2: Bendings added for ligand interactions (a)

Atoms (b)            K_t (c)    θ_0 (d)
np    mfe   n5r      5.0        90.0
c5re  n5r   mfe      30.0       124.8
np    mmg   n5r      5.0        90.0
c5re  n5r   mmg      30.0       124.8
n5r   mfe   n5r      5.0        100.0

(a) The bending interaction is defined by Eq. 1.11.
(b) Groups of three atom types that define an angle for the bending interaction. (The atom types are particular to CHARMm.)
(c) K_t is given in kcal/(mol Å²).
(d) θ_0 is given in degrees.

Table 1.3: Improper torsions added for ligand interactions (a)

Atoms (b)                K_d (c)    n
x   mfe   n5r   x        -0.05      4
x   mmg   n5r   x        -0.05      4

(a) The improper torsional interaction is defined by Eq. 1.12.
(b) Groups of four atom types that define the improper torsional interaction. (The atom types are particular to CHARMm. "x" denotes any atom.)
(c) K_d is given in kcal/(mol Å²).
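The two added energy terms can be evaluated directly from Equations 1.11 and 1.12. The sketch below is illustrative only. It assumes that the angular deviation in Equation 1.11 is measured in radians (the source does not state this explicitly), and it takes the bending parameters from Table 1.2 and the values K_d = −0.05, n = 4 from Table 1.3. The function and dictionary names are hypothetical and are not part of CHARMm or xplor.

```python
import math

# (K_t, theta_0 in degrees) keyed by atom-type triple, as listed in Table 1.2.
BENDING_PARAMETERS = {
    ("np", "mfe", "n5r"): (5.0, 90.0),
    ("c5re", "n5r", "mfe"): (30.0, 124.8),
    ("np", "mmg", "n5r"): (5.0, 90.0),
    ("c5re", "n5r", "mmg"): (30.0, 124.8),
    ("n5r", "mfe", "n5r"): (5.0, 100.0),
}


def bending_energy(theta_deg, k_t, theta0_deg):
    """Equation 1.11; assumes the deviation is expressed in radians."""
    deviation = math.radians(theta_deg - theta0_deg)
    return k_t * deviation * deviation


def improper_energy(phi_deg, k_d, n):
    """Equation 1.12: |K_d| - K_d * cos(n * phi)."""
    return abs(k_d) - k_d * math.cos(n * math.radians(phi_deg))


k_t, theta0 = BENDING_PARAMETERS[("np", "mfe", "n5r")]
e_bend_eq = bending_energy(90.0, k_t, theta0)      # at the reference angle
e_bend_off = bending_energy(100.0, k_t, theta0)    # 10 degrees away
e_imp_0 = improper_energy(0.0, -0.05, 4)           # maximum for negative K_d
e_imp_45 = improper_energy(45.0, -0.05, 4)         # minimum for negative K_d
print(e_bend_eq, e_bend_off, e_imp_0, e_imp_45)
```

With a negative K_d and n = 4, the improper term as written vanishes at φ = 45° (and at every further 90° step) and reaches its maximum of 2|K_d| at φ = 0°.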


Chapter 2

Computer Models of the Wild Type Reaction Center

2.1 Introduction

In this chapter, we develop our atomic level computer models for the primary electron transfer in the photosynthetic reaction center (prc) of Rps. viridis. As discussed in Chapter 1, there are still a number of key unresolved issues concerning the primary et: Why does the electron transfer occur down the l branch and not the m branch? Is the primary transfer a superexchange or two–step process? We aim to answer these questions through computer simulation. Given a detailed atomic–level structure of this prc, and a chosen physical model of the charges and dielectric responses of the complex, the pathway for electron transfer may be computed. (See Chapter 1.)

Several calculations of this sort have already been performed. The simulation study performed by Marchi et al., hereafter referred to as mgcn, is one such computation [26]. The heart of the mgcn approach is to use molecular dynamics calculations to calculate diabatic free energy surfaces for the primary transfer. Involved in this approach is the calculation of the thermodynamic driving force, or free energy (∆G), for electron transfer. From these diabatic surfaces, the kinetics of transfer can be estimated. Marchi et al. concluded that the electric fields within the prc directed the electron transfer down the l branch by raising the energy of the m branch charge transfer states above the energy of the neutral initial state. The energy of the charge transfer state $P^+B_L^-$ was also found to be above the energy of the initial state. Marchi et al. therefore concluded that the state with the electron on the accessory bacteriochlorophyll $B_L$ was not a true intermediate in the primary charge transfer.

As commented upon by others [27, 29, 30], it is not clear that the mgcn model satisfactorily captures all of the essential features of the electrostatic properties of the prc. The Born charging energies associated with the creation of charge–separated final states from the neutral initial state of the prc are absent from their modeling. The assumption that all ionizable residues are in their neutral pH charge states has also been challenged. Furthermore, the use of a single, uniform dielectric constant to account for the electronic response of the protein matrix neglects the effects of dielectric inhomogeneities arising from the presence of surrounding water and/or membrane molecules.

In this chapter, we address these criticisms by studying the sensitivity of models for the primary et to variations in 1) the treatment of the charge state of ionizable amino acids and 2) the treatment of dielectric response inside and outside of the protein complex. After first describing the construction of these models, we then discuss the results in various ways. In Section 2.3.1, we examine whether the diabatic free energy surfaces conform to various experimental findings. In Section 2.3.2, we then look at large scale trends within the models by interpreting the results within the framework of a "qualitative" model. In Sections 2.3.3 to 2.3.8, we look at the specifics of these large–scale trends. We conclude this chapter by summarizing the specific lessons of these models for the wt system.

2.2 Theory and methodology

2.2.1 Physical model

Our approach can be understood as a modification of mgcn. As was done in mgcn, we use Marcus theory in the diabatic regime to frame our discussion of the primary electron transfer (et). Within this theoretical framework, diabatic free energy surfaces describe the energetics and kinetics of et. The central idea of mgcn (and of this work) is therefore to calculate these diabatic surfaces; however, we employ different molecular models and computational techniques to do so.

In this section, we first outline the application of Marcus theory that is common to both mgcn and our work. (We adopt the convention outlined in Table 1.1 for labeling the charge states of interest.) In the primary transfer, there are five relevant charge states ($P^*$, $P^+B_L^-$, $P^+H_L^-$, $P^+B_M^-$, and $P^+H_M^-$). We consider the l side and m side transfers separately by conceptualizing each as a three-state system. For instance, the l side system comprises the following three states: $P^*$, $P^+B_L^-$, and $P^+H_L^-$. (Electron transfer on the inactive m side is treated in an analogous fashion.)

In Chapter 1, we describe Marcus theory for et between two states. Electron transfer between states is possible when the energy levels of the states are equal (i.e., the levels become degenerate). Degeneracy occurs because of fluctuations in the energy gap (due to nuclear motion of the environment). Hence, the proper collective coordinate to use in understanding et between two states is the energy gap. Here, we present the generalization (given by mgcn) needed for a three–state system. mgcn showed that the application of Marcus theory requires two collective coordinates because there are two linearly independent energy gaps in a three-state system.

Recall the formulation presented in Section 1.3.1 for the two-state donor–acceptor system. The collective coordinate used is $\mathcal{E}$ (defined in Equation 1.1), the spontaneous fluctuation of the energy gap between the donor and acceptor ($\Delta E_{DA}$) from its average. The parabolic diabatic curves for the two states are shown to be:

$$F_D = \frac{\mathcal{E}^2}{2\alpha}, \qquad F_A = \frac{\mathcal{E}^2}{2\alpha} - \mathcal{E} + \Delta, \tag{2.1}$$

where $\alpha = \beta\langle \mathcal{E}^2 \rangle_D$ and $\beta^{-1} = k_B T$. In mgcn, the same formulation was used (the donor state was $P^*$ and the acceptor state, $P^+H_L^-$): the energy gap between states 1 and 3 is written:

$$\Delta E_{13}(t) = -\mathcal{E}_\parallel(t) + \Delta_3. \tag{2.2}$$

Here,

$$\Delta_3 = \langle \Delta E_{13} \rangle_1, \tag{2.3}$$

where $\langle \ldots \rangle_1$ indicates the equilibrium ensemble average with respect to state 1. The dynamical variable $-\mathcal{E}_\parallel(t)$ is the instantaneous fluctuation of that energy gap from its average. (For the treatment of et between SP and $H_M$ on the m branch, the analogous variable $\mathcal{E}'_\parallel$ was defined by mgcn and is used here.)

To handle an additional state ($P^+B_L^-$), another collective coordinate ($\mathcal{E}_\perp$) was introduced in mgcn. (For m branch transfer, we use the corresponding coordinate $\mathcal{E}'_\perp$.) One might define it as the instantaneous fluctuation of $\Delta E_{12}$ from its ensemble average. However, such a definition would not guarantee that the two collective coordinates would be statistically orthogonal. To ensure this property of statistical orthogonality, $\mathcal{E}_\perp$ is defined in the following way (a Gram-Schmidt orthogonalization process):

$$\Delta E_{12}(t) = -\mathcal{E}_\perp(t) - b\,\mathcal{E}_\parallel(t) + \Delta_2, \tag{2.4}$$

with

$$\Delta_2 = \langle \Delta E_{12} \rangle_1. \tag{2.5}$$

The constant $b$ is chosen so that at equal times, $\mathcal{E}_\parallel$ and $\mathcal{E}_\perp$ are statistically orthogonal,

$$\langle \mathcal{E}_\parallel \mathcal{E}_\perp \rangle_1 = 0. \tag{2.6}$$
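The orthogonalization in Equations 2.2 to 2.6 can be sketched numerically as follows. This is a minimal illustration, not the procedure used by mgcn: the function name, the synthetic gap time series, and the random seed are all assumptions introduced here, and the ensemble averages are simply replaced by sample means.

```python
import numpy as np


def orthogonal_gap_coordinates(dE12, dE13):
    """Build (E_par, E_perp, b, Delta2, Delta3) from gaps sampled in state 1.

    dE12, dE13: 1-D arrays of the energy gaps Delta E_12(t) and Delta E_13(t).
    Sample means stand in for the ensemble averages <...>_1.
    """
    dE12 = np.asarray(dE12, dtype=float)
    dE13 = np.asarray(dE13, dtype=float)
    delta2 = dE12.mean()                # Eq. 2.5
    delta3 = dE13.mean()                # Eq. 2.3
    e_par = -(dE13 - delta3)            # Eq. 2.2 solved for E_par
    fluct12 = -(dE12 - delta2)          # equals E_perp + b * E_par (Eq. 2.4)
    # Choose b so that <E_par E_perp>_1 = 0 (Eq. 2.6).
    b = np.mean(fluct12 * e_par) / np.mean(e_par * e_par)
    e_perp = fluct12 - b * e_par
    return e_par, e_perp, b, delta2, delta3


# Synthetic, purely illustrative gap trajectories (kcal/mol).
rng = np.random.default_rng(0)
x = rng.normal(size=5000)
y = rng.normal(size=5000)
gap13 = -1.7 + 3.0 * x
gap12 = 20.0 + 1.5 * x + 2.0 * y

e_par, e_perp, b, delta2, delta3 = orthogonal_gap_coordinates(gap12, gap13)
overlap = float(np.mean(e_par * e_perp))          # should vanish to round-off
rebuilt12 = -e_perp - b * e_par + delta2          # Eq. 2.4 read left to right
```

The last line checks that the two coordinates carry the same information as the original pair of gaps: Equation 2.4 reproduces the sampled gap exactly once b and the two averages are known.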

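The equations for the diabatic free energy surfaces along the l branch (and similarly, for the m branch) become (in analogy to Equation 2.1):

$$F_1 = F_B, \qquad F_2 = F_B - b\,\mathcal{E}_\parallel - \mathcal{E}_\perp + \Delta_2, \qquad F_3 = F_B - \mathcal{E}_\parallel + \Delta_3, \tag{2.7}$$

where

$$F_B = \frac{1}{2\alpha_\parallel}\mathcal{E}_\parallel^2 + \frac{1}{2\alpha_\perp}\mathcal{E}_\perp^2 \tag{2.8}$$

with

$$\alpha_\parallel = \beta\langle (\delta\mathcal{E}_\parallel)^2 \rangle_1, \qquad \alpha_\perp = \beta\langle (\delta\mathcal{E}_\perp)^2 \rangle_1 \tag{2.9}$$

and $\beta^{-1} = k_B T$. In Chapter 1, we derive relationships that exist among the free energy of transfer ($\Delta G_{DA}$), the reorganization energy ($\lambda_{DA}$), the activation energy of transfer ($F^*_{DA}$), and finally the rate of et ($k_{DA}$) (see Equations 1.7, 1.6, 1.8, and 1.9):

$$\Delta G_{DA} = \langle \Delta E_{DA} \rangle_D - \lambda_{DA}, \qquad \lambda_{DA} = \alpha/2, \qquad F^*_{DA} = \frac{(\Delta G_{DA} + \lambda_{DA})^2}{4\lambda_{DA}}, \qquad k_{DA} = \frac{2\pi K^2}{\hbar}\,\frac{\exp(-\beta F^*_{DA})}{\sqrt{4\pi\beta^{-1}\lambda_{DA}}}. \tag{2.10}$$

Analogous equations hold for the three–state system. That is,

$$\Delta G_{12} = \langle \Delta E_{12} \rangle_1 - \lambda_{12}, \qquad \Delta G_{13} = \langle \Delta E_{13} \rangle_1 - \lambda_{13}, \tag{2.11}$$

and

$$\lambda_{13} = \frac{\beta}{2}\langle (\delta\Delta E_{13})^2 \rangle_1 = \frac{\beta}{2}\langle \mathcal{E}_\parallel^2 \rangle_1 = \alpha_\parallel/2, \tag{2.12}$$

$$\lambda_{12} = \frac{\beta}{2}\langle (\delta\Delta E_{12})^2 \rangle_1 = \frac{\beta}{2}\langle (-\mathcal{E}_\perp - b\,\mathcal{E}_\parallel)^2 \rangle_1 = \alpha_\perp/2 + b^2\alpha_\parallel/2. \tag{2.13}$$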
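The relations among the surface parameters, the reorganization energies, and the activation free energy can be checked with a short sketch. It assumes the forms of the surfaces, of $\lambda_{12}$ and $\lambda_{13}$, and of $F^*_{DA}$ given in this section; the function names are introduced here for illustration. The values $\lambda_{13} = 4.3$ kcal/mol and $\Delta G_{13} = -6.0$ kcal/mol are those reported later in Table 2.3 for the standard charge model, while `B` and `ALPHA_PERP` are hypothetical numbers chosen only to make the example concrete.

```python
def reorganization_energies(alpha_par, alpha_perp, b):
    """Return (lambda_12, lambda_13) from the surface curvatures and b."""
    lam13 = alpha_par / 2.0
    lam12 = alpha_perp / 2.0 + b ** 2 * alpha_par / 2.0
    return lam12, lam13


def activation_free_energy(delta_g, lam):
    """Marcus activation free energy F* = (dG + lambda)^2 / (4 lambda)."""
    return (delta_g + lam) ** 2 / (4.0 * lam)


def diabatic_surfaces(e_par, e_perp, alpha_par, alpha_perp, b, delta2, delta3):
    """Return (F1, F2, F3) at the point (e_par, e_perp)."""
    f_b = e_par ** 2 / (2.0 * alpha_par) + e_perp ** 2 / (2.0 * alpha_perp)
    return f_b, f_b - b * e_par - e_perp + delta2, f_b - e_par + delta3


ALPHA_PAR = 8.6      # gives lambda_13 = 4.3 kcal/mol (Table 2.3, S model)
ALPHA_PERP = 5.05    # hypothetical
B = 0.5              # hypothetical
lam12, lam13 = reorganization_energies(ALPHA_PAR, ALPHA_PERP, B)

DG13 = -6.0                      # kcal/mol, experimental value used in the fit
delta3 = DG13 + lam13            # from Delta G_13 = <Delta E_13>_1 - lambda_13
delta2 = 20.0                    # hypothetical

# F1 and F3 cross where e_par = delta3; the lowest crossing has e_perp = 0.
f1, f2, f3 = diabatic_surfaces(delta3, 0.0, ALPHA_PAR, ALPHA_PERP, B,
                               delta2, delta3)
barrier = activation_free_energy(DG13, lam13)
```

At the crossing point the two surfaces coincide and their common height above the minimum of $F_1$ equals the Marcus barrier, which is the consistency one expects between the surfaces and the expression for $F^*_{DA}$.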

2.2.2 Calculating the model parameters

We now turn to the issue of actually calculating these quantities through molecular simulation. There are a number of complications that need to be resolved. The first is that the energy gap between the donor and acceptor state ($\langle \Delta E_{DA} \rangle_D$) is not directly accessible by the classical molecular simulation used by mgcn and in our work. Ideally, we would do a fully quantum mechanical simulation of the system. Since we cannot do this, we partition the full system under study into a “core system” and its “environment.” In this scheme, we conceptualize $\langle \Delta E_{DA} \rangle_D$ as the sum of the energetics of the core system in vacuo ($\Delta E^{(0)}_{DA}$) and the perturbations to $\langle \Delta E_{DA} \rangle_D$ due to the interaction of the core with the environment. In mgcn and the work presented here, the core system is defined as the six core chromophores ($P_L$, $P_M$, $B_L$, $H_L$, $B_M$, and $H_M$) and their four imidazole ligands (His L153, L173, M180, M200). This definition of $\Delta E^{(0)}$ allowed mgcn to make use of Thompson and Zerner’s (hereafter known as tz) semi-empirical quantum mechanical calculation of $\Delta E^{(0)}$ for the prc [84]. With this partition of “core” and “environment,” the instantaneous energy gap ($\Delta E(t)$) can then be written as:

$$\Delta E(t) = \Delta E^{(0)} + \Delta E^{(ES)}(t) \tag{2.14}$$

where

$$\Delta E^{(ES)}(t) \equiv \Delta E(t) - \Delta E^{(0)}. \tag{2.15}$$

The challenge then becomes calculating $\Delta E^{(ES)}$, the energetics of electrostatic interactions of the core system with its environment. In principle, this quantity is well described by classical electrostatics. In practice, as was discussed in Section 1.3.3, proper accounting is difficult. mgcn made use of the classical model of the prc described in Section 1.3.2 to calculate $\Delta E^{(ES)}$ for the states listed in Table 1.1. In their calculations, it was assumed that the transferred electron is uniformly delocalized over the four macrocycle nitrogens of the acceptor chromophore.¹ The positive charge on the post–transfer donor, the SP, is evenly distributed over the eight macrocycle nitrogens of the two bacteriochlorophylls. They considered $W_\alpha$, the electrostatic energy, as calculated by md, of placing an electron on chromophore $\alpha$. Specifically, it was equated to the electronic charge multiplied by the average of the md electrostatic potential at the four macrocycle nitrogens (or eight nitrogens for SP), due to the portion of the prc outside the core system itself. In theory, one would expect the classical electrostatic contribution to the energy gap of a donor–acceptor transfer to be $\Delta W_{DA} = W_A - W_D$ and $\Delta E^{(ES)}_{DA} = \Delta W_{DA}$.² Furthermore, using Equation 2.2, we see

$$\begin{aligned}
\Delta E_{13}(t) &= -\mathcal{E}_\parallel(t) + \Delta_3 \\
\Delta E^{(0)}_{13} + \Delta W_{13}(t) &= -\mathcal{E}_\parallel(t) + \Delta E^{(0)}_{13} + \langle \Delta W_{13} \rangle_1 \\
\Delta W_{13}(t) &= -\mathcal{E}_\parallel(t) + \langle \Delta W_{13} \rangle_1,
\end{aligned} \tag{2.16}$$

where it is assumed that $\langle \Delta E^{(0)}_{13} \rangle_1 = \Delta E^{(0)}_{13}$ (i.e., no account is taken of temporal fluctuations of the vacuum energy term). In an analogous fashion, $\mathcal{E}_\perp(t)$ is also expressed in terms of $\Delta W_{12}$ and $\langle \Delta W_{12} \rangle_1$ (by substituting into Equation 2.4):

$$\Delta W_{12}(t) = -\mathcal{E}_\perp(t) - b\,\mathcal{E}_\parallel(t) + \langle \Delta W_{12} \rangle_1. \tag{2.17}$$

With this definition, it is possible to calculate all the quantities ($\alpha_\parallel$, $\alpha_\perp$, $\Delta_2$, $\Delta_3$, and $b$) needed to calculate the diabatic curves. Note that the ensemble averages $\langle \ldots \rangle_1$ were evaluated by averaging the quantity in question over 56 ps md trajectories.

¹ Gunner et al. found little difference in calculated electrostatic energies between spreading the electron over the nitrogens vs. using a more realistic distribution for the electron [27].

² Actually, mgcn defines a closely related quantity, $V_\alpha(t)$, defined to exclude only interactions within chromophore $\alpha$. Another quantity, $\nu_j$, was used to subtract out the contributions from the ligands and other chromophores.

However, since $W_\alpha$ was calculated with the prc in vacuo (i.e., the atoms were embedded in a uniform dielectric of 1), such a calculation would inadequately account for the dielectric screening present in the prc. mgcn derived a procedure by which the simulation results could be used to analyze the same physical system embedded in a bath with infinitely fast dielectric response and dielectric constant $\epsilon_\infty$. $\Delta W$, $\alpha_\perp$, and $\alpha_\parallel$ would be rescaled in the following way:

$$\Delta W \to \overline{\Delta W} = \Delta W/\epsilon_\infty, \qquad \alpha_\perp \to \bar{\alpha}_\perp = \alpha_\perp/\epsilon_\infty, \qquad \alpha_\parallel \to \bar{\alpha}_\parallel = \alpha_\parallel/\epsilon_\infty. \tag{2.18}$$
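The definition of $W_\alpha$ and the rescaling by the electronic dielectric constant can be sketched as follows. This is an illustration under stated assumptions only: the potentials are made-up numbers, they are taken to be expressed in kcal/mol per unit charge so that multiplying by the electron charge in units of e gives an energy directly, and the function names are introduced here rather than taken from mgcn.

```python
import numpy as np

ELECTRON_CHARGE = -1.0  # in units of e


def site_energy(nitrogen_potentials):
    """W_alpha: electron charge times the mean potential at the nitrogens."""
    return ELECTRON_CHARGE * float(np.mean(nitrogen_potentials))


def rescaled_gap(phi_acceptor, phi_donor, eps_inf):
    """Delta W_DA = W_A - W_D, divided by the electronic dielectric constant."""
    delta_w = site_energy(phi_acceptor) - site_energy(phi_donor)
    return delta_w / eps_inf


# Hypothetical potentials (kcal/mol per e) at the four acceptor nitrogens
# and at the eight nitrogens of the special pair.
phi_acceptor = [2.0, 4.0, 6.0, 8.0]
phi_donor = [1.0] * 8

gap_in_vacuo = rescaled_gap(phi_acceptor, phi_donor, eps_inf=1.0)
gap_screened = rescaled_gap(phi_acceptor, phi_donor, eps_inf=2.0)
```

With these inputs the screened gap is exactly half the in vacuo gap, which is all the uniform rescaling does; it cannot represent regions of differing dielectric constant.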

With these rescaled parameters, diabatic free energy surfaces can be calculated for a system with uniform electronic dielectric screening in addition to the nuclear screening implicit in the simulation.

In this dissertation, we want to explore alternative calculations for the parameters of the diabatic free energy surfaces. One possible problem inherent in the approach of mgcn is that envisioning the prc as embedded in a uniform sea of electronic polarizability does not account for possibly important sources of polarizability, such as that of the membrane and the aqueous solvent surrounding the prc–membrane complex. Other workers have questioned the treatment by mgcn of the ionizable amino groups in the prc. We address these issues in the methodology that we describe here.

To examine the effect of environmental dielectric response on the primary et, we incorporate the use of continuum electrostatic techniques, which are reviewed in Section 1.3.3. These techniques allow us to model different regions of electrostatic continua, instead of being limited to a single infinite continuum (as was modeled by mgcn). Unfortunately, because they are also somewhat computationally expensive, we approximate ensemble averages of various quantities (such as $\Delta E_{DA}$) as those quantities calculated for an ensemble–averaged structure. Specifically, the average positions of the atoms over a 4 ps long molecular dynamics trajectory were defined to be the positions of the atoms for this average structure. In other words,

$$\langle \Delta E_{DA} \rangle_D \approx \Delta E_{DA}(\bar{D}) \tag{2.19}$$

where $\Delta E_{DA}(\bar{D})$ is $\Delta E_{DA}$ calculated for a structure averaged with respect to the donor ($D$) state. In the appendix to this chapter, when we examine the errors in the energetics due to this approximation, we find them to be relatively small.

In this dissertation work, we adopt the partition used by mgcn of the prc between the “core system” and “environment.” In this framework, let us relate the free energy of et to various energetic terms:

$$\Delta G_{DA} = \langle \Delta E_{DA} \rangle_D - \lambda_{DA} = \Delta E_{DA}(\bar{D}) - \lambda_{DA}. \tag{2.20}$$

Two terms comprise $\Delta E_{DA}(\bar{D})$. The first term is the energy of electron transfer in the core system (placed in vacuo) ($\Delta E^{(0)}_{DA}$). The second term, the electrostatic contribution to the energy gap, $\Delta E^{(ES)}_{DA}$, can itself be partitioned into a contribution from all the charges (partial and full) in the environment, $\Delta E^{(CO)}_{DA}$, and a part due to the dielectric response of the prc and its surroundings (the reaction field), $E^{(RF)}_{DA}$. For the protein matrix within which the core system resides, the dielectric response arises from the electronic polarizability of the proteins. The dielectric response of the material surrounding the prc can include both the electronic and nuclear polarizability of the surrounding media. (Recall that the nuclear polarizability of the protein matrix is accounted for in a separate term, $\lambda_{DA}$.) In other words,

$$\langle \Delta E_{DA} \rangle_D \approx \Delta E_{DA}(\bar{D}) = \Delta E^{(0)}_{DA} + \Delta E^{(CO)}_{DA} + E^{(RF)}_{DA} \tag{2.21}$$

and therefore,

$$\Delta G_{DA} = \Delta E^{(0)}_{DA} + \Delta E^{(CO)}_{DA} + E^{(RF)}_{DA} - \lambda_{DA}. \tag{2.22}$$

We now examine the computation of the various terms in detail. For estimating the vacuum energy gaps, $\Delta E^{(0)}_{DA}$, we use an empirical methodology. As mentioned above, the calculations of mgcn used the results of Thompson and Zerner’s [84] semi–empirical quantum mechanics calculations as estimates of $\Delta E^{(0)}_{1j}$ for the five charge transfer states of the prc. In this dissertation, we adopt a different strategy: we adjust $\Delta E^{(0)}_{1j}$ to obtain the best fit to experimental data. Errors in Thompson and Zerner’s (tz) calculation of $\Delta E^{(0)}$ are difficult to assess since the authors themselves do not provide any estimate of the errors [84]. Several workers have suggested that the calculations of Thompson and Zerner may have substantial errors. Gunner et al. [27] have argued that $\Delta E^{(0)}_{13} - \Delta E^{(0)}_{12}$ should be on the order of 6 kcal/mol (based on a simple electrostatic model and interpretation of experimental data for the ipea difference between bacteriochlorophyll and bacteriopheophytin) instead of being −0.3 kcal/mol as calculated by Thompson and Zerner. We look at the consequences of fitting $\Delta E^{(0)}$ to obtain the experimental value of $\Delta G_{13}$ (−6.0 kcal/mol) and to make $\Delta E^{(0)}_{13} = \Delta E^{(0)}_{12} + 6$ kcal/mol. We also assume that $\Delta E^{(0)}$ is characterized by l–m symmetry and set $\Delta E^{(0)}_{12'} = \Delta E^{(0)}_{12}$ and $\Delta E^{(0)}_{13'} = \Delta E^{(0)}_{13}$. In presenting the results, we compare these values of $\Delta E^{(0)}$ to those of tz.

The reorganization energy, $\lambda_{DA}$, was computed in the manner outlined above, that is, as the rescaled variances of $\mathcal{E}_\perp$ and $\mathcal{E}_\parallel$. (See Equations 2.12 and 2.18.) The difference from the method of mgcn is that in mgcn, $\epsilon_\infty$ was fit to 1.9 (to obtain the experimental value of $\Delta G_{13}$), whereas we use $\epsilon_\infty = 2.0$ [85]. Except for the use of different charge models, the methodology for the molecular dynamics is that of mgcn.

Next, $\Delta E^{(CO)}_{DA}$ was calculated by solving the Poisson equation for the electrostatic potential at the four nitrogens of the acceptor and the eight nitrogens of the donor:

$$\nabla \cdot (\epsilon(\mathbf{r}) \nabla \phi(\mathbf{r})) = \rho(\mathbf{r}). \tag{2.23}$$

The details for the three different dielectric boundaries that were studied in depth are given in the next section. In calculating $E^{(RF)}_{DA}$, another dielectric boundary condition was used to represent the “core system” in vacuo: one in which the interior of the “core system” is assigned a dielectric constant of 2 and the exterior, a dielectric constant of 1. The finite–difference solver DelPhi [86] was used to solve the Poisson equation for these boundary conditions. Due to the large size of the prc, a “focussing” procedure [87] was used to calculate the electrostatic potential ($\phi$) at the various atoms within the core system. Three successive grids with the following grid spacings were used: 2.5 Å, 1.4 Å, and 0.75 Å. Each calculation employed a $125^3$ grid lattice.

In discussing the results of our simulations, we adopt a coordinate system in which the vector pointing from the non-heme iron to the center of mass of the eight nitrogens of the SP points in the positive-z direction. The origin is at the center of mass of these nitrogens.
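The fitting of the vacuum gaps described earlier in this section amounts to solving the expression for $\Delta G_{13}$ as a sum of four components for its vacuum term. A minimal sketch follows, assuming that form and the 6 kcal/mol offset stated above; the function name is introduced here, and the input numbers are the $S_{2:2:80}$ entries for state 3 reported later in Table 2.3.

```python
def fit_vacuum_gaps(e_co_13, e_rf_13, lam_13, target_dg13=-6.0, offset=6.0):
    """Return (dE0_12, dE0_13) in kcal/mol.

    dE0_13 is chosen so that dE0_13 + e_co_13 + e_rf_13 - lam_13 equals the
    target free energy; dE0_12 is then set `offset` kcal/mol below dE0_13.
    """
    e0_13 = target_dg13 - e_co_13 - e_rf_13 + lam_13
    e0_12 = e0_13 - offset
    return e0_12, e0_13


# State 3 of the S(2:2:80) model: Delta E(CO) = -18.3, E(RF) = -14.3,
# lambda = 4.3 kcal/mol.
e0_12, e0_13 = fit_vacuum_gaps(-18.3, -14.3, 4.3)
```

Because only $\Delta G_{13}$ is fitted, the other three free energies of each model remain predictions rather than inputs, which is what makes the later comparison among models meaningful.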

2.2.3 Dielectric and charge models

To determine the sensitivity of our calculations to varying treatments of 1) the charge states of ionizable amino acids and 2) the environmental dielectric response, we construct three different treatments for each of these variables.

Three different models of the charge states of the ionizable amino acids were considered. In the “standard charge model” (S), all the charge states, except for that of Glu L104 [55], are set to their expected values in water at neutral pH. This is the same charge model used in mgcn. In the “fully neutralized model” (N), all the ionizable amino acids, and the acid groups on the heme molecules, are placed in their neutral state. The ionizable groups are the arginines, aspartates, glutamates, lysines, and N- and C-termini of the prc. The charge on the non–heme iron is also set to zero. Finally, in the “partially neutralized model” (P), all the ionizable amino acids within the “membrane region” are neutralized. All other amino acids are in their aqueous, neutral pH charge state. Table 2.1 lists the specific amino acid residues that are neutralized in the P model. The “membrane region” is defined to be 43.8 Å wide and perpendicular to the line connecting the center of mass of the SP to the non–heme iron; it is the region for which −31.3 Å < z < 12.5 Å. See Figure 2.1. In all three charge models, the partial charges of Model II of mgcn are used. These three charge models span a range of possibilities for the charge states of the prc.

We solve the Poisson equation for the prc embedded in three different types of dielectric media:

• An infinite continuum with a dielectric constant of 2, the same as that assigned to the electronic polarizability of the protein matrix. This set of dielectric boundary conditions is denoted as 2:2.

• An infinite bath of water, modelled as a dielectric continuum of dielectric constant 80. The electronic response of the protein is represented by a dielectric constant of 2. This set of boundary conditions is denoted as 2:80.

• An infinitely wide, 43.8 Å thick membrane of dielectric constant 2 surrounded by an infinite continuum of water. The water is taken to have a dielectric constant of 80 and the dielectric constant of the protein matrix is again taken to be 2. This set of boundary conditions is denoted as 2:2:80.

Within this framework, the 2:2:80 model should be the one closest to mimicking the experimental systems. However, the other two should shed light on the effects of using either weaker (2:2) or stronger (2:80) dielectric response. Hence, these three models represent a fairly wide range of possible dielectric response.

Table 2.1: Amino acid residues neutralized in the P models

ALA-  M1        ARG   M265      GLU   H35
ARG   C15       ASP   H11       GLU   H61
ARG   C216      ASP   H36       GLU   L106
ARG   H33       ASP   L23       GLU   M76
ARG   H34       ASP   L60       GLU   M171
ARG   H37       ASP   L155      GLU   M232
ARG   L103      ASP   L218      GLU   M261
ARG   L135      ASP   M27       LEU-  H258
ARG   L217      ASP   M43       LYS   H205
ARG   L231      ASP   M80       LYS   M40
ARG   M86       ASP   M182      LYS   M298
ARG   M130      CYS-  C1

The charge for each of these amino acids is set to zero in the partially neutralized (P) charge model, defined in Section 2.2.3.

Figure 2.1: Position of the membrane and the various dielectric boundary conditions. The amino acids drawn are the possibly ionizable groups. The position of the membrane is denoted by the two parallel planes. The three dielectric models are defined in Section 2.2.3.
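The geometric definition of the membrane region used for the P charge model can be sketched as a simple slab test along the z axis defined in Section 2.2.2. The code below is an illustration, not the selection script used in this work: the function names are introduced here, and the iron position and test points are hypothetical coordinates in Å.

```python
import numpy as np

Z_MIN, Z_MAX = -31.3, 12.5   # slab bounds in Å; Z_MAX - Z_MIN is 43.8 Å


def z_coordinate(point, sp_center, iron):
    """Project `point` onto the axis running from the non-heme iron to the
    center of mass of the special-pair nitrogens (the origin)."""
    sp_center = np.asarray(sp_center, dtype=float)
    axis = sp_center - np.asarray(iron, dtype=float)
    axis = axis / np.linalg.norm(axis)
    return float(np.dot(np.asarray(point, dtype=float) - sp_center, axis))


def in_membrane(point, sp_center, iron):
    """True if the point lies inside the membrane slab."""
    return Z_MIN < z_coordinate(point, sp_center, iron) < Z_MAX


# Hypothetical geometry: origin at the SP, iron placed on the negative z axis.
SP_CENTER = (0.0, 0.0, 0.0)
IRON = (0.0, 0.0, -25.0)

buried = in_membrane((5.0, 5.0, -10.0), SP_CENTER, IRON)
exposed = in_membrane((0.0, 0.0, 20.0), SP_CENTER, IRON)
```

A residue would then be neutralized in the P model when the position chosen to represent its ionizable group passes this test; which atom represents the group is a modeling choice not specified here.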

2.3 Results and discussion

We have calculated the diabatic free energy surfaces for the nine combinations of charge and dielectric models. Figures 2.2, 2.3 and 2.4 show the $\mathcal{E}_\perp = 0$ and $\mathcal{E}'_\perp = 0$ projections of the diabatic surfaces for the $S_{2:2:80}$, $P_{2:2:80}$, and $N_{2:2:80}$ models. The calculated values for $\Delta E^{(CO)}$, $E^{(RF)}$, and $\lambda$ are presented in Table 2.2. The free energies of transfer for the $S_{2:2:80}$, $P_{2:2:80}$, and $N_{2:2:80}$ models are shown in Table 2.3. Clearly, there are many data to analyze. In the discussion to follow, we use the following three questions to guide our analysis:

1. To what extent does the calculated diabatic free energy account for the experimental data?

2. What trends or invariants are present among the models? What happens when there is more or less charge neutralization or more or less dielectric screening? What things are generally true in all the models?

3. What is going on specifically in our models that gives us these results? How do we explain trends? What are the “important” degrees of freedom in the system?

Question 1 is addressed in Section 2.3.1; Question 2, in Section 2.3.2; and Question 3, in Sections 2.3.3 to 2.3.8.

Before we address these three questions, we need to examine the estimated errors of these calculations. Here, we present the conclusions of a detailed error estimate given in the Appendix of this chapter. We estimate that the statistical uncertainties in $\Delta E^{(CO)}$ and $E^{(RF)}$ are each on the order of 1 kcal/mol, while the uncertainty in $\lambda$ is approximately 1 kcal/mol. Hence, we estimate overall statistical errors on the order of 2 kcal/mol in our calculation of $\Delta G$.

Table 2.2: Calculated values for ∆E(CO), E(RF), and λ (a)

Standard (S) charge model
                            ∆E(CO)_1i                 E(RF)_1i
state i (b)               2:2    2:80  2:2:80       2:2    2:80  2:2:80     λ_1i
2  ($P^+B_L^-$)          -0.2     1.4     1.9      -6.4    -9.9    -6.3      3.6
3  ($P^+H_L^-$)         -22.0   -11.4   -18.3     -13.9   -20.0   -14.3      4.3
2' ($P^+B_M^-$)           0.7    -1.7     0.8      -9.3   -12.1    -9.2      3.0
3' ($P^+H_M^-$)         -13.8    -7.7   -10.3     -11.9   -16.9   -12.5      5.2

Partially neutralized (P) charge model
                            ∆E(CO)_1i                 E(RF)_1i
state i                   2:2    2:80  2:2:80       2:2    2:80  2:2:80     λ_1i
2  ($P^+B_L^-$)           1.1     3.9     2.8      -6.4    -9.9    -6.3      2.4
3  ($P^+H_L^-$)         -14.0    -6.6   -11.3     -13.9   -20.0   -14.3      4.6
2' ($P^+B_M^-$)          -3.3    -1.9    -1.8      -9.3   -12.1    -9.2      2.9
3' ($P^+H_M^-$)         -14.0    -5.5    -9.1     -11.9   -16.9   -12.5      4.8

Fully neutralized (N) charge model
                            ∆E(CO)_1i                 E(RF)_1i
state i                   2:2    2:80  2:2:80       2:2    2:80  2:2:80     λ_1i
2  ($P^+B_L^-$)           4.8     4.5     5.2      -6.4    -9.9    -6.3      2.7
3  ($P^+H_L^-$)          -0.9    -2.6    -1.5     -13.9   -20.0   -14.3      3.9
2' ($P^+B_M^-$)           0.4    -0.4     0.8      -9.3   -12.1    -9.2      2.2
3' ($P^+H_M^-$)          -3.4    -2.6    -2.3     -11.9   -16.9   -12.5      3.9

(a) Three of four components (∆E(CO), E(RF) and λ) of ∆G (Eq. 2.22) are presented for the nine combinations of dielectric and charge models (Section 2.2.3). A discussion of the energy components of ∆G is given in Section 2.2.2. All energies are given in kcal/mol.
(b) The charge states of interest are defined in Table 1.1.

Table 2.3: Free energies of transfer (a)

Standard (S) charge model
state i (b)             ∆E(0)_1i   ∆E(CO)_1i   E(RF)_1i    λ_1i    ∆G_1i
2  ($P^+B_L^-$)           24.9        1.9        -6.3       3.6     16.9
3  ($P^+H_L^-$)           30.9      -18.3       -14.3       4.3     -6.0
2' ($P^+B_M^-$)           24.9        0.8        -9.2       3.0     13.6
3' ($P^+H_M^-$)           30.9      -10.3       -12.5       5.2      2.9

Partially neutralized (P) charge model
state i                 ∆E(0)_1i   ∆E(CO)_1i   E(RF)_1i    λ_1i    ∆G_1i
2  ($P^+B_L^-$)           18.1        2.8        -6.3       2.4     12.3
3  ($P^+H_L^-$)           24.1      -11.3       -14.3       4.6     -6.0
2' ($P^+B_M^-$)           18.1       -1.8        -9.2       2.9      4.3
3' ($P^+H_M^-$)           24.1       -9.1       -12.5       4.8     -2.3

Fully neutralized (N) charge model
state i                 ∆E(0)_1i   ∆E(CO)_1i   E(RF)_1i    λ_1i    ∆G_1i
2  ($P^+B_L^-$)            7.7        5.2        -6.3       2.7      3.9
3  ($P^+H_L^-$)           13.7       -1.5       -14.3       3.9     -6.0
2' ($P^+B_M^-$)            7.7        0.8        -9.2       2.2     -2.8
3' ($P^+H_M^-$)           13.7       -2.3       -12.5       3.9     -4.9

(a) The four components (∆E(0), ∆E(CO), E(RF) and λ) of ∆G (Eq. 2.22) are presented for the $S_{2:2:80}$, $P_{2:2:80}$ and $N_{2:2:80}$ models (Section 2.2.3). A discussion of the energy components of ∆G is given in Section 2.2.2. All energies are given in kcal/mol.
(b) The charge states of interest are defined in Table 1.1.
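As a consistency check on the tabulated free energies, the four components listed in Table 2.3 can be summed according to Equation 2.22 and compared with the tabulated ∆G. The sketch below does only that; the dictionary layout and names are introduced here, and the numbers are copied from Table 2.3 (rows in the order 2, 3, 2', 3').

```python
# Each row: (dE0, dE_CO, E_RF, lambda, tabulated dG), all in kcal/mol.
TABLE_2_3 = {
    "S": [(24.9, 1.9, -6.3, 3.6, 16.9), (30.9, -18.3, -14.3, 4.3, -6.0),
          (24.9, 0.8, -9.2, 3.0, 13.6), (30.9, -10.3, -12.5, 5.2, 2.9)],
    "P": [(18.1, 2.8, -6.3, 2.4, 12.3), (24.1, -11.3, -14.3, 4.6, -6.0),
          (18.1, -1.8, -9.2, 2.9, 4.3), (24.1, -9.1, -12.5, 4.8, -2.3)],
    "N": [(7.7, 5.2, -6.3, 2.7, 3.9), (13.7, -1.5, -14.3, 3.9, -6.0),
          (7.7, 0.8, -9.2, 2.2, -2.8), (13.7, -2.3, -12.5, 3.9, -4.9)],
}


def free_energy(e0, e_co, e_rf, lam):
    """Equation 2.22: dG = dE0 + dE_CO + E_RF - lambda."""
    return e0 + e_co + e_rf - lam


max_deviation = max(
    abs(free_energy(*row[:4]) - row[4])
    for rows in TABLE_2_3.values()
    for row in rows
)
```

The recomputed sums agree with the tabulated ∆G to within about 0.1 kcal/mol, which is the size of discrepancy one would expect from rounding each component to one decimal place and is well below the estimated statistical error of 2 kcal/mol.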

2.3.1 Calculated diabatic free energy surfaces

Figures 2.2, 2.3 and 2.4 show the $\mathcal{E}_\perp = 0$ and $\mathcal{E}'_\perp = 0$ projections of the diabatic surfaces for the $S_{2:2:80}$, $P_{2:2:80}$, and $N_{2:2:80}$ models. The calculated values for $\Delta E^{(CO)}$, $E^{(RF)}$, and $\lambda$ are presented in Table 2.2. The free energies of transfer for the $S_{2:2:80}$, $P_{2:2:80}$, and $N_{2:2:80}$ models are shown in Table 2.3. Finally, Table 2.4 lists the $\Delta E^{(0)}_{1i}$ for the various charge models. Models with the 2:2:80 dielectric boundary conditions have been explored in greater depth because we think that the 2:2:80 model is the most realistic of the models; the other two are used only to illustrate the effects of varying the strength of dielectric response in our modeling.

The one experimentally known free energy ($\Delta G_{13}$) cannot be used to assess the accuracy of the models because the diabatic free energy gives the right value of $\Delta G_{13}$ by construction. Nevertheless, there are still various pertinent issues concerning the diabatic surfaces.

The first issue is whether the values derived for $\Delta E^{(0)}$ are realistic. Recall that we adjust $\Delta E^{(0)}_{13}$ to give the desired $\Delta G_{13}$ and then set $\Delta E^{(0)}_{12}$ relative to this value of $\Delta E^{(0)}_{13}$. Since no direct physical measurement of $\Delta E^{(0)}$ has been made, we cannot make a definitive assessment of our calculated $\Delta E^{(0)}$. If we take the estimate of Thompson and Zerner (tz) as a guide for $\Delta E^{(0)}$, we must estimate the uncertainties in the calculation. As mentioned in Section 2.2.2, Gunner et al. raise the possibility that the tz calculation of $\Delta E^{(0)}_{12}$ is in error by about 6 kcal/mol [27]. However, Warshel, Parson, et al. argue that the tz estimate for $\Delta E^{(0)}_{13}$ is within 3 kcal/mol of a value consistent with their calculation of a related vacuum energy $\Delta E_{gas}$ (see Section A.2.1). Although these two calculations do not provide a conclusive statement on the validity of the tz $\Delta E^{(0)}$, one can reasonably entertain the possibility that the tz $\Delta E^{(0)}$ is accurate to about 6 kcal/mol.

A comparison between $\Delta E^{(0)}$ calculated in our models and the tz numbers is therefore given in Table 2.4. For $S_{2:2:80}$, the differences vary from 0.2 to 7.5 kcal/mol, while for $P_{2:2:80}$, differences vary from 0.7 to −6.6 kcal/mol. If the tz calculations have errors on the order of 6 kcal/mol, then $\Delta E^{(0)}$ for the $S_{2:2:80}$ and $P_{2:2:80}$ models are not unrealistic. In contrast, the $\Delta E^{(0)}$ calculated for the $N_{2:2:80}$ model differ from the tz values by −9.7 to −17.0 kcal/mol. In the fully neutralized model, the magnitude of $\Delta E^{(CO)}_{13}$ is much smaller, thus requiring a dramatic lowering of $\Delta E^{(0)}_{13}$ in order to make $\Delta G_{13}$ equal to −6 kcal/mol. If we assume an error estimate of 6 kcal/mol, these discrepancies seem unrealistically large, and the $N_{2:2:80}$ model does not appear to be a viable model for the prc.

The second issue is whether the calculated diabatic free energy surfaces support the experimental finding that the primary et proceeds along the l side instead of the m side. A way to answer the question is to check whether the kinetics implied by the diabatic free energy surfaces matches experimental kinetics. Performing a detailed calculation of the rate constant is not the goal here (primarily because of debates over the kinetic prefactor). Rather, the issue of interest here is whether these diabatic free energy surfaces are consistent with such gross behavior of the prc as the near-unity yield of transfer down the l branch instead of the m branch. To do a quick check of whether this functional asymmetry is reflected in the diabatic curves for all the models, we model the kinetics of the system as a three–state system (the three states being $P^*$, $P^+H_L^-$, and $P^+H_M^-$). The rate constant for transfer between different states is given by the golden rule expression (Equation 1.9):

$$k_{DA} = \frac{2\pi K^2}{\hbar}\,\frac{\exp(-\beta F^*_{DA})}{\sqrt{4\pi\beta^{-1}\lambda_{DA}}} \tag{2.24}$$

(although we assume, for the purpose of this estimate, that the prefactor is identical for transfer to all states). With these assumptions, we calculate the equilibrium population for each of these states. We find almost unity transfer along the l branch for the $S_{2:2:80}$ and $P_{2:2:80}$ models. However, for $N_{2:2:80}$, in which $\Delta G_{13} \approx \Delta G_{13'}$, the ratio of l to m branch transfer is 85% to 15%. Hence, because the experimentally observed ratio between l side and m side transfer is at least 200 : 1 [31] (see Section 1.2), the $N_{2:2:80}$ model does not yield the proper kinetics.

The third and final matter is whether the models point to a two–step or superexchange mechanism for primary et (see Section 1.2.1). In the cases of the $S_{2:2:80}$ and the $P_{2:2:80}$ models, state 2 ($P^+B_L^-$) still lies significantly above state 1 ($P^*$), by approximately 17 kcal/mol and 12 kcal/mol, respectively. In these cases, $P^+B_L^-$ is still too high with respect to $P^*$ to support an explicit intermediate. In the fully neutralized charge model, however, $\Delta G_{12}$ falls to approximately 4 kcal/mol. Although $\Delta G_{12}$ is still positive in this case, this model has the smallest $\Delta G_{12}$. However, the $N_{2:2:80}$ model requires a substantial adjustment in $\Delta E^{(0)}_{12}$ and $\Delta E^{(0)}_{13}$ to bring about this drop in $\Delta G_{12}$.

In summary, there is a definite contrast between the diabatic free energy surfaces constructed for the $S_{2:2:80}$ and $P_{2:2:80}$ models, on the one hand, and the $N_{2:2:80}$ model, on the other. The former models both have realistic values for $\Delta E^{(0)}$, demonstrate strong l–m asymmetry in primary transfer on the order of experimental observations, and point to a superexchange mechanism, with a large $\Delta G_{12}$. The fully neutralized model has $\Delta E^{(0)}$ that differ dramatically from the tz values, supports an unrealistically high population for m side et, and offers definitive evidence for neither a superexchange nor a two–step mechanism. This contrast suggests that with realistic modeling of the dielectric boundary conditions (represented by the 2:2:80 model), one can obtain models that fit the experimental value of $\Delta G_{13}$ with plausible values for $\Delta E^{(0)}$ using the standard or partially neutralized model (no charge neutralization or a very limited amount of charge neutralization). On the other hand, neutralizing all the ionizable groups does not yield results that agree with experimental findings.

Figure 2.2: Diabatic free energy surfaces for the $S_{2:2:80}$ model. Panels (a) and (b) present the l and m diabatic free energy surfaces (Equation 2.1) for the $S_{2:2:80}$ model, plotted as free energy (kcal/mol) against $\mathcal{E}_\parallel$ and $\mathcal{E}'_\parallel$ (kcal/mol) for states 1, 2, 3 and 1, 2', 3', respectively. The $\mathcal{E}_\perp = 0$ and $\mathcal{E}'_\perp = 0$ planes are shown. Not only do these curves incorporate parameters rescaled according to Equation 2.18, but they include values for $\Delta E^{(0)}_{12}$, $\Delta E^{(0)}_{13}$, $\Delta E^{(0)}_{12'}$, and $\Delta E^{(0)}_{13'}$ as listed in Table 2.4.

Figure 2.3: Diabatic free energy surfaces for the $P_{2:2:80}$ model. Panels (a) and (b) present the l and m diabatic free energy surfaces (Equation 2.1) for the $P_{2:2:80}$ model, plotted as free energy (kcal/mol) against $\mathcal{E}_\parallel$ and $\mathcal{E}'_\parallel$ (kcal/mol) for states 1, 2, 3 and 1, 2', 3', respectively. The $\mathcal{E}_\perp = 0$ and $\mathcal{E}'_\perp = 0$ planes are shown. Not only do these curves incorporate parameters rescaled according to Equation 2.18, but they include values for $\Delta E^{(0)}_{12}$, $\Delta E^{(0)}_{13}$, $\Delta E^{(0)}_{12'}$, and $\Delta E^{(0)}_{13'}$ as listed in Table 2.4.

Figure 2.4: Diabatic free energy surfaces for the $N_{2:2:80}$ model. Panels (a) and (b) present the l and m diabatic free energy surfaces (Equation 2.1) for the $N_{2:2:80}$ model, plotted as free energy (kcal/mol) against $\mathcal{E}_\parallel$ and $\mathcal{E}'_\parallel$ (kcal/mol) for states 1, 2, 3 and 1, 2', 3', respectively. The $\mathcal{E}_\perp = 0$ and $\mathcal{E}'_\perp = 0$ planes are shown. Not only do these curves incorporate parameters rescaled according to Equation 2.18, but they include values for $\Delta E^{(0)}_{12}$, $\Delta E^{(0)}_{13}$, $\Delta E^{(0)}_{12'}$, and $\Delta E^{(0)}_{13'}$ as listed in Table 2.4.

Table 2.4: Comparison between ∆E(0)_1i and the Thompson–Zerner estimates (a)

∆E(0) (b)          Value (b)    tz values (b,c)    Difference (b,d)

$S_{2:2:80}$
∆E(0)_12             24.9           23.7               1.2
∆E(0)_13             30.9           23.4               7.5
∆E(0)_12'            24.9           24.7               0.2
∆E(0)_13'            30.9           25.8               5.1

$P_{2:2:80}$
∆E(0)_12             18.1           23.7              -5.6
∆E(0)_13             24.1           23.4               0.7
∆E(0)_12'            18.1           24.7              -6.6
∆E(0)_13'            24.1           25.8              -1.7

$N_{2:2:80}$
∆E(0)_12              7.7           23.7             -16.0
∆E(0)_13             13.7           23.4              -9.7
∆E(0)_12'             7.7           24.7             -17.0
∆E(0)_13'            13.7           25.8             -12.1

(a) Section 2.2.2 explains their calculation. The adjusted ∆E(0)_13 is that which gives ∆G_13 = −6.0 kcal/mol for the model in question. The adjusted ∆E(0)_12 = ∆E(0)_13 − 6 kcal/mol. ∆E(0)_12' = ∆E(0)_12 and ∆E(0)_13' = ∆E(0)_13.
(b) Energies in kcal/mol.
(c) tz values are from Thompson and Zerner [84].
(d) Difference denotes the difference of the calculated ∆E(0) from the tz values.
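The l versus m comparison above can be illustrated with a short Boltzmann estimate of the equilibrium populations of the two charge-separated states. This is one plausible reading of the estimate described in this section, not a reproduction of the original calculation: the temperature of 300 K, the function names, and the use of pure equilibrium weights are assumptions made here, while the free energies are the ∆G values for states 3 and 3' from Table 2.3.

```python
import math

K_B = 0.0019872  # Boltzmann constant in kcal/(mol K)


def equilibrium_populations(free_energies, temperature=300.0):
    """Boltzmann populations for states with the given free energies."""
    beta = 1.0 / (K_B * temperature)
    weights = [math.exp(-beta * g) for g in free_energies]
    total = sum(weights)
    return [w / total for w in weights]


def l_branch_fraction(dg_13, dg_13_prime, temperature=300.0):
    """Share of charge-separated population on the l branch."""
    # Free energies are relative to P*, which is taken as the zero.
    _, p_l, p_m = equilibrium_populations([0.0, dg_13, dg_13_prime],
                                          temperature)
    return p_l / (p_l + p_m)


fraction_s = l_branch_fraction(-6.0, 2.9)    # S(2:2:80)
fraction_p = l_branch_fraction(-6.0, -2.3)   # P(2:2:80)
fraction_n = l_branch_fraction(-6.0, -4.9)   # N(2:2:80)
```

Under these assumptions the S and P models place nearly all of the population on the l branch, whereas the N model gives a split close to the 85% to 15% ratio quoted in the text, because its two product states differ by only about 1 kcal/mol.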

2.3.2 Large scale trends in the data and qualitative picture

In this section, we ask: What patterns exist in ∆G and its components? Which trends hold in light of the error estimates? If there are robust patterns that hold for a wide range of models, then we can be more confident that these patterns reflect deep essentials of the system. Moreover, is there a way to interpret all these patterns in terms of a simple qualitative picture? By a qualitative picture, we mean one based on simple, plausible assumptions and a few back-of-the-envelope calculations. We present such a picture for the prc and determine which details it can account for and which details require other explanations.

Here we examine each of the four component energy terms that comprise the thermodynamic driving force, ∆G. For each of these terms, we look for various trends. What is the relative placement of states on a given branch, that is, the interrelationship among states 1, 2, and 3 ($P^*$, $P^+B_L^-$, and $P^+H_L^-$) for the l branch and states 1, 2′, and 3′ ($P^*$, $P^+B_M^-$, and $P^+H_M^-$) for the m branch? This question bears on the issue of superexchange vs. two-step mechanisms. We also examine the placement of corresponding states for the two branches (i.e., $P^+B_L^-$ vs $P^+B_M^-$ and $P^+H_L^-$ vs $P^+H_M^-$). Studying the relationship between energetics along the two branches should help in understanding the origins of the l–m functional asymmetry of the prc. We consider also the effects of changing charge states and the treatment of the environmental dielectric boundary conditions on the energetics. Which patterns depend heavily on the exact treatment of the charge states and dielectric boundary conditions?

Trends in $\Delta E^{(CO)}$

First consider $\Delta E^{(CO)}$, the electrostatic potentials on the various chromophores due to coulombic interaction with the protein. We use the following crude picture as a first attempt to understand the behavior of $\Delta E^{(CO)}$. The chromophores sit in an electrostatic potential generated by the partial charges of the surrounding protein complex. We imagine the potential to vary uniformly, specifically linearly, at least over the chromophore region. For example, the electric potential at the point halfway between SP and HL is the average of the potentials at the two chromophores. Through this picture, we are replacing the granularity of the actual charge distribution with a smooth distribution. In this type of picture, we would expect the following: Because the distance between SP and HL is greater than the distance between SP and BL, we might expect that $|\Delta E_{13}^{(CO)}| > |\Delta E_{12}^{(CO)}|$ (with a similar pattern on the m side). With increased charge neutralization (S to P to N), we expect to see the magnitudes of $\Delta E^{(CO)}$ decrease. With increased dielectric screening (2:2 to 2:2:80 to 2:80), we should also see diminishing electrostatic gaps.
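The linear-potential picture can be illustrated with a short sketch. It assumes a potential of the form phi(r) = phi0 + g · r and uses made-up chromophore positions and a made-up gradient; none of these numbers come from the simulations, and the function and variable names are hypothetical. It only shows the two consequences used in the argument: the midpoint potential is the average of the endpoint potentials, and the more distant chromophore (measured along the gradient) sees the larger gap.

```python
def dot(a, b):
    """Dot product of two 3-vectors given as tuples."""
    return sum(x * y for x, y in zip(a, b))

def linear_potential(position, phi0, gradient):
    """Potential that varies linearly in space: phi(r) = phi0 + gradient . r."""
    return phi0 + dot(gradient, position)

# All numbers below are hypothetical and chosen only for illustration.
phi0 = 0.0
gradient = (0.0, 0.0, -1.0)   # potential change per angstrom, along z
r_sp = (0.0, 0.0, 0.0)        # special pair (SP)
r_b = (6.0, 0.0, -8.0)        # accessory bacteriochlorophyll (BL)
r_h = (10.0, 0.0, -14.0)      # bacteriopheophytin (HL)

phi_sp = linear_potential(r_sp, phi0, gradient)
phi_b = linear_potential(r_b, phi0, gradient)
phi_h = linear_potential(r_h, phi0, gradient)

# Potential differences relative to the special pair.
gap_12 = phi_b - phi_sp
gap_13 = phi_h - phi_sp

# Potential at the point halfway between SP and HL.
midpoint = tuple(0.5 * (a + b) for a, b in zip(r_sp, r_h))
phi_mid = linear_potential(midpoint, phi0, gradient)

print(gap_12, gap_13, phi_mid)
```

In a strictly linear potential the gap depends on the displacement along the gradient rather than on the full distance, which is one reason the picture should be read as a rough guide only.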


To what extent does this simplistic picture capture the details? The expectation that $|\Delta E_{13}^{(CO)}| > |\Delta E_{12}^{(CO)}|$ and $|\Delta E_{13'}^{(CO)}| > |\Delta E_{12'}^{(CO)}|$ is indeed confirmed for all the S and P charge models for every dielectric boundary condition. That is, electrostatic interactions of the protein give rise to a larger difference in the electric potential between the special pair and the bacteriopheophytin than between the special pair and bacteriochlorophyll of both the l and m branches (at least in the S and P models). This trend continues to hold in the N charge model for the m side but not for the l branch, for which the N models show the opposing trend in every dielectric boundary condition. This contrast between the behavior of the S and P charge models, on the one hand, and the N models, on the other, is the first of many to be described here.

We next consider the role played by increasing dielectric screening, moving from the 2:2 to the 2:2:80 and then to the 2:80 dielectric models. The prediction that increased dielectric screening causes smaller gaps is borne out with $|\Delta E_{13}^{(CO)}|$ and $|\Delta E_{13'}^{(CO)}|$, but only in the S and P models. In the N models, $|\Delta E_{13}^{(CO)}|$ and $|\Delta E_{13'}^{(CO)}|$ are already small and remain relatively constant. In contrast to $|\Delta E_{13}^{(CO)}|$ and $|\Delta E_{13'}^{(CO)}|$, $|\Delta E_{12}^{(CO)}|$ and $|\Delta E_{12'}^{(CO)}|$, already relatively small in magnitude (< 5 kcal/mol), do not change much with varying dielectric screening.

What happens when the amount of charge neutralization increases (that is, as we move from the S to the P to the N models)? There are similar types of contrasts involved here (between states 2 ($P^+B_L^-$) and 3 ($P^+H_L^-$), that is). For all dielectric boundary conditions, $|\Delta E_{13}^{(CO)}|$ and $|\Delta E_{13'}^{(CO)}|$ decrease as charge neutralization increases. For example, the values of $\Delta E_{13}^{(CO)}$ are -18, -11, and -1 kcal/mol in the S2:2:80, P2:2:80, and N2:2:80 models, respectively. In contrast, $|\Delta E_{12}^{(CO)}|$ actually increases, while $|\Delta E_{12'}^{(CO)}|$ stays fairly stable with increasing neutralization.

There are other trends that do not actually contradict the qualitative picture but which are not predicted in the context of this picture. One such example is the effect of the protein on the relative energetics of the various charge states. In every model are the following conditions: $E_3^{(CO)}$ […]

[…] $|E_{12}^{(RF)}|$ and $|E_{13'}^{(RF)}| > |E_{12'}^{(RF)}|$ because of the greater distance that separates the special pair from the pheophytin than from the bacteriochlorophylls. To a good zeroth order approximation, the prc has rough symmetry about the axis joining the SP with the non-heme iron, and hence, we should expect $E_{12}^{(RF)}$ to be roughly equal to $E_{12'}^{(RF)}$, and $E_{13}^{(RF)}$ to be roughly equal to $E_{13'}^{(RF)}$. The numbers from the delphi calculations depend on the exact shape of the dividing surface of the prc. There is rough symmetry (within 4 kcal/mol) between l and m, but the $E^{(RF)}$ terms do work to lower state 3 with respect to state 3′ and to raise state 2 with respect to 2′. Finally, $E^{(RF)}$ for the 2:2 and 2:2:80 dielectric models are very similar (within 1 kcal/mol of each other), while both are significantly smaller in magnitude than the corresponding energies in the 2:80 model.

Trends in the overall driving force ∆G

When we consider the net effect of these individual energy terms as manifested in the overall thermodynamic driving force (Table 2.3), we find that $\Delta G_{13} < \Delta G_{13'}$ and $\Delta G_{12} > \Delta G_{12'}$ for all three of the models with the 2:2:80 dielectric boundary condition. Before actually performing the detailed calculations, it is difficult to predict


trends in ∆G. The net driving force ∆G is a balance between various terms: λ and $E^{(RF)}$ are inherently negative, while $\Delta E^{(CO)}$ can be positive, negative, or zero. Hence, it is unclear how ∆G as a whole should behave in the different models.

There are clearly limitations on our simple qualitative picture of the prc. A number of reasons exist not to expect great accuracy from such a picture. First of all, the charges in the models are not distributed continuously but discretely. It is not surprising that $\Delta E_{13}^{(CO)}$ would follow the predicted trends better than $\Delta E_{12}^{(CO)}$. The distance involved for the former is larger, and the continuum-type charge behavior which underlies the qualitative picture becomes more apparent at greater distances. The qualitative picture also fails to explain the contrast between the behavior of the S and P charge models vs the behavior of the N model. Through a detailed analysis of the “individual contributions” to $\Delta E^{(CO)}$ (given in the next section (2.3.3)), we will be able to rationalize this failure.

2.3.3 Explaining the trends: The various charge and dielectric models

In the remainder of this chapter, we return to studying the various gaps we have identified as relevant to key questions about the function/structure relationship of the prc. The qualitative picture explains some trends but not others. How do we account for this? How do we move beyond the qualitative study? The strategy we take here is to examine in detail the prc at the level of amino acid residues and their contributions to the various gaps of interest. The relevant questions then become: Can we explain the trends (and breaks in trends) in terms of individual contributors? How do changing charge neutralization and dielectric response manifest themselves on this level? What are the important degrees of freedom in this system: are they to be uncovered at the level of single residues, or are there larger scale collective effects going on? If there are key residues in the function of the prc, are they going to be discovered by looking at the key contributors to the various gaps that we have identified as interesting? We first consider the role played by the dielectric boundary condition. As discussed


above, we expect that as the amount of dielectric screening is increased (going from 2:2 to 2:2:80 to 2:80), $\Delta E^{(CO)}$ should diminish in magnitude. To get an intuitive feel for the behavior of $\Delta E^{(CO)}$ under the various dielectric boundary conditions, we visualize and study the Green’s function. In other words, we can look at the electrostatic response to a single point charge. Figure 2.5 shows the electrostatic potential throughout the prc due to a single point charge in the middle of the special pair in the 2:2, 2:80, and 2:2:80 dielectric boundary conditions. The potential is shown at the positions of the atoms of the prc. In the 2:2 dielectric boundary condition, there is a slow 1/r drop in the Green’s function. The decay of the Green’s function is dramatically more pronounced for the 2:2:80 model and even greater for the 2:80 model.

It is possible to make this argument more quantitative. So far, we have presented the electrostatic gaps ($\Delta E^{(CO)}$) as a net sum. Of course, these gaps are sums of electrostatic gaps from the individual constituents of the prc. There are many ways to partition the complex. Here, we break the prc into what we would term the “residual level”: the amino acids, chromophores, water molecules, and various free-floating units. We then study a net electrostatic gap (e.g., $\Delta E_{13}^{(CO)}$) as the sum of residual contributions (the part of $\Delta E_{13}^{(CO)}$ due to the charge distributions in a particular residue). This method of looking at these individual residual contributors provides the basis of most of the following discussion.

We use various ways to present these contributors. One way we frequently use is to plot the individual electrostatic contribution of all the residues vs the z–coordinate of the center of the contributor. (See Section 2.2.3 for the definition of the coordinate system we use.) These plots allow us to see roughly the locations and magnitudes of all the contributors at once. One can also pick spatial coordinates other than the z–direction to specify position. We will see, however, that the selection of the z–direction (being perpendicular to the membrane) demonstrates both the role played by the membrane and the dependence of $\Delta E^{(CO)}$ on the distance of separation for contributors. We can now use this type of analysis to better understand the various questions we have posed. First consider the role played by the dielectric boundary condition.
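As a minimal sketch of this bookkeeping, the following treats a net gap as the sum of residue-level contributions, orders the contributions along z for plotting, and ranks them by magnitude. The `Contribution` record, the function names, and all numerical values are hypothetical placeholders for illustration; they are not taken from the simulations.

```python
from dataclasses import dataclass

@dataclass
class Contribution:
    residue: str   # label of the residue-level unit
    z: float       # z-coordinate of its center, in angstroms
    energy: float  # its contribution to the gap, in kcal/mol

# Hypothetical values for illustration only.
contributions = [
    Contribution("res_A", -12.0, 3.0),
    Contribution("res_B", 4.0, -5.5),
    Contribution("res_C", 18.0, 1.2),
    Contribution("res_D", 47.0, -0.3),
]

def net_gap(items):
    """Net electrostatic gap as the sum of residue-level contributions."""
    return sum(c.energy for c in items)

def profile_along_z(items):
    """(z, energy) pairs ordered by z, ready to plot contribution vs z."""
    return [(c.z, c.energy) for c in sorted(items, key=lambda c: c.z)]

def largest_contributors(items, n):
    """The n contributors with the largest magnitude, in descending order."""
    return sorted(items, key=lambda c: abs(c.energy), reverse=True)[:n]

print(net_gap(contributions))
print(profile_along_z(contributions))
print([c.residue for c in largest_contributors(contributions, 2)])
```

Ranking by magnitude rather than by signed value keeps large negative contributors visible, which matters when contributions of opposite sign nearly cancel in the net gap.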

[Figure 2.5 panels: (a) 2:2, (b) 2:80, (c) 2:2:80, (d) calibration (color scale)]

Figure 2.5: Green’s functions for dielectric models. The electrostatic response due to a single point charge in the middle of the special pair (SP) is plotted for the (a) 2:2, (b) 2:80, and (c) 2:2:80 dielectric boundary conditions. The potential is shown at the positions of the atoms of the prc. The spheres are colored according to a scheme (d) that maps the magnitude of the electrostatic potential at the sphere (given in units of kcal / e− Å) to a color. Potentials greater than 10 kcal / e− Å are plotted as 10 kcal / e− Å. Recall that the prc is 130 Å in length.


Figures 2.6, 2.7, and 2.8 show the contributors for the standard charge model in the 2:2, 2:2:80, and 2:80 boundary conditions, respectively. In all three cases, there is a general pattern of decay in the contribution as residues move away from the relevant sites (just what was demonstrated in our discussion of Figure 2.5). As the dielectric response increases, the length scale of this decay decreases significantly. For instance, there are still contributions on the order of 1 kcal/mol at z = 50 Å for 2:2, whereas contributions of this magnitude do not exist for z > 25 Å in the 2:2:80 model. The 2:2 dielectric model is essentially the electrostatic model of mgcn. It was argued by Marchi et al. that adding all the amino acid contributions is essential to calculating $\Delta E^{(CO)}$ accurately. Indeed, this point is illustrated by the slowly decaying contributions to the gaps vs z in Figure 2.6, the graph for the 2:2 model. In contrast, for the 2:2:80 dielectric boundary condition (and in particular the 2:80 dielectric model), the contributions of residues far away from the central chromophores are small regardless of their charge state.

Another way to illustrate the differences among the dielectric models is to plot the cumulative sum of the residual contributors (Φ(z)) versus z. (That is, Φ(z0) is the sum of all contributions to $\Delta E^{(CO)}$ of residues with z […]

[…] 0 (providing a rationale why electron transfer down the l branch is favorable over transfer down the m branch.) Note that $E_{13}^{(RF)} > E_{13'}^{(RF)}$. Perhaps the most interesting matter is the behavior of $\Delta E_{3'3}^{(CO)}$. (To what extent is the asymmetry located in the electrostatics of the protein complex?) The most striking instance of asymmetry occurs in the S charge model (in which $\Delta E_{3'3}^{(CO)} = 8$ kcal/mol). This asymmetry is small in the P and N charge models, however. We can account for these differences by looking at the largest residual contributors to $\Delta E_{3'3}^{(CO)}$. (See Table 2.10.)

We now turn to $\Delta G_{2'2}$. We have noted above that $\Delta G_{2'2} > 0$. There are various factors that make $\Delta G_{2'2} > 0$ (including the fact that $E_{12}^{(RF)} > E_{12'}^{(RF)}$). We have discussed some of the anomalous behavior of $\Delta G_{12}$ (in terms of following the qualitative picture) that is due primarily to TyrM208. We find that this residue is also a major contributor to $\Delta E_{2'2}^{(CO)}$; in fact, it is the largest one in the P and N charge models. Its contribution is positive, and therefore, TyrM208 is one of the main reasons why $\Delta G_{2'2} > 0$.

Because of the high $\Delta G_{12}$ in our models, we favor a superexchange mechanism rather than a two-step mechanism. Consequently, we would not locate the source of functional asymmetry in $\Delta G_{2'2}$ but more in $\Delta G_{3'3}$. $\Delta E_{12}^{(CO)}$ and $\Delta E_{12'}^{(CO)}$ come into the exponential prefactor of the rate term, whereas $\Delta G_{13}$ and $\Delta G_{13'}$ are reflected in the activation barrier in the exponential factor.


Table 2.10: Major residual contributors to $\Delta E_{3'3}^{(CO)}$

      S2:2:80              P2:2:80              N2:2:80
residue     value    residue     value    residue     value
Arg M130      6.2    Arg M251     -2.8    Pro L124     -1.4
Arg L103     -5.9    Lys L110     -2.1    Ala M215     -1.4
Arg H33      -3.1    Glu L212     -2.0    Gly M209      1.3
Glu L106      2.5    Glu L6        1.9    Ser L178     -1.2
Asp H36       2.4    Arg M134      1.7    Gly L188      1.2
Asp L218     -2.4    Pro L124     -1.4    Tyr M208      1.1
Glu L212     -2.1    Ala M215     -1.3    Val L117      1.1
Arg M251     -2.0    Glu M244      1.3    Trp M127      1.0
Asp M27      -2.0    Asp M2        1.3    Val L123     -1.0
Lys L110     -2.9    Arg L109     -1.2    Val L182     -1.0

The largest residual contributors by magnitude to $\Delta E_{3'3}^{(CO)}$ for the S2:2:80, P2:2:80, and N2:2:80 models are listed along with their electrostatic contributions. All energies are listed in kcal/mol.


We are commenting on this gap more as a response to the suggestion made by Gunner et al. that the asymmetry is located more at the level of the bacteriochlorophylls than at the bacteriopheophytins. The electrostatic calculations of Gunner et al. are basically the same as the S2:2:80 model, except that the contribution of λ to ∆G is either ignored or accounted for in an “effective” manner [27]. The differences between their conclusions and ours can be traced to the treatment of two different residues. They find “a significant difference in potential at BL and BM and a smaller difference between HL and HM” and argue that the major source of asymmetry is really at the level of the bacteriochlorophylls and not at the pheophytins. We find a bigger difference between HL and HM. One possible source of difference could be the orientation used for TyrM208. The differences calculated for $\Delta E_{3'3}^{(CO)}$ could result from different treatment of the charge state for the lysine terminus of the c protein. In the S model, it is given a net neutral charge (the positive charge given to the lysine terminus of the c protein as a base is cancelled by the negative charge assigned to it because it is a C–terminus). In Gunner’s model, it is charged.

2.3.8 Comments on the validity of the partially neutralized and fully neutralized charge model

It is important to realize that the partially neutralized model is an ansatz based on two assumptions: 1) that the membrane is actually where it was placed in the calculations and 2) that only the ionizable groups in this membrane are neutralized, while all the potentially ionizable groups outside the membrane region are assigned the charge state expected in water. Because neither the position of the membrane nor the actual charge states have been conclusively determined experimentally, we do not claim that the charge states used in the partially neutralized model are an accurate guide to the actual charge states of amino acids in the prc. We wanted to study a charge model that has an amount of charge neutralization between that of the standard charge model and the completely neutral model, and the partially neutralized model provided a reasonable choice. That the partially neutralized model in the reasonable 2:2:80 boundary condition agrees with much of the experimental data (the value of ∆G13 and the prevalence of l–side over m–side et) strongly suggests that applying some neutralization (but not total neutralization) is the proper way to go. The question of what to neutralize remains unanswered. If the issue involved is the amount of error incurred in our delphi calculations from having a different membrane location, then we have estimates of the size of this systematic error in the Appendix (it is on the order of 1-1.5 kcal/mol).

In the course of the molecular dynamics simulations for the P and N charge models, certain of the crystallographically resolved waters in the complex escaped from the interior of the prc. Specifically, water #46⁴ escapes in both the P and N models, whereas water #197 leaves the N model interior. There are at least two ways to account for this behavior. The first is to argue that the lack of water on the outside of the prc in our simulations has set up a chemical potential gradient that draws out interior water molecules. However, this does not explain why no waters are seen to escape the interior of the S charge models. Another possible explanation is that the departure of crystallographically resolved waters from the complex indicates an error in the modeling of amino groups around these waters: certain residues neutralized in the P and N charge models should be left charged. We test this hypothesis by determining whether the escaped waters actually originate close to groups which have been neutralized. We determined that water #46 is located within 2 Å of CysC1 (which, as an N–terminus within the membrane region, was neutralized in the P model). Water #197 is located close to many residues that are neutralized for the N charge model; it is unclear that we can identify one amino acid whose change in charge state would cause the departure of the water. We altered the P charge model by restoring CysC1 to its charged state and began the md run. We find that after approximately 50 ps, water #46 remains within 2 Å of its starting position. This suggests that CysC1 should not have been neutralized. We could try something similar with water #197 by charging various residues for the N charge models, but such a procedure would violate the spirit of the fully neutralized model. If it works out that water #46 does not escape once the neighboring amino acid is restored to its charged state, this could indicate that the escape of waters is an indicator of incorrect charge states. The point of the P charge models is to simulate a system with some of the possibly ionizable amino acids uncharged. We would like to figure out which ones. The N charge model represents one extreme, and tinkering with it by charging up a few groups does not provide significant new information.

⁴ The numbering for the water molecules is based on the order of the water molecules listed in the pdb file for Rps. viridis in the Brookhaven Protein Data Bank.

2.4 Conclusions

In this chapter, we present diabatic free energy surfaces calculated for various models for the prc. We have examined three different treatments for the charge states of ionizable groups (the S, P, and N charge models) and three different treatments for the dielectric response of the environment (the 2:2, 2:2:80, and 2:80 dielectric boundary conditions). For the nine combinations of charge models and dielectric conditions, we calculate three of the four components ($\Delta E^{(CO)}$, $E^{(RF)}$, and λ) that comprise the free energies of electron transfer (∆G). $\Delta E^{(0)}$ is constrained to yield the experimental value of $\Delta G_{13}$. We focus primarily on the 2:2:80 boundary condition as the most physically realistic of the three boundary conditions used.

Since $\Delta G_{13}$ is constructed to have the experimental value, we used other criteria to determine the credibility of the models. No definitive values exist for $\Delta E^{(0)}$. If we assume that the tz calculations of $\Delta E^{(0)}$ have uncertainties on the order of 6 kcal/mol, then the S2:2:80 and P2:2:80 models yield values for $\Delta E^{(0)}$ that are consistent with the tz estimates. Moreover, kinetics calculated for primary et in both these models showed l–m asymmetry in accordance with experiment. In contrast, the N2:2:80 model has unrealistically small values for $\Delta E^{(0)}$ (assuming the basic validity of the tz calculations). Furthermore, the N2:2:80 model implies an unrealistically high population for m side et. Hence, it seems that the N2:2:80 model is characterized by excessive charge neutralization. Our calculations provide evidence that using either the standard charge states or neutralizing some amino acids is proper for molecular modeling of the prc. In both the S2:2:80 and P2:2:80 models, $\Delta G_{12}$ is large and positive, lending support to a superexchange, rather than two–step, mechanism for primary et.

A “qualitative picture” of the prc provides a simple partial explanation for various “patterns” that we see in the models, especially in terms of the effects of the dielectric and charge models on $\Delta E^{(CO)}$. Looking at individual contributors is an effective way to understand the physics of the models and also to identify possibly significant amino acid residues. With varying dielectric screening, the range of significant contributors changes quite a bit. There are a number of what might be larger contributors. However, these gaps may be the results of a collective effect. Experiment will be needed to settle this question. There are loose waters in our partially neutralized and fully neutralized models, which may indicate that certain residues should remain charged.

$\Delta G_{12}$ is large and positive in the S2:2:80 and P2:2:80 models. Part of the reason is the orientation of TyrM208, but even using an opposite orientation still leaves models favoring a superexchange mechanism. Various amino acids which were found to be important in $\Delta E_{13}^{(CO)}$ and $\Delta E_{3'3}^{(CO)}$ are suitable possible targets for site-directed mutagenesis. The Warshel neutralization ansatz is not borne out in our calculations. mgcn does actually overestimate the range of important amino acids (if the 2:2:80 dielectric model is indeed more accurate than the 2:2 model). It is still difficult to make direct comparisons between our work and that of Warshel et al. The S2:2:80 model generally agrees with the work of Gunner et al., except that they do not treat explicitly λ, the reorganization energy.

2.A Appendix: Error analysis

We have estimated the expected errors in each of the four terms that together comprise the total thermodynamic driving force (∆G). Here, we summarize our analysis of the errors for each term.

2.A.1 Electrostatic interaction with the protein complex $E^{(CO)}$

There are a number of possible sources of error in our calculations of $E^{(CO)}$. First are the numerical errors incurred by delphi. The existence of glassiness and long-time motions causes uncertainty in the calculations (that is, long timescale motions of the protein actually shift the electrostatic gaps among different regimes, each with a metaequilibrium gap [97]). The uncertainty in the position and size of the membrane can also affect the value of $\Delta E^{(CO)}$. Finally, we expect some error from approximating average electrostatics over an entire MD trajectory as the electrostatics of an average configuration. We make estimates for the magnitudes of these possible errors.

To estimate the size of numerical errors involved in our delphi calculations, we compared an essentially analytic (infinite series) solution to the Tanford-Kirkwood problem to potentials calculated by the delphi methodology we used to calculate $\Delta E^{(CO)}$. That is, we used the same series of focussing grids, in this case, on a 60 Å sphere with an internal dielectric constant of 1 embedded in an external dielectric continuum with constant 80. A unit point charge was placed in the middle and test points were scattered throughout the sphere. In general, the numbers calculated with delphi were quite close (< 0.1 kcal/mol difference) to the exact answers. The largest difference came in a test point 5 Å away from the charge source (102 vs 104 kcal/mol). Although such a discrepancy could be worrisome if many charges were located within 5 Å of the centers of chromophores, other than the separation between TyrM208 and BL, the smallest separation of a substantial partial charge (0.4 electronic charge) from a chromophore center is 4.5 Å. In fact, once the test point is more than 10 Å away from the center, the errors fall to less than 0.1 kcal/mol. Therefore, we expect numerical errors in the delphi pb solver to contribute very little to the overall errors. Hence, we estimate errors on the order of 0.5 kcal/mol by a weighted average of this source. (It is complicated to get a “perfect” analysis here because the exact distance between the test charge and the point in space for which the net potential is calculated varies. If all the charges were more than 10 Å away, then we would expect net errors to be more around 0.1-0.2 kcal/mol. However, some charges are closer.)

To estimate statistical errors due to protein glassiness, we compared the electrostatic gaps in two md runs, one following the other, each 56 ps in length. We found that the differences in the electrostatic gaps ranged up to 0.8 kcal/mol. Is this difference statistically significant, thus indicating some sort of metastability in gap fluctuations? Plotting the cumulative averages of both runs against each other clearly displays this effect. Another approach is to assume that the gaps are characterized by one (short) “correlation time.” We calculated the autocorrelation function of $\Delta E_{13}^{(CO)}$ (in the 2:2 boundary condition) and found the correlation decay time to be approximately 88.5 fs. Assuming that each 88.5 fs segment of the gap is statistically independent and knowing the characteristic fluctuations in the electrostatic gaps, we estimated that expected errors in comparing two 56 ps trajectories should be on the order of 0.2 kcal/mol. Hence, a difference of 0.8 kcal/mol indicates that we might actually be comparing trajectories that reflect different averages. (See Figure 2.28.) In other words, the trajectories may be “metastable,” indicating glassy behavior of the protein. Based on this calculation, glassiness by itself may result in errors of up to 1 kcal/mol in the calculated potentials.

Uncertainty in the position of the actual biological membrane relative to the prc is a possible source of systematic error. To estimate its magnitude, we calculated $\Delta E^{(CO)}$ with the membrane in two different configurations: the position upon which all our calculations are based and the location used by Gunner et al. [27]. In the standard charge model (S2:2:80), we found that the gaps varied by about 1-1.5 kcal/mol between membrane locations. In doing this error analysis, we were determining the direct effect of using a different membrane location. That is, when we change the membrane location, we are not also changing the identity of amino acids that we neutralize in the P models. Hence, if we were actually to charge and uncharge groups

[Figure 2.28 plot: electrostatic energy gap (kcal/mol) versus time ( / 2.4 fs)]

Figure 2.28: Glassiness in the convergence of $\Delta E_{13}^{(CO)}$ in S2:2. The cumulative average of $\Delta E_{13}^{(CO)}$ calculated by md for two different equilibrated trajectories.

in the P2:2:80 model, we would expect larger errors since, according to the intent behind the partially neutralized model, changing the membrane location and size would also lead to altering the charge states of amino acids. We do not look at that source of change in this section and consider it a source of systematic, rather than random, errors.

Finally, we address the errors associated with calculating the electrostatic gaps as the electrostatics of an average configuration instead of averaging the potentials over an entire trajectory of configurations. To estimate the typical magnitude of this error, we calculated ∆E (CO) for ten different configurations for the P2:2:80 model, spread over a 4 ps trajectory. The difference between the average of ∆E (CO) and ∆E (CO) for the average position varied between 0.13 kcal/mol and 0.38 kcal/mol (with a mean of 0.25 kcal/mol).

Although it is possible that these sources of error are interrelated (i.e., statistically correlated), we assume that they are statistically uncorrelated for making an estimate of the overall statistical error for ∆E (CO) . In such a situation, the square of the overall error is the sum of squares of individual errors. We make an estimate of 2 kcal/mol for the expected errors in ∆E (CO) , based on error estimates of 0.5 kcal/mol for imprecisions in delphi calculations; 1 kcal/mol for protein glassiness; and 0.25 kcal/mol for calculating the electrostatics of the average configurations instead of the average electrostatics for an ensemble of configurations. There remains still the possibility of a 1-1.5 kcal/mol systematic error due to uncertainty in the membrane location, which is not included in the estimate of statistical errors. (Elsewhere we also explore the role played by TyrM208 in the calculations of ∆E (CO) . We treat this source of error as a source of systematic error rather than statistical uncertainty.)
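The two statistical ingredients of this section, the error of a time average when samples are correlated and the combination of uncorrelated errors, can be sketched as follows. This is only an illustration under stated assumptions: the variance of 9.0 (kcal/mol)^2 is borrowed from the S-model figure quoted in Section 2.A.3 and may not be the fluctuation size actually used for the glassiness estimate, and the function names are hypothetical.

```python
import math

def n_independent(traj_length_fs, corr_time_fs):
    """Number of independent samples if each correlation time counts once."""
    return traj_length_fs / corr_time_fs

def standard_error(variance, n_indep):
    """Standard error of a mean built from n_indep independent samples."""
    return math.sqrt(variance / n_indep)

def combine_in_quadrature(errors):
    """Overall error for uncorrelated sources: root of the sum of squares."""
    return math.sqrt(sum(e * e for e in errors))

# Assumption: gap variance borrowed from Section 2.A.3, in (kcal/mol)^2.
variance = 9.0
n = n_independent(56_000.0, 88.5)   # 56 ps trajectory, 88.5 fs correlation time

err_one = standard_error(variance, n)                 # one trajectory average
err_diff = combine_in_quadrature([err_one, err_one])  # difference of two averages

# Component estimates quoted in the text, in kcal/mol:
# delphi numerics, protein glassiness, average-configuration approximation.
total = combine_in_quadrature([0.5, 1.0, 0.25])

print(round(err_diff, 2), round(total, 2))
```

Under these assumptions the expected difference between two 56 ps averages comes out a little under 0.2 kcal/mol, in line with the order of magnitude quoted above, and the quadrature sum of the three components is roughly 1.1 kcal/mol, so the overall figure of 2 kcal/mol used in the text is the more conservative number.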

2.A.2 Reaction field $E^{(RF)}$

The statistical uncertainty in E (RF ) arises from uncertainty in the definition of the surface separating dielectric regions. To get some idea of the sensitivity due to this uncertainty, we calculated E (RF ) in the cases in which the radii defining the size of atoms in the pb calculations were increased by 10% and decreased by 10%. Because


the surface dividing the dielectric regions is defined by the outer limits of the atoms of the interior region, changing the atom radii is a quick way of shifting the entire dividing surface. (See Figure 2.29 for a sketch of this procedure. The effect of such size shifts is comparable to those due to imprecision in the definition of the boundary.) The values of E (RF ) changed by 0.6 to 1.7 kcal/mol with these radius shifts, with an average shift of approximately 1 kcal/mol, the figure we will use as our estimate of the statistical errors in E (RF ) .

Figure 2.29: Estimating errors in $E^{(RF)}$. Three calculations for $E^{(RF)}$ were performed: (a) the first uses the default atom radii in delphi; (b) the second uses radii that are reduced by 10% from the default radii; and (c) the third uses radii that are increased by 10% from the default radii. These sets of radii correspond to different shapes for the dielectric dividing surface and, therefore, different values of $E^{(RF)}$, which we use in our error estimates for $E^{(RF)}$.

2.A.3 Reorganization energy $\lambda$

Consider $\lambda_{13}$. Recall from Equation 2.12 and Equation 2.18 (which takes into account the rescaling of $\lambda_{13}$ by $\epsilon_\infty$) that $\lambda_{13} = \beta\langle(\delta\Delta E_{13})^2\rangle_1 / 2\epsilon_\infty$, where $\beta = 1/k_BT$. Hence the statistical uncertainty in $\lambda_{13}$ is directly proportional to the uncertainty in $\langle(\delta\Delta E_{13})^2\rangle$. From a 56 ps trajectory of $\Delta E_{13}$ in the S model, it was calculated that $\langle(\delta\Delta E_{13})^2\rangle$ was 9.0 (kcal/mol)$^2$. The autocorrelation function of $\Delta E_{13}$ indicated a correlation time of roughly 90 fs. We estimate the square of the expected error in $\lambda_{13}$ to be:

$$\frac{\beta\langle(\delta\Delta E_{13})^2\rangle}{2\epsilon_\infty}\,\frac{1}{N} = \frac{9.0\ (\mathrm{kcal/mol})^2}{2.4\,N\ \mathrm{kcal/mol}}, \qquad (2.27)$$

where $N$ is the number of independent observations for $\langle(\delta\Delta E_{13})^2\rangle$. To estimate $N$, we assume that each correlation time represents an independent observation. Hence, $N = 56000/90$, and the expected error in $\lambda_{13}$ is therefore 0.13 kcal/mol. Calculations for the P and N models also yield estimates on the order of 0.1–0.2 kcal/mol.

However, there seem to be longer timescale motions that render this estimate inaccurate. Figure 2.30 is a plot of the convergence behavior of $\lambda_{13}$ vs timesteps in the simulations of various systems, including the wildtype (wt) system studied in this chapter. Consider $\lambda_{13}$ for the wt system. The value of $\lambda_{13}$ after 20 000 timesteps differs from the final value of $\lambda_{13}$ (at about 65 000 timesteps) by about 0.8 kcal/mol. The graph for the convergence of $\lambda_{13'}$ (Figure 4.3) shows uncertainties more on the order of 1.0 kcal/mol. We therefore use 1 kcal/mol as our estimate of uncertainties in $\lambda$.
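The two ingredients of this estimate, the fluctuation formula for $\lambda_{13}$ and the count of independent observations, can be sketched as follows. This is an illustration under stated assumptions rather than the code used for the thesis: the function names are ours, the values $k_BT = 0.6$ kcal/mol and $\epsilon_\infty = 2$ are inferred from the factor of 2.4 in Equation 2.27, and the gap series is a toy sequence constructed to have a variance of 9.0 (kcal/mol)$^2$. The sketch does not attempt to reproduce the final propagated error, since the text does not spell out that last step.

```python
def reorganization_energy(delta_e, kT=0.6, eps_inf=2.0):
    """lambda = <(dE - <dE>)^2> / (2 * eps_inf * kT), all energies in kcal/mol.

    kT and eps_inf are assumed values (roughly 300 K and a high-frequency
    dielectric constant of 2), chosen so that 2 * eps_inf * kT = 2.4 kcal/mol.
    """
    n = len(delta_e)
    mean = sum(delta_e) / n
    variance = sum((x - mean) ** 2 for x in delta_e) / n
    return variance / (2.0 * eps_inf * kT)


def independent_observations(trajectory_fs, correlation_time_fs):
    """Treat each correlation time as one independent observation."""
    return trajectory_fs / correlation_time_fs


# Toy energy-gap series with mean 0 and variance 9.0 (kcal/mol)^2.
toy_gaps = [-3.0, 3.0] * 50
lam = reorganization_energy(toy_gaps)

# 56 ps trajectory and a 90 fs correlation time, as quoted in the text.
n_independent = independent_observations(56_000.0, 90.0)

print(f"lambda        = {lam:.2f} kcal/mol")
print(f"N independent = {n_independent:.0f}")
```

With these inputs the fluctuation formula gives 3.75 kcal/mol and roughly 620 independent observations. The slow drifts visible in Figure 2.30 are exactly what such a count cannot capture, because it assumes a single short correlation time.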

2.A.4 Vacuum gaps $\Delta E^{(0)}$

Since we do not calculate $\Delta E^{(0)}$ but fit it so that $\Delta G_{13} = -6$ kcal/mol, there is no statistical uncertainty associated with $\Delta E^{(0)}$.

2.A.5 Overall errors in $\Delta G$

These error estimates are best summarized in a table that collects the estimates for the individual terms. We refer to these estimates when stating what conclusions can be drawn from our calculations. The error estimate for $\Delta G$ is based on the assumption that all the errors are uncorrelated. The overall estimates are contained in Table 2.11.

Figure 2.30: The convergence of $\lambda_{13}$. [Plot of $\lambda$ (kcal mol$^{-1}$) versus timesteps (/2.4 fs) for the WT, β, charged double, and neutral double systems.]

Table 2.11: Our error estimates (a)

  ΔE(0)    ΔE(CO)    E(RF)    λ    ΔG (b)
  0        1         1        1    2

(a) All energies are listed in kcal/mol.

(b) The overall error in $\Delta G$ is estimated by assuming that the uncertainties in $\Delta E^{(CO)}$, $E^{(RF)}$ and $\lambda$ are statistically independent. Hence, $\sigma^2_{\Delta G} = \sigma^2_{\Delta E^{(CO)}} + \sigma^2_{E^{(RF)}} + \sigma^2_{\lambda}$, where $\sigma_{\Delta G}$, $\sigma_{\Delta E^{(CO)}}$, $\sigma_{E^{(RF)}}$, and $\sigma_{\lambda}$ are the error estimates for $\Delta G$, $\Delta E^{(CO)}$, $E^{(RF)}$ and $\lambda$, respectively.

Chapter 3

Steffen/Boxer Experiments

3.1 Introduction

In Chapter 2, we calculate the diabatic free energy surfaces relevant to the primary et. These calculations include those of the electric potentials ($\Delta E^{(CO)}$) within the prc and their effect on the energetics of et. In addition to work aimed at measuring the free energy of transfer (Section 1.2), some experiments have been directed at measuring internal electric fields (the spatial derivatives of the electric potentials). Some workers have measured the effects of externally applied electric fields [98, 53] while others have examined internal fields induced by changes in internal charge distributions [54].

In this chapter, we examine an intriguing experiment of the second type, that of Steffen, Lao, and Boxer [54] (hereafter referred to as slb). Measuring the Stark shifts at various chromophores due to the charge shift $P^* \to P^+Q_A^-$, Steffen et al. find that the effective dielectric constants calculated for probe chromophores are higher along the l branch than along the m branch. They suggest that this difference gives rise to greater charge stabilization on the l side, thus promoting electron transfer down l rather than m.

Because slb involves changes in internal electric fields arising from charge separation within the complex, we should, in principle, be able to rationalize the experiment through a slight modification of the techniques given in Chapters 1 and 2. One explanation for the l–m asymmetry arising from our studies in Chapter 2 is the difference in free energy of et along the two branches. How would this explanation be reconciled with that of slb? Differences in $\Delta G$ constitute a fundamental explanation of l–m asymmetry, while slb is not so much a fundamental explanation as a statement of higher level properties of the prc. The question is whether these properties can be derived from the low level specifics of our models for the prc.

In the rest of this chapter, we first describe slb and the theoretical framework and computational methodology we use to interpret slb. We then present our calculations of the effective dielectric constants for the various chromophores.

3.2 Theory and methodology

In this section, we develop a theoretical and computational formalism to analyze slb. The theoretical methodology, specifically the overall interpretive framework as embodied in Equations 3.4, 3.5 and 3.6, was formulated entirely by Michael H. New. The actual computation of the electric fields and most of the error analysis were performed by R. Yee.

The central aim of slb (and, therefore, of this analysis) is to characterize the dielectric response of the prc in terms of local effective dielectric constants. Dielectric response is the redistribution of nuclear and electronic charge in response to a perturbation (in the case of slb, the perturbation is a change in the charge state). The system response can be quantified by a dielectric constant: the ratio of the magnitude of a given dielectric effect in a reference system with no dielectric response (defined as having a constant of unity) to its magnitude in the system of interest. A spatially uniform dielectric constant is appropriate when the response everywhere is the attenuation of the electric potential by that constant. In slb, effective dielectric constants are defined and calculated for the spatially inhomogeneous prc. There are few conventional definitions for a local dielectric constant, but in this context the effective dielectric constant in slb encapsulates the dielectric response as measured at specific locations in the prc.

In slb, the change in charge state is that from $P^*$ to $P^+Q_A^-$. The dielectric effect probed by absorption spectroscopy in slb is the electrochromic shift, the change in the Stark shift due to changes in the electric fields arising from shifting charge. The Stark shift can be written as:

$$\Delta\nu = -|\Delta\boldsymbol{\mu}|\,|\mathbf{E}|\cos\theta, \qquad (3.1)$$

and the electrochromic shift as:

$$\Delta\Delta\nu = \Delta\nu_{P^+Q_A^-} - \Delta\nu_{P^*}. \qquad (3.2)$$

Here, $\mathbf{E}$ is the electric field at the center of the probe chromophore, $\Delta\boldsymbol{\mu}$ is the change in dipole moment upon excitation, and $\theta$ is the angle between them. Steffen et al. determine $\Delta\boldsymbol{\mu}$ experimentally. Moreover, $\Delta\nu_{P^+Q_A^-}$ is the Stark shift when the prc is in the $P^+Q_A^-$ charge state, while $\Delta\nu_{P^*}$ is the shift for the excited state, $P^*$. Hence, following Steffen et al., the effective dielectric constant at a probe chromophore is defined to be the ratio of the electrochromic shift (the change in Stark shift due to the change in charge states) calculated for a reference system with unity dielectric constant ($\Delta\Delta\nu_{\mathrm{calc}}$) to the electrochromic shift measured in the prc ($\Delta\Delta\nu_{\mathrm{obs}}$). In other words, the definition of the effective dielectric constant is:

$$\epsilon_{\mathrm{eff}} = \frac{\Delta\Delta\nu_{\mathrm{calc}}}{\Delta\Delta\nu_{\mathrm{obs}}}. \qquad (3.3)$$

Steffen et al. calculated the difference in electrochromic shift in the reference system by simple electrostatic calculations using the x-ray crystal structure. We emulate the experimental procedure with our molecular dynamics and electrostatics calculations. We also want to calculate $\epsilon_{\mathrm{eff}}$ for a given site $i$:

$$\epsilon_{\mathrm{eff}}(i) = \frac{\Delta\Delta\nu(i)_{\mathrm{calc}}}{\Delta\Delta\nu(i)_{\mathrm{obs}}} = \frac{\Delta\boldsymbol{\mu}(i)_{P^*}\cdot\mathbf{E}^0(i)_{P^+Q_A^-/P^*} - \Delta\boldsymbol{\mu}(i)_{P^*}\cdot\mathbf{E}^0(i)_{P^*/P^*}}{\Delta\boldsymbol{\mu}(i)_{P^+Q_A^-}\cdot\mathbf{E}(i)_{P^+Q_A^-/P^+Q_A^-} - \Delta\boldsymbol{\mu}(i)_{P^*}\cdot\mathbf{E}(i)_{P^*/P^*}}. \qquad (3.4)$$

The subscripts on the dipole moment change vectors (such as $\Delta\boldsymbol{\mu}_{P^*}$ and $\Delta\boldsymbol{\mu}_{P^+Q_A^-}$) indicate the charge configuration for which the vectors are calculated. Similarly, the two subscripts on the electric field vectors indicate first the charge distribution giving rise to the field and then the charge state to which the atomic configuration is equilibrated. The superscript zero (in, for example, $\mathbf{E}^0(i)_{P^+Q_A^-/P^*}$) indicates that this electric field is calculated with a uniform dielectric constant of one.
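Equation 3.4 is a ratio of differences of dot products, and its uniform-dielectric limit is easy to verify numerically. The sketch below is illustrative only: the function and argument names are ours, and the vectors are made-up numbers rather than simulation output. The shortened labels `cs` and `ref` stand for the $P^+Q_A^-$ and $P^*$ states, respectively.

```python
def dot(a, b):
    """Dot product of two 3-vectors given as sequences."""
    return sum(x * y for x, y in zip(a, b))


def effective_dielectric(dmu_ref, dmu_cs, e0_cs_ref, e0_ref_ref, e_cs_cs, e_ref_ref):
    """Evaluate the ratio in Equation 3.4 for one probe chromophore.

    dmu_ref, dmu_cs : dipole change vectors in the P* and P+QA- configurations
    e0_cs_ref       : reference (dielectric 1) field of P+QA- charges, P* geometry
    e0_ref_ref      : reference (dielectric 1) field of P* charges, P* geometry
    e_cs_cs         : full field of P+QA- charges in the P+QA- geometry
    e_ref_ref       : full field of P* charges in the P* geometry
    """
    numerator = dot(dmu_ref, e0_cs_ref) - dot(dmu_ref, e0_ref_ref)
    denominator = dot(dmu_cs, e_cs_cs) - dot(dmu_ref, e_ref_ref)
    return numerator / denominator


# Uniform-dielectric check: every field is attenuated by the same constant
# and the geometry (hence the dipole change vector) does not change.
eps_uniform = 4.0
dmu = (1.0, 0.0, 2.0)
e0_cs = (1.0, 2.0, 3.0)
e0_ref = (0.5, 0.0, 1.0)
e_cs = tuple(c / eps_uniform for c in e0_cs)
e_ref = tuple(c / eps_uniform for c in e0_ref)

eps_eff = effective_dielectric(dmu, dmu, e0_cs, e0_ref, e_cs, e_ref)
print(eps_eff)
```

In this idealized case the ratio returns the uniform constant itself. In the prc the numerator and denominator come from different geometries and field calculations, so nothing forces the result to be greater than one or even positive.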


For the numerator, the electrochromic shift for the rigid reference system arises solely from changes in the internal electric field due to the first order effect of setting up the dipole of $P^+Q_A^-$, without any second order fields arising from other charges' moving in response to $P^+Q_A^-$. Since there is no nuclear reorganization, we calculate the dipole change vector ($\Delta\boldsymbol{\mu}$) and the electric field vectors with respect to the configuration equilibrated to $P^*$. For the denominator, we are interested in the nuclear reorganization in response to the change in charge state; hence, the dipole change vectors and electric fields must be calculated with respect to the appropriate equilibrated configurations, and not just $P^*$ in both cases. In the case of a system with a uniform dielectric constant, it is easy to see that this formula for $\epsilon_{\mathrm{eff}}$ reduces properly to that constant.

We calculate $\epsilon_{\mathrm{eff}}$ for every charge model in the 2:2:80 dielectric boundary condition. For the 2:2 dielectric boundary condition, with the S charge model, we use molecular dynamics to estimate $\epsilon_{\mathrm{eff}}$ as an ensemble average, averaging the various terms (such as $\Delta\boldsymbol{\mu}_{P^*}\cdot\mathbf{E}_{P^+Q_A^-/P^*}$) over an md trajectory:

$$\epsilon_{\mathrm{eff}}^{\mathrm{md}} = \epsilon_\infty\,\frac{\langle\Delta\boldsymbol{\mu}_{P^*}\cdot\mathbf{E}^0_{P^+Q_A^-/P^*}\rangle_{P^*} - \langle\Delta\boldsymbol{\mu}_{P^*}\cdot\mathbf{E}^0_{P^*/P^*}\rangle_{P^*}}{\langle\Delta\boldsymbol{\mu}_{P^+Q_A^-}\cdot\mathbf{E}^0_{P^+Q_A^-/P^+Q_A^-}\rangle_{P^+Q_A^-} - \langle\Delta\boldsymbol{\mu}_{P^*}\cdot\mathbf{E}^0_{P^*/P^*}\rangle_{P^*}}. \qquad (3.5)$$

The electric fields were calculated by summing over Coulomb's law. For the other dielectric boundary conditions (2:2:80 and 2:80), we resorted to delphi to solve the Poisson-Boltzmann equation for the electric field:

$$\epsilon_{\mathrm{eff}}^{\mathrm{delphi}} = 2\,\frac{\Delta\boldsymbol{\mu}_{P^*}\cdot\mathbf{E}^{2:2}_{P^+Q_A^-/P^*} - \Delta\boldsymbol{\mu}_{P^*}\cdot\mathbf{E}^{2:2}_{P^*/P^*}}{\Delta\boldsymbol{\mu}_{P^+Q_A^-}\cdot\mathbf{E}_{P^+Q_A^-/P^+Q_A^-} - \Delta\boldsymbol{\mu}_{P^*}\cdot\mathbf{E}_{P^*/P^*}}. \qquad (3.6)$$
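Summing Coulomb's law over point charges, as done for the md fields, can be sketched as below. This is a minimal illustration and not the orac or delphi implementation: the function name, the charge list format, and the unit convention (charges in units of e, distances in Å, Coulomb constant of about 332.06 kcal Å mol$^{-1}$ e$^{-2}$) are our assumptions.

```python
import math

# Coulomb constant in kcal * Angstrom / (mol * e^2); assumed unit convention.
COULOMB_KCAL = 332.06


def coulomb_field(point, charges, eps=1.0):
    """Electric field at `point` from point charges in a uniform dielectric.

    charges : iterable of (q, (x, y, z)) with q in units of e, positions in Angstrom
    eps     : uniform dielectric constant (1.0 gives the reference field E^0)
    Returns the field components in kcal / (mol * e * Angstrom).
    """
    field = [0.0, 0.0, 0.0]
    for q, position in charges:
        r = [p - s for p, s in zip(point, position)]
        distance = math.sqrt(sum(c * c for c in r))
        if distance == 0.0:
            raise ValueError("field evaluated on top of a point charge")
        scale = COULOMB_KCAL * q / (eps * distance ** 3)
        for k in range(3):
            field[k] += scale * r[k]
    return field


# A single unit charge at the origin, probed 2 Angstrom away along x.
unit_charge = [(1.0, (0.0, 0.0, 0.0))]
e0 = coulomb_field((2.0, 0.0, 0.0), unit_charge, eps=1.0)
e22 = coulomb_field((2.0, 0.0, 0.0), unit_charge, eps=2.0)
print(e0, e22)
```

In a uniform dielectric of 2 every field component is halved relative to the reference field, which is consistent with the prefactor of 2 that converts the 2:2 fields in Equation 3.6 back to a unit-dielectric reference.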

To reduce the amount of computer time involved, we approximated the ensemble average by calculating $\epsilon_{\mathrm{eff}}$ based on the electric fields and dipole change vector for the average spatial configuration. (See below for a discussion of this approximation.)

Values for the dipole moment change vectors for each of the four probed chromophores are needed. The magnitude of $\Delta\boldsymbol{\mu}$ was measured by Steffen et al., and the direction of $\Delta\boldsymbol{\mu}$ in a molecular reference frame was determined. For each chromophore, the direction of $\Delta\boldsymbol{\mu}$ was found by Steffen et al. to be $\zeta_A$ degrees from the vector connecting the nitrogens of the a and c rings of the macrocycle. For each chromophore, and for each configuration, this vector was computed, which we call $\hat{\mathbf{p}}$. Consider also the vector $\hat{\mathbf{p}}_\perp$, which is perpendicular to $\hat{\mathbf{p}}$ but in the plane of the macrocycle. For each chromophore then, the direction of the vector $\Delta\boldsymbol{\mu}$ was computed as

$$\widehat{\Delta\boldsymbol{\mu}} = \cos(\zeta)\,\hat{\mathbf{p}} + \sin(\zeta)\,\hat{\mathbf{p}}_\perp. \qquad (3.7)$$

Table 3.1: Local dielectric constants from calculations on slb

  model         BL      BM      HL      HM
  experiment    4.7     1.5     4.5     1.6
  orac          2.9    -0.9     2.4     1.7
  S2:2:80       0.4     0.8     2.2     1.3
  P2:2:80       2.2     0.8     2.9     2.1
  N2:2:80       2.2     1.3     2.0     5.9

3.3 Results and discussion

To calculate the effective dielectric constants listed in Table 3.1, fifty configurations, stored every 80 fs from a 4 ps orac production run, were used. Electric fields were calculated from a configuration whose atomic positions are averages of the positions of these fifty configurations. The production run commenced after a 10 ps equilibration run. A uniform dielectric constant of 2.0 was used for the orac computations.

Unfortunately, the calculated $\epsilon_{\mathrm{eff}}$ do not follow any easily discernible pattern. Indeed, values that are less than unity (such as those for $B_L$ and $B_M$ in S2:2:80) are actually unphysical. Hence, there is no point in comparing these values to the experimental $\epsilon_{\mathrm{eff}}$. This lack of realism indicates some fundamental problems with the methodology. One probably egregious approximation is the use of electric fields from averaged configurations (e.g., $\Delta\boldsymbol{\mu}_{P^+Q_A^-}\cdot\mathbf{E}_{P^+Q_A^-/P^+Q_A^-}$), rather than averaging the relevant quantities (such as $\langle\Delta\boldsymbol{\mu}_{P^+Q_A^-}\cdot\mathbf{E}^0_{P^+Q_A^-/P^+Q_A^-}\rangle_{P^+Q_A^-}$). We examine this possibility by calculating the four terms of $\epsilon_{\mathrm{eff}}$ on the right-hand side of Equation 3.4 as ensemble averages for one particular model, S2:2:80. Specifically, for twenty of the fifty configurations, we calculated the electric fields at the probe chromophores using delphi, which we then used to calculate $\epsilon_{\mathrm{eff}}$. The results are given in Table 3.2. Figure 3.1 shows the rate of convergence of $\epsilon_{\mathrm{eff}}$ for the various probe chromophores.

Table 3.2: Refined calculation of $\epsilon_{\mathrm{eff}}$ for S2:2:80 (a)

  chromophore    ε_eff
  BL             0.56
  HL             2.4
  BM             2.6
  HM             2.3

(a) 20 configurations averaged.

Figure 3.1: Convergence of $\epsilon_{\mathrm{eff}}$ for the probe chromophores. [Plot of $\epsilon_{\mathrm{eff}}$ versus number of configurations averaged, with curves for BL, HL, BM, and HM.]

When examining Tables 3.1 and 3.2 and Figure 3.1, we see a major difference between calculations of $\epsilon_{\mathrm{eff}}$ based on an averaged configuration and the more computationally expensive evaluation of the ensemble average. For instance, Figure 3.1 shows negative values of $\epsilon_{\mathrm{eff}}^{B_M}$ (that is, $\epsilon_{\mathrm{eff}}$ for $B_M$) when few configurations are averaged. However, negative values of $\epsilon_{\mathrm{eff}}^{B_M}$ disappear with more averaging.

We note two trends for $\epsilon_{\mathrm{eff}}$ as presented by slb. First, $\epsilon_{\mathrm{eff}}$ are consistent within the branches; $\epsilon_{\mathrm{eff}}^{B_L}$ and $\epsilon_{\mathrm{eff}}^{H_L}$ are 4.7 and 4.5, and $\epsilon_{\mathrm{eff}}^{B_M}$ and $\epsilon_{\mathrm{eff}}^{H_M}$ are 1.5 and 1.6, respectively. Second, $\epsilon_{\mathrm{eff}}$ of the l chromophores are significantly higher than $\epsilon_{\mathrm{eff}}$ for the m chromophores.

When we turn to studying our calculations of $\epsilon_{\mathrm{eff}}$, we see that more calculations would be ideal, primarily to ensure that $\epsilon_{\mathrm{eff}}^{B_M}$ and $\epsilon_{\mathrm{eff}}^{H_M}$ have actually converged to their values around 2.5. Nevertheless, it seems clear that further calculations would not result in $\epsilon_{\mathrm{eff}}$ that match those of slb. First, comparing $\epsilon_{\mathrm{eff}}$ for the l chromophores, which have converged statistically, we see that $\epsilon_{\mathrm{eff}}^{B_L}$ is not close to $\epsilon_{\mathrm{eff}}^{H_L}$ and that neither is comparable to the value of about 4.6 found by slb. Second, although $\epsilon_{\mathrm{eff}}$ for the two m chromophores are roughly equal, they do not seem to be converging to the slb value of 1.5–1.6. Finally, the major conclusion of slb, that $\epsilon_{\mathrm{eff}}$ is greater for the l branch chromophores than for the m branch, is not borne out in our calculations.
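The convergence behavior discussed above, including the transient negative values, follows from how a ratio of averages behaves when only a few configurations have been accumulated. The sketch below is a toy illustration with invented numbers, not our simulation data: the function name is ours, and each configuration is assumed to contribute one numerator term and one denominator term of Equation 3.4.

```python
def running_eps_eff(numerators, denominators):
    """Cumulative estimate of eps_eff as a ratio of averaged terms.

    After k configurations the estimate is mean(numerators[:k]) divided by
    mean(denominators[:k]); the factor 1/k cancels, so running sums suffice.
    """
    estimates = []
    numerator_sum = 0.0
    denominator_sum = 0.0
    for num, den in zip(numerators, denominators):
        numerator_sum += num
        denominator_sum += den
        if denominator_sum == 0.0:
            estimates.append(float("nan"))  # ratio undefined at this point
        else:
            estimates.append(numerator_sum / denominator_sum)
    return estimates


# Toy per-configuration terms: the first denominator happens to be negative.
toy_numerators = [2.0, 2.0, 2.0, 2.0]
toy_denominators = [-1.0, 3.0, 1.0, 1.0]
trace = running_eps_eff(toy_numerators, toy_denominators)
print(trace)
```

A single unfavorable configuration is enough to give a negative estimate early on, and the estimate settles only once the averaged denominator is well determined. This is one reason why a value computed from a single averaged configuration can differ so much from the ensemble average.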

3.4 Conclusions

We now draw a number of conclusions from our calculations. First, there is a striking difference between $\epsilon_{\mathrm{eff}}^{B_L}$, which is somewhat less than one, and $\epsilon_{\mathrm{eff}}$ for the other three chromophores, which are all about 2.5. Without a much closer examination of the configurations involved, one is left to speculate on the reasons for this contrast. One possible rationale is suggested by results in Section 2.3.5. The electrostatic potential at $B_L$ differs from that at the other chromophores in being uniquely sensitive to the orientation of one particular amino acid residue, TyrM208. Perhaps the calculation of $\epsilon_{\mathrm{eff}}^{B_L}$ is dominated by the dielectric response of this one residue, obscuring the dielectric response of the entire prc complex. In contrast to $\epsilon_{\mathrm{eff}}^{B_L}$, results for $\epsilon_{\mathrm{eff}}$ for the other three chromophores are not unexpected. In fact, effective dielectric constants calculated at various amino acid residues for their contribution to $\Delta E^{(CO)}_{32}$ (see Section A.2.3, particularly Table A.3) range from 2.2 to 2.9, not unlike the 2.3 to 2.6 for $\epsilon_{\mathrm{eff}}$ for the three chromophores. Although it remains to be demonstrated that TyrM208 is indeed responsible for the unusual dielectric constant for $B_L$, we will see in Chapter 5 (Section 5.3) that TyrM208 is demonstrably unique in its relationship to $B_L$. No other residue affects the other chromophores as strongly as TyrM208 affects $B_L$.

A number of possible sources exist for the discrepancy between the $\epsilon_{\mathrm{eff}}$ calculated for S2:2:80 and those from slb. First, the discrepancy may indicate that the S model is not the most experimentally accurate of the various charge models; $\epsilon_{\mathrm{eff}}$ calculated for a model in which certain amino acids are neutralized (but not necessarily those specified in the P model) may be in greater accord with the values in slb. Perhaps it is the neutralization of particular amino acids that leads to a strong asymmetry in $\epsilon_{\mathrm{eff}}$ as seen in slb, but not observed for the S2:2:80 model. Another possible explanation for the lower $\epsilon_{\mathrm{eff}}$ found for the S2:2:80 model as compared to slb is that the 2:2:80 model encompasses insufficient dielectric response. Calculating $\epsilon_{\mathrm{eff}}$ using a model such as the 2:80 model would help clarify this issue. Furthermore, the strong l–m asymmetry in the slb observations may actually reflect dielectric inhomogeneities that are not directly represented in our models. To address this issue, a relevant calculation would be for Warshel, Parson, et al. to mimic our calculation of $\epsilon_{\mathrm{eff}}$ within their pdld and/or fep/md methodology, which does involve “mobile waters.”

Chapter 4

Mutation Experiments

4.1 Introduction

Site-directed mutagenesis has been a popular method to probe structure–function relationships in the prc. Alterations in the structure are introduced, and the resultant changes in the kinetics or energetics of et are measured to study various questions about the prc. Does the primary transfer speed up or slow down? Is primary et down the l branch still favored over that down the m branch? If the mutated residue radically alters the speed or directionality of et, then that residue is identified as important to the function of the prc. Through trial and error, experimentalists have attempted to map out the “significant” vs “insignificant” parts of the prc. However, despite the large number of mutants of the prc that have been experimentally characterized, no conclusive identification of the essential portion of the prc has been accomplished, either for the l–m asymmetry or for the quick transfer.

A complementary approach is a computer simulation perspective on site-directed mutagenesis. There are at least two ways to look at the use of computer modeling in this context. The first is explanatory. In theory, one can simulate (to some degree of accuracy) mutations of the prc by adapting the methods we have used to study the wt systems, although the actual practice may be complex. The question is whether computer simulation would be able to simulate these mutants to the desired accuracy. Another use of computer simulation is its potential predictive power. In Chapter 2, we speculate that the large contributors to $\Delta E^{(CO)}$ may be residues that, when mutated, would result in dramatic functional changes in the prc. With the computer simulation techniques deployed in our work, it is simple to identify such large contributors. It then becomes the work of experimentalists to mutate these residues to test the validity of this putatively predictive approach.

This chapter focuses primarily on the first of the possible uses of computer mutation experiments, that of explication. Here, we have chosen to simulate what we dub the “Heller” double mutant, as well as a set of related systems. The double mutant is the first mutant of any prc that seems to show m side electron transfer. Our hope is two-fold: 1) by selecting a mutant whose function differs dramatically from the wt system, we may have a better chance of actually simulating such a large effect, and 2) a proper simulation of these mutants may provide some insight into the structural basis of the l–m asymmetry of the wt system and how this asymmetry is different in the double mutant.

In this chapter, we first describe the experimental work done on the double mutant and its related systems, sketch the interpretation given by the authors for the function of this remarkable mutant, and then contrast it to other possible explanations (that are in greater accord with the fundamental emphasis that we have put on the role played by the protein matrix over the chromophores). A description of our simulations follows, along with the results.

4.2 The Heller double mutant and allied systems

Of the many mutation experiments that we can study from our computational perspective, we choose to study a mutation performed by Heller, Holten, and Kirmaier [63]. In their paper, they compare the kinetics of three different systems: the wt Rb. capsulatus system, the so-called β mutant, and a system to which we will refer as the “double mutant.” The β mutant [99] involves replacing an amino acid and results in the replacement of $H_L$, the l side bacteriopheophytin, by a bacteriochlorophyll. This transformation is effected by the mutation L(M212)H in Rb. capsulatus. The “double mutant” involves two mutations from the wild-type system: the β mutant (L(M212)H and the consequent chromophore transformation) and G(M201)D. Heller and coworkers conducted transient absorption spectroscopy on these various mutants, from which they worked out the following kinetic scheme (the results for the wt system were derived from other work):

• In the wt system, transfer from $P^*$ to $P^+H_L^-$ occurs on a 3 ps timescale, while the secondary transfer to $P^+Q_A^-$ is on a 200 ps timescale. Population transfer down the l branch is 100%.

• In the β mutant, transfer from $P^*$ to $P^+I^-$ (what Heller et al. claim to be most likely a quantum or thermal mixture of $P^+B_L^-$ and $P^+\beta^-$) is on a 6 ps timescale. ($P^+\beta^-$ denotes the charge separated state with the electron on the bacteriochlorophyll that replaces $H_L$ in the β mutant.) Subsequent transfer of the electron population is split between two states: 65% of the population goes to $P^+Q_A^-$ on a 500 ps timescale, while the remaining 35% returns directly to the ground state ($P$) on a 900 ps timescale.

• Finally, in the double mutant, the rate for the primary transfer (which involves 70% of the electronic population) from $P^*$ to $P^+I^-$ slows to 21 ps, while the secondary transfer (from $P^+I^-$ to $P^+Q_A^-$) returns to the rate typical of the wildtype system (170 ps). The remaining 30% of the electronic population is split between two routes: 15% decays to the ground state ($P$) from $P^*$ on a 100 ps timescale, while the other 15% is deduced to be transferred down the m branch on the same timescale (100 ps).
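The percentages and timescales in this scheme are mutually consistent if the quoted timescales are read as time constants of parallel first-order channels leaving a common state. The sketch below makes that reading explicit; it is our illustration rather than the analysis of Heller et al., and the function name and dictionary labels are hypothetical.

```python
def branching_yields(time_constants_ps):
    """Yields and overall lifetime for competing first-order decay channels.

    time_constants_ps maps a channel label to its time constant (1/k) in ps.
    Each yield is k_i / sum(k); the lifetime of the decaying state is 1 / sum(k).
    """
    rates = {name: 1.0 / tau for name, tau in time_constants_ps.items()}
    total_rate = sum(rates.values())
    yields = {name: k / total_rate for name, k in rates.items()}
    return yields, 1.0 / total_rate


# beta mutant: decay of P+I- to P+QA- (500 ps) or to the ground state (900 ps).
beta_yields, beta_lifetime = branching_yields(
    {"P+QA-": 500.0, "ground state": 900.0}
)

# Double mutant: decay of P* to P+I- (21 ps), ground state (100 ps), m branch (100 ps).
double_yields, double_lifetime = branching_yields(
    {"P+I-": 21.0, "ground state": 100.0, "m branch": 100.0}
)

for label, result in (("beta", beta_yields), ("double", double_yields)):
    print(label, {name: round(100.0 * y) for name, y in result.items()})
```

Under this assumption the β mutant channels split roughly 64% to 36%, and the double mutant channels split roughly 70%, 15%, and 15%, in line with the populations listed above.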

4.3 Computer simulation of the mutants

The approach that we have taken in studying the function of the mutant systems is similar to the one we use in Chapter 2 to study the wild-type system: we determine and compare the diabatic free energy surfaces for the three different systems under consideration: the wild-type system, the β mutant, and the “double mutant.” In addition to calculating the surfaces for the states that we have already been examining (states 1 ($P^*$), 2 ($P^+B_L^-$), 3 ($P^+H_L^-$), 2′ ($P^+B_M^-$), and 3′ ($P^+H_M^-$)), we need also to look at two other states: state 0 ($P$, the ground state) and state 4 ($P^+Q_A^-$). From these diabatic free energy surfaces, we can then estimate the kinetics of electron transfer. The question is then whether we will match the kinetics observed by Heller et al. (described in Section 4.2).

The basic premise of the methodology used here is the same as for the simulations of the wt systems: use a combination of md and delphi to calculate the relevant $\lambda$ and $\Delta G$. However, we need to modify the basic procedure for two reasons: 1) we do not have direct crystallographic structures for the mutants and 2) different chromophores are involved in the mutants from those in the wt system. To deal with the first problem, we approximate the structure through computer mutation methods. To deal with the second issue, we develop an approximate calculational strategy that focuses on calculating changes in free energies rather than absolute free energies of electron transfer. Details of how we simulate the structure of the mutants are given in Section 4.3.1, while the calculation of the diabatic surfaces is described in Section 4.3.2.
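One common route from diabatic surfaces to kinetics, assumed here purely for illustration since the text does not specify the rate expression at this point, is to treat the two surfaces as parabolas of equal curvature. The barrier at their crossing is then $(\Delta G + \lambda)^2 / 4\lambda$. The function names below are ours, and $k_BT = 0.6$ kcal/mol is an assumed room-temperature value.

```python
import math


def crossing_barrier(delta_g, lam):
    """Barrier at the crossing of two equal-curvature parabolic surfaces.

    delta_g : free energy of reaction (kcal/mol)
    lam     : reorganization energy (kcal/mol), must be positive
    """
    if lam <= 0.0:
        raise ValueError("reorganization energy must be positive")
    return (delta_g + lam) ** 2 / (4.0 * lam)


def boltzmann_factor(barrier, kT=0.6):
    """Relative thermal weight exp(-barrier / kT); kT is an assumed value."""
    return math.exp(-barrier / kT)


# wt values quoted in the text: Delta G13 = -6 kcal/mol, lambda13 = 4.7 kcal/mol.
barrier_wt = crossing_barrier(delta_g=-6.0, lam=4.7)
print(f"barrier = {barrier_wt:.3f} kcal/mol, "
      f"weight = {boltzmann_factor(barrier_wt):.2f}")
```

With these wt values the barrier is below 0.1 kcal/mol, so small changes in $\Delta G$ or $\lambda$ between mutants shift the crossing point only modestly compared with the estimated uncertainties of 1–2 kcal/mol.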

4.3.1 Simulating mutations

In this section, we describe the details of the computer substitution and equilibration algorithm used. Unlike the case for the wt system, no solved X-ray crystal structure exists for either the β mutant or the Heller double mutant. To calculate the diabatic free energy surfaces for these two mutants, we need their atomic structure in order to use molecular dynamics and to calculate the electrostatic energies of charge states. In this study, we computationally derive the structures, performing “computational site-directed mutagenesis.” The approach we take here consists basically of two steps: 1) doing a direct substitution of the relevant atoms in a non-mutated structure and 2) letting this new structure relax to a new equilibrium structure. A complication that needs to be mentioned here is that the experimental mutations were actually performed on Rb. capsulatus, not on Rps. viridis, the species under study. In this study, we assume that the relevant functional properties of these mutants in Rb. capsulatus carry over to Rps. viridis, as do a number of other properties.

The β mutant involves two structural substitutions of the wt system. The first is L(M212)H. This replacement of a leucine by a histidine was performed by the amino acid residue substitution algorithm in insight. The second substitution is the replacement of $H_L$ by a bacteriochlorophyll. In our atomistic model, this translates specifically into a gain of a magnesium atom and the loss of two hydrogens. To perform this substitution, we literally relabelled $H_L$ in the pdb file as a bacteriochlorophyll, removed the two now-extra hydrogens, and added a magnesium positioned at the center of mass of the ring. We then equilibrated this structure through orac molecular dynamics. (It turned out that this md run had to be conducted with relatively small time steps at first to allow closely positioned atoms in the mutated structure to relax to more distant equilibrium positions.)

The structure for the double mutant was constructed in a similar manner. We performed the substitution G(M201)D (glycine to aspartate) on the β mutant by operating insight on the pre-equilibrated β structure. Equilibration was then conducted by orac molecular dynamics. Because the charge state of the new aspartate is not experimentally known, we modeled both a double mutant in which the aspartate is charged (which we dub the “charged double” mutant) and one in which it is neutralized (the “neutral double” mutant).

4.3.2 Calculating the diabatic surfaces for mutants

We calculate the diabatic free energy surfaces for three different systems: the wt system, the β mutant and the double mutant. We assume that the states of interest are $P^*$, $P^+H_L^-$, $P^+Q_A^-$, and $P$ along l branch et, and $P^*$ and $P^+H_M^-$ for m branch et. That is, we explore mechanisms that do not involve $P^+B_L^-$ and determine whether such mechanisms adequately account for experimental observations. We describe et among these states with one collective coordinate for each branch (namely, the collective coordinates, $E_\parallel$ and $E_\perp$, used in Chapter 2). With this approach, we are therefore interested in the various free energies ($\Delta G$) and reorganization energies ($\lambda$) that characterize the relevant diabatic surfaces (see Figure 4.1). Instead of directly evaluating $\Delta G$ for each system of interest, we estimate the incremental changes in the free energies ($\Delta\Delta G$), first from the wt system to the β mutant, then from the β mutant to the double mutant. Below, we give the specific details of how we calculate $\Delta\Delta G$.

Figure 4.1: A schematic diagram of the energies of interest. [Panels (a) l branch and (b) m branch, showing the diabatic surfaces for $P^*$, $P^+H_L^-$, $P^+Q_A^-$, $P$, and $P^+H_M^-$, labelled with $\Delta G_{13}$, $\Delta G_{13'}$, $\Delta G_{34}$, and $\Delta G_{10}$.]

We consider first the wt system. In Chapter 2, we have already calculated diabatic free energy surfaces for $P^*$, $P^+B_L^-$, $P^+H_L^-$, $P^+B_M^-$, and $P^+H_M^-$. Besides working with the charge states whose energetics have already been calculated in Chapter 2, we consider two other states: the ground state ($P$) and $P^+Q_A^-$. The energy difference between the ground state and the excited state ($\Delta G_{10}$) has been experimentally measured to be -32 kcal/mol [15], while $\Delta G_{34} = -14$ kcal/mol [17, 15].

Once the relevant $\Delta G$ and $\lambda$ for the wt system are known, the challenge then becomes how to calculate the free energies for the β mutant, in which the l side bacteriopheophytin is replaced by a bacteriochlorophyll (accompanied by L(M212)H to generate the ligating histidine). We follow the algorithm in Section 4.3.1 to determine the structure for the β mutant. Presumably, the structural alterations have the most profound effect on the surface for $P^+H_L^-$. In this rough model, we use experimental estimates for $\Delta G_{13}$. Kirmaier et al. [99] argue that the primary et in the β mutant is the process $P^* \to P^+I^-$. When Kirmaier et al. examined various possible identities for $P^+I^-$ ($P^+\beta^-$, $P^+B_L^-$, or a mixture of the two), they concluded that none of these possibilities provides a straightforward explanation of the experimental data [46]. However, despite uncertainties, Kirmaier et al. concluded that $P^+\beta^-$ and $P^+B_L^-$ are close in energy in the mutants and that $P^+B_L^-$ must be close in energy to $P^*$, in both the wt and β mutant [47]. Moreover, they estimated that in a model in which $\Delta G_{13} = -6$ kcal/mol, $P^+I^-$ lies 1.7 kcal/mol below $P^*$ [47]. In our calculations for the β mutant, we use this estimate for the energy of $P^+I^-$ as our estimate for $\Delta G_{13}$ in the β mutant.

To estimate the change in the free energy of other states of interest due to the mutation ($\Delta\Delta G$), we calculate the change in the electrostatic interactions ($\Delta\Delta E^{(CO)}$) coming from the different chromophore and amino acid in the β mutant. (Normally, in order to calculate $\Delta E^{(CO)}$, we exclude the chromophores of the core system. Here, we include the electrostatic contributions of $H_L$ (in the wt system) and the new bacteriochlorophyll (in the β mutant), as well as those of the rest of the “environment.”) Specifically, we calculated $\Delta\Delta E^{(CO)}_{13}$ to be -1.4 kcal/mol (with $\Delta E^{(CO)}_{13}$ in the wt system and β mutant being -10.6 kcal/mol and -12.1 kcal/mol, respectively) and $\Delta\Delta E^{(CO)}_{14}$ to be -8.2 kcal/mol (recall from Table 1.1 that state 4 denotes $P^+Q_A^-$).

Finally, in order to calculate the diabatic free energy surfaces, the reorganization energies of transfer ($\lambda_{13}$ and $\lambda_{13'}$) must be calculated. As it turns out, calculating these numbers is computationally expensive, and convergence to a definitive value is slow. Tables 4.1 and 4.2 contain the calculated values for $\lambda_{13}$ and $\lambda_{13'}$, respectively. Figures 4.2 and 4.3 display the cumulative estimates of $\lambda_{13}$ and $\lambda_{13'}$, respectively. Note that $\lambda_{13}$ for the wt, the β mutant, and the neutral double mutant converge to essentially the same value of approximately 4.7 kcal/mol, while $\lambda_{13}$ for the charged double mutant hovers around 4.2 kcal/mol. In contrast, the convergence of $\lambda_{13'}$ is slower. Moreover, $\lambda_{13'}$ is larger than the $\lambda_{13}$ of the corresponding model.

CHAPTER 4. MUTATION EXPERIMENTS

Table 4.1: λ13 for the four systems

System              λ13 (kcal/mol)
wt                  4.7
β mutant            4.6
double (charged)    4.2
double (neutral)    4.7

Table 4.2: λ13′ for the four systems

System              λ13′ (kcal/mol)
wt                  5.8
β mutant            6.8
double (charged)    5.1
double (neutral)    5.1

Figure 4.2: The convergence of λ13. [Plot of the cumulative estimate of λ (kcal/mol) against timesteps (/2.4 fs), from 0 to 80000, for the WT, β, charged double, and neutral double systems.]

Figure 4.3: The convergence of λ13′. [Plot of the cumulative estimate of λ (kcal/mol) against timesteps (/2.4 fs), from 0 to 80000, for the WT, β, charged double, and neutral double systems.]

Table 4.3: ∆∆E^(CO) for β mutant → double mutants (a)

                  ∆E^(CO)_1i (b)                        ∆∆E^(CO)_1i (c)
state i           β        double (c)   double (n)     β → double (c)   β → double (n)
3 (P⁺H_L⁻)        -18.2    -21.1        -15.9          -2.9             2.3
3′ (P⁺H_M⁻)       -11.7    -15.6        -10.3          -3.8             1.5
4 (P⁺Q_A⁻)        -33.4    -39.8        -34.0          -6.4             -0.6

(a) “Double (c)” denotes the charged double mutant; “double (n)” denotes the neutral double mutant.
(b) Given in kcal/mol.
(c) Given in kcal/mol.

To calculate ∆G for the double mutant, we look at changes from the β mutant. Recall that the double mutant differs from the β mutant by one amino acid substitution (G (M201) D). We take an approach similar to the one used for the β mutant in calculating the diabatic surfaces for the double mutant: we calculate ∆∆E^(CO) (from the β mutant to the double mutant) and λ for the double mutant. (In contrast to the change from the wt system to the β mutant, no changes in the chromophore (or core) system are involved.) There is one final complication: the charge state of the new aspartate is not known. Hence, we have to run simulations on a double mutant with a charged aspartate (our “charged double” mutant) and with a neutralized aspartate (our “neutral double” mutant). Table 4.3 presents the results for ∆∆E^(CO); Table 4.4, the calculated ∆∆G and ∆λ for the wt, β mutant, and double mutants; Table 4.5, the net ∆G and λ for the systems under consideration.

4.4 Results and discussion

Even though we have gone to some effort to calculate various energies (or changes in energies) for the wt, β mutant, and Heller double mutants, it is important to remember that we can, at best, present only a qualitative description of the various systems. These calculations are meant more to suggest possible mechanisms at work that can be verified (or discounted) with more accurate modeling. Now that we have tabulated λ13, λ13′, ∆G13, ∆G10, ∆G13′, and ∆G14 (Table 4.5), we are interested in comparing the kinetics implied by these energies with experimental measurements. In particular, we are interested in studying the kinetic changes moving from wt → the β mutant and then from the β mutant → the double mutant. As we analyze each transition, we note how much ∆G and λ would have to change in order to match experimental kinetics and whether such changes are within experimental errors.

Table 4.4: Calculated ∆∆G and ∆λ for the systems (a)

            wt → β mutant   β mutant → double (c) (b)   β mutant → double (n) (b)
∆λ13        -0.1            -0.4                        +0.1
∆∆G13       +4.3            -2.5                        +2.2
∆∆G14       -8.1            -6.0                        -0.9
∆λ13′       +1.0            -1.7                        -1.7
∆∆G13′      -2.4            -2.1                        +3.2

(a) Energies given in kcal/mol.
(b) “Double (c)” denotes the charged double mutant; “double (n)” denotes the neutral double mutant.

Table 4.5: Calculated ∆G and λ for the systems (a)

            wt       β mutant   double (c) (b)   double (n) (b)
λ13         4.7      4.6        4.2              4.7
∆G13        -6.0     -1.7       -4.2             0.5
∆G14        -20.0    -28.1      -34.1            -29.0
∆G10        -32.0    -32.0      -32.0            -32.0
λ13′        5.8      6.8        5.1              5.1
∆G13′       2.9      0.5        -1.6             3.7

(a) Energies given in kcal/mol.
(b) “Double (c)” denotes the charged double mutant; “double (n)” denotes the neutral double mutant.

We first examine the estimated free energies of et for the wt system. For the diabatic surfaces of the wt system, the l surfaces give serial activationless electron transfer. That ∆G13′ is 2.9 kcal/mol suggests that transfer down the m branch (P* → P⁺H_M⁻) might actually be endothermic, as argued in mgcn. Primary et proceeds down the l side rather than the m side because ∆G13 < ∆G13′. Here we assume the golden rule expression for the rate of et (Equation 1.9) between a donor state (D) and an acceptor state (A):

\[
k_{DA} = \frac{2\pi K_{DA}^{2}}{\hbar}\,\frac{\exp(-\beta F^{*}_{DA})}{\sqrt{4\pi\beta^{-1}\lambda_{DA}}}, \tag{4.1}
\]

where β = 1/(k_B T) and F*_DA is the activation barrier to et (given in Equation 1.8 and repeated here):

\[
F^{*}_{DA} = \frac{(\Delta G_{DA} + \lambda_{DA})^{2}}{4\lambda_{DA}}. \tag{4.2}
\]

If we plug in ∆G13 and λ13 (listed in Table 4.5), fit K13, the electronic coupling between D and A, so that k13⁻¹ = 3 ps, and assume the same K for et between P* and P⁺H_M⁻ (i.e., that K13 = K13′), we arrive at k13′⁻¹ = 710 ps. There is no experimental measurement for k13′. However, the ratio of 710 : 3 for k13/k13′ is high enough to concur with the experimental result that electronic population traveling from P* to P⁺H_L⁻ is greater than population traveling to P⁺H_M⁻ by at least 200 : 1 [31].
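The fitting procedure just described can be sketched in a few lines of Python. This is a minimal illustration rather than the code used for the dissertation: the function names are hypothetical, and the thermal energy k_B T = 0.592 kcal/mol (roughly 298 K) is an assumption, since the temperature is not stated in this section. Because the coupling K is fit to the wt l-branch time constant, only the ratio of the remaining factors in Equation 4.1 matters.

```python
import math

KB_T = 0.592  # k_B*T in kcal/mol near 298 K (assumed; the temperature is not stated here)


def activation_barrier(delta_g, lam):
    """Equation 4.2: F* = (dG + lambda)^2 / (4*lambda), energies in kcal/mol."""
    return (delta_g + lam) ** 2 / (4.0 * lam)


def relative_rate(delta_g, lam, kbt=KB_T):
    """Equation 4.1 with the constant factor 2*pi*K^2/hbar left out."""
    barrier = activation_barrier(delta_g, lam)
    return math.exp(-barrier / kbt) / math.sqrt(4.0 * math.pi * kbt * lam)


def time_constant_ps(delta_g, lam, ref=(-6.0, 4.7, 3.0)):
    """Return 1/k in ps, with the coupling fixed by a reference (dG, lambda, 1/k in ps)."""
    ref_delta_g, ref_lam, ref_time = ref
    return ref_time * relative_rate(ref_delta_g, ref_lam) / relative_rate(delta_g, lam)


tau_l = time_constant_ps(-6.0, 4.7)  # wt, l branch: 3 ps by construction
tau_m = time_constant_ps(2.9, 5.8)   # wt, m branch, same coupling assumed
print(f"l branch: {tau_l:.1f} ps, m branch: {tau_m:.0f} ps, ratio: {tau_m / tau_l:.0f}")
```

With the Table 4.5 values for the wt system, the m-branch time constant comes out near 700 ps under these assumptions, in line with the 710 ps quoted above; the exact figure depends on the assumed temperature.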

We turn next to the diabatic surfaces calculated for the β mutant. First of all, while we find an insignificant change in λ13 in the transformation from wt → β mutant (4.7 to 4.6 kcal/mol), λ13′ does jump by 1 kcal/mol (5.8 to 6.8 kcal/mol). Because ∆G13 is set to the experimental estimate of -1.7 kcal/mol for the β mutant, primary et is still calculated to be roughly activationless from our estimates. The experimental finding is that et slows down from 3 ps in wt to 6 ps in the β mutant. With the calculated λ13, the experimental value of ∆G13, and the same K13 as in the wt system, we arrive at k13⁻¹ = 5.6 ps, in good agreement with the experimental value. For P⁺Q_A⁻, the calculated ∆∆E^(CO)_14 of -8.2 kcal/mol lowers ∆G34 dramatically, increasing the activation barrier from 0.0 to 8.0 kcal/mol and making transfer from P⁺H_L⁻ to P⁺Q_A⁻ slower. Although secondary transfer is seen to slow from 200 ps in wt to 500 ps in the β mutant, our calculations imply a secondary transfer that is several orders of magnitude slower. On the m branch, ∆G13′ drops from 2.9 kcal/mol in wt to 0.5 kcal/mol. This negative change in ∆G13′ makes transfer down the m side more favorable than in wt. Using λ13′, ∆G13′, and the fit value for K13, k13′⁻¹ in the β mutant decreases to 74 ps. Although this rate for m branch transfer is faster than in wt, it still competes unfavorably with l side transfer to P⁺H_L⁻. This calculation is consistent with the experimental finding that in the β mutant, as for wt, P⁺H_M⁻ is not observed. Overall, while the placement of ∆G13 at -1.7 kcal/mol combined with our calculation of λ13 is consistent with the experimental result for primary et in the β mutant, our calculations for P⁺Q_A⁻ are inconsistent with experimental results.

We move now to the Heller double mutant. We have calculated diabatic free energy surfaces for two models of the mutant: the “charged double” model, in which the new aspartate is charged, and the “neutral double” model. Recall that in the transformation from the β mutant → double mutant, the single amino acid G (M201) is substituted by an aspartate. In our calculations, we are therefore looking at the effects of this substitution on ∆E^(CO) and λ. In the charged double model, ∆G13 is -4.2 kcal/mol; the introduction of a negative aspartate lowers ∆G13. In the charged double system, k13⁻¹ decreases to 2.4 ps, which is faster than the wt rate. Matching the experimental value of k13⁻¹ (21 ps) would require ∆G13 of either -8.8 kcal/mol or 0.4 kcal/mol (keeping the same value of λ13). Such an adjustment of 4.6 kcal/mol is larger than what we estimate to be errors in ∆G (2 kcal/mol). Changing λ13 to either 1.3 or 10.7 kcal/mol (while keeping other variables constant) would also bring k13⁻¹ to 21 ps.

However, such adjustments in λ13 of either 3 or 6 kcal/mol are much larger than the estimated error of 1 kcal/mol in λ13. (See Section 2.A.3.) Hence, the charged double model does not give the right behavior for the primary transfer. Moreover, ∆∆E^(CO)_14 is -6.4 kcal/mol. The charged amino acid lowers P⁺Q_A⁻ with respect to P⁺H_L⁻, thus pushing the secondary transfer further into the inverted regime. This finding is contrary to experimental findings that the secondary transfer should return to wt type rates. On the m branch of the charged double model, ∆G13′ is calculated to be -1.6 kcal/mol. Transfer from P* to P⁺H_M⁻ is now exothermic, and the barrier to transfer on the m side is lower. In fact, k13′⁻¹ is calculated to be 6.8 ps.
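The adjustments in ∆G13 quoted above for the charged double model follow from inverting Equations 4.1 and 4.2 at fixed λ13. The sketch below shows one way to do this; it is illustrative only, with hypothetical function names, the same assumed k_B T = 0.592 kcal/mol (about 298 K), and the coupling again fixed by the wt values (∆G13 = -6.0 kcal/mol, λ13 = 4.7 kcal/mol, k13⁻¹ = 3 ps). For a target time constant, the required barrier F* is obtained from the rate ratio, and Equation 4.2 then gives two roots for ∆G, one on each side of the activationless point ∆G = -λ.

```python
import math

KB_T = 0.592  # k_B*T in kcal/mol near 298 K (assumed; not stated in this section)


def relative_rate(delta_g, lam, kbt=KB_T):
    """Equation 4.1 up to constant factors: exp(-F*/kT) / sqrt(lambda)."""
    barrier = (delta_g + lam) ** 2 / (4.0 * lam)
    return math.exp(-barrier / kbt) / math.sqrt(lam)


def delta_g_for_time(target_ps, lam, ref=(-6.0, 4.7, 3.0), kbt=KB_T):
    """Return the two dG values (kcal/mol) that give 1/k = target_ps at fixed lambda."""
    ref_delta_g, ref_lam, ref_time = ref
    # Relative rate needed to reach the target, given the reference fit.
    needed = relative_rate(ref_delta_g, ref_lam, kbt) * ref_time / target_ps
    # Solve exp(-F*/kT) / sqrt(lambda) = needed for the barrier F*.
    barrier = -kbt * math.log(needed * math.sqrt(lam))
    if barrier < 0.0:
        raise ValueError("target is faster than the activationless limit for this lambda")
    # Equation 4.2 inverted: dG = -lambda +/- sqrt(4 * lambda * F*).
    half_width = math.sqrt(4.0 * lam * barrier)
    return (-lam - half_width, -lam + half_width)


low, high = delta_g_for_time(21.0, 4.2)  # charged double model, 21 ps target
print(f"dG13 = {low:.1f} or {high:.1f} kcal/mol")
```

Under these assumptions the two roots fall close to the -8.8 and 0.4 kcal/mol given in the text. The analogous inversion for λ13 at fixed ∆G13 has no closed form and would need a numerical root search.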

With ∆G13 calculated to be -4.2 kcal/mol, our calculations would imply that 1% of the electronic population originating on SP should be seen on H_M. This proportion is less than the 15% observed experimentally; however, ∆G13′ would have to be shifted lower by only another 1.6 kcal/mol, to -3.2 kcal/mol, to generate this ratio. Hence, given the statistical uncertainties in ∆G, we are not able to demand quantitative accuracy of kinetic branching ratios in the charged double mutant. Although our charged double model is still quantitatively inaccurate, its results suggest a possible mechanism for m side transfer in the double mutant: the β mutant → double mutant transformation makes transfer to P⁺H_M⁻ relatively more competitive than transfer to P⁺H_L⁻.

In contrast to the charged double mutant, in which the aspartate lowers ∆G13, ∆G13′, and ∆G14, the neutral double model predicts that a neutral aspartate would raise ∆G13 and ∆G13′; ∆G14 is decreased slightly, by 0.9 kcal/mol. Moreover, ∆G13 is calculated to be 0.5 kcal/mol, while ∆G13′ is 3.7 kcal/mol. Although k13⁻¹ would become 30 ps (close to the experimental value of 21 ps), l branch et would be predicted to be endothermic, contrary to experimental observations. There is still a large barrier for the secondary transfer (essentially unchanged from what is calculated for the β mutant). Hence, the neutral double mutant does not show the return to wt rates in the secondary transfer. Moreover, since ∆∆G13′ (β mutant → neutral double) is +3.2 kcal/mol and ∆∆G13 is +2.2 kcal/mol, et down the m branch is less likely relative to the primary l transfer than it is in the β mutant. (k13′⁻¹ is calculated to be 1800 ps in the neutral double mutant.) This result is contrary to the experimental finding that P⁺H_M⁻ is seen in the Heller double mutant. In the previous paragraph, we discussed how much the calculated values of ∆G and λ for the charged double model must change to concur with experimental kinetics. The changes needed to bring the neutral double model into agreement are even larger than for the charged double model. Hence, even accounting for statistical errors in ∆G and λ, the neutral double model does not correspond to experimental results. (See the concluding section below for a discussion of experimental work related to the charge state of Asp M201.)

Interestingly, λ13′ is consistently greater than λ13 for all of the models. (See Figures 4.2 and 4.3.) Moreover, the convergence of λ13 occurs on a shorter timescale than that of λ13′. There is no obvious explanation for this difference in convergence between λ13 and λ13′. There definitely seem to be slow motions intrinsic to proteins that affect the convergence of λ13 and λ13′ [97]. For calculations at our level of accuracy, the precision with which we have calculated λ13 and λ13′ is sufficient. However, there are issues that would be important for more ambitious calculations: How do very long–time scale motions in the prc affect the value of λ? How do these long–time scale motions (which are slower than the timescale for primary transfer but faster than that of secondary transfer) enter into the various et reactions? What exactly would be considered a converged value for λ?

4.5 Conclusions

In this chapter, we have discussed simulations of the β mutant and the Heller double mutants in an attempt to account for their kinetic properties. Although our calculations do not provide a full explanation for the workings of the β mutant and the Heller double mutants, they do offer various insights into possible mechanisms at work in the mutants. Underlying this chapter is the assumption that kinetic changes in the prc going from wt to the mutants are fully explicable in terms of changes in ∆G and λ. In contrast, Heller et al. argue that in the β mutant, there is quantum mixing of P⁺B_L⁻ and P⁺β⁻ to form P⁺I⁻. We do not invoke any quantum mixing in our theory and instead examine the hypothesis that P⁺I⁻ is really P⁺β⁻. Heller et al. might then argue that the failure of our calculations to account accurately for the behavior of the mutants points to the inadequacy of the class of explanation we use.

There are a number of weaknesses in our calculational methodology. We do not have md calculations of P⁺Q_A⁻; to perform the calculation for P⁺Q_A⁻ more properly, we would need to generalize the diabatic surfaces used in Chapter 2, which have two collective coordinates, to ones with three collective coordinates. Moreover, our calculation of ∆∆E^(CO)_14 might not be very accurate in the wt → β mutant transformation because of the close proximity of the charges involved to H_L. Finally, we are mixing organisms (Rps. viridis and Rb. capsulatus) in our calculations. Not only are the polypeptides and structures different, but bacteriochlorophyll–a is also different from bacteriochlorophyll–b.

Despite these methodological weaknesses, we can nevertheless glean some insight into possible mechanisms at work in the mutants. Specifically, even if the absolute free energies (∆G) calculated for each system may be inaccurate, the changes in ∆G (∆∆G) for each of the various transformations (wt to β mutant, β mutant to Heller double mutant) are useful in comparisons to the experimental data. Even though our models do not fit all the kinetic data put forth by Heller et al., there are points of agreement. Recall again that our estimated diabatic surfaces are very rough, more qualitative than quantitative. With this caveat in mind, we can review some of the suggestive results from our calculations. The diabatic surfaces for the wt system suggest that m side et might not be active because it is endothermic. For the β mutant, ∆G13 is shifted up to -1.7 kcal/mol, and k13⁻¹ is calculated to be 5.6 ps, in accordance with the experimental measurement of 6 ps. Moreover, the decrease in ∆G34 causes a slowdown of secondary et (as happens experimentally), but the calculated effect is too strong, perhaps because of problems in accurately estimating ∆∆E^(CO)_14.

For the Heller double mutant, we recall that its most dramatic feature is the observed m side transfer. Moreover, k13⁻¹ (primary et) slows to 21 ps, while k34⁻¹ (secondary et) returns to near–wt rates. In our charged double model, the calculated k13⁻¹ is too fast, while the rate for secondary transfer remains too slow. The latter result may reflect inaccuracies in the calculation for P⁺Q_A⁻. However, ∆G13′ is well within the value needed for 15% of the electronic population of P* to transfer to P⁺H_M⁻. Hence, even though the calculations for the charged double model are not quantitatively accurate, its results suggest that a possible mechanism for m side transfer is that the β mutant → double mutation lowers P⁺H_M⁻ more than it does P⁺H_L⁻. Results for the neutral double mutant are less consistent with experimental findings: primary transfer is calculated to be endothermic, secondary transfer does not return to wt rates, and transfer to P+ H− L is not calculated to happen in this model. Moreover, the changes needed in the calculated values for ∆G and λ are larger than allowed by our error estimates. Hence our calculation lends greater support to a picture in which the new aspartate of the double mutant is charged, rather than neutral.

Experimental work aimed at determining the charge state of Asp M201 has not been definitive. Heller et al. have examined their measurements of the ground state absorption spectra for changes predicted to arise from the presence of a bare negative charge near B_L (which is the case for a charged aspartate) [100]. They find signs of bandshifts in agreement with the theoretical predictions. Nevertheless, Heller et al. state categorically that they are neither offering these observations as proof of nor implying preference for the scenario that Asp M201 is indeed charged. They state that a protonated aspartate could well exert similar effects on ground state spectra because of polar or polarizable groups.

Overall, our calculations do not provide a quantitative accounting for the behavior of the mutants, in light of the error estimates of the calculations. Certain calculations would be useful to build on the work of this chapter. The most promising advance would involve having access to the proper vacuum energies (∆E^(0)) so that we could reliably ascertain absolute values of ∆G. Another important improvement would be to carry out accurate md on P⁺Q_A⁻. As described above, doing so requires a significant modification of the present md methodology to incorporate greater solvent response around Q_A. Such md calculations would permit more rigorous estimates of the diabatic free energy surfaces. This effect might be significant in more accurate calculations. Finally, it would be instructive to look at shifts in the energies of P⁺B_L⁻. It is possible that P⁺B_L⁻ does play a significant role in the workings of the β mutant and double mutant, either in shifting the exponential prefactor in the golden rule constant or in being involved in quantum mixing with P⁺H_L⁻, as suggested by Heller et al.

Chapter 5

Statistical Nature and Structural Biology of the Photosynthetic Reaction Center

5.1 Introduction

In previous chapters, we have made highly detailed studies of the structure–function relationships of the prc. That is, we have started with full atomic level models of the reaction center, including the wildtype and mutated forms, to calculate the energetics and kinetics of the primary et. From these calculations, we have attempted to address some of the major outstanding questions concerning the prc. In this type of work, we move from a structural description encapsulated in such basic physical quantities as the actual charge distribution and the atomic positions towards reduced functional descriptions.

In this chapter, we take a different approach to the same questions about the prc. Instead of concentrating on the properties of the full models, we look for ways to first simplify the models without distorting important properties. Why should we expect a simpler characterization of the prc to be possible? One possible answer is that there is no simpler description than the complete specification of all the atomic coordinates, that altering any detail would change major features of the system. This explanation is clearly inaccurate. We have seen in earlier chapters that certain trends are true for all the models, despite drastic changes in the details of the models. Moreover, the prc has shown a resilient functional asymmetry in experimental mutants that had been designed to manifest changes that do not actually result. Of course, other alterations do bring about profound changes in the prc. In other words, one can change some things and not affect the collective behavior of the model (this is certainly the case for the real prc), while changing other things dramatically alters the function (and probably the structure) of the prc. Therefore, there must be “essential” parts of the prc and “unessential” ones. The ultimate challenge of modeling work is to distinguish between the two.

There are many aspects of the prc that may be important to its function. In this chapter, we focus on simplifying the relevant electrostatics. Since the calculation of the electrostatic properties of the prc has been the focus of our studies, we have the most detailed knowledge of its behavior and hence are most likely to be able to simplify it appropriately. The key methodological question in the simplification of the electrostatics is how to do the physical partitioning of the prc. Different partitionings may offer different perspectives and require different analytical techniques. For instance, we may just look at the distribution of charge in the prc. Is the distribution spatially symmetric or asymmetric, and how does it account for the asymmetric electrostatics? In doing this analysis, we would make use of visualization techniques. After looking at the finest level details, we turn to various coarse-grained descriptions of the prc. We revisit the breakdown of energy gaps by residual contributions (used in Chapter 2) by analyzing them in a fuller statistical fashion. Another way to view the residues is to conceptualize each of them as a collection of increasingly higher order multipoles.
In the limit of an infinite series expansion, this description is fully accurate. To what degree does a low order expansion capture the details of the prc? We explore this question below.

The sensitivity and the robustness of our computational results are closely related issues. That is, distinguishing the “essential” parts from the unimportant parts of the model helps to identify parts of the model that require refinement for accurate results. We hope that we might also get some quantitative measure of the extent to which uncertainty in specific details of the model can alter key results. In all cases, we certainly hope that knowledge about these sensitivities translates into knowledge about how the real system works.

Similar to the issue of finding a simpler description of the prc is the task of relating different types of analyses. Structural biologists, in their examinations of the prc (and proteins in general), not only think in terms of primary, secondary, and tertiary structures but also invoke the notion of homology (for example, that between the l and m branches). The issue we explore below is whether homology can be elucidated by descriptions we have already used, namely that of physical energetics.

The issue of functional asymmetry in the prc is of special concern in this chapter. Specifically, we look at relating different descriptions (homology to electrostatic contribution) and figuring out the parts of the prc essential in creating the l–m asymmetry of the complex. Are there any simple physical explanations for the asymmetries? We will look at spatial charge densities: are there any clear-cut differences between l and m? Moreover, are there ways to correlate physical information (electrostatic contributions) with information about higher level descriptions (homologies, secondary structure, conserved residues)?

In the rest of this chapter, we will look at the different techniques that we have used to uncover simplified descriptions at varying levels of detail. Some, such as the multipole analysis and the alignment studies, are explicated in fine detail. Other techniques, which have been subject to only preliminary or exploratory examination, are discussed briefly.

5.2 Statistical model for the residual contributors

In Section 2.3.4, we examine ∆E^(CO) as sums of residual contributions. We touch upon the issue of whether the contributions can sensibly be divided between several exceptionally “large contributors” and many ordinary contributors. In this section, we revisit this issue by conceptualizing the residual contributors as being drawn from some underlying statistical distribution. We can then propose statistical hypotheses that can be tested. Specifically, looking at ∆E^(CO)_13 for S2:2:80, we ask three questions:

1) Does the mean contribution deviate from zero in a statistically significant way?
2) Is the distribution gaussian?
3) Are there statistical outliers and, if there are, how many?

Table 5.1: Statistical features of residual contributors to ∆E^(CO)_13 in S2:2:80 (a)

N                            1393
mean (µ)                     -0.01311
standard deviation (√σ)      0.4356
minimum                      -5.818
first quartile               -0.0405
second quartile (median)     0.00013
third quartile               0.04295
maximum                      2.721

(a) All energies are in kcal/mol.

Since ∆E^(CO)_13 in S2:2:80 is -18.3 kcal/mol and there are 1393 residual contributors, the mean contribution is -0.0131 kcal/mol; the standard deviation is 0.4356 kcal/mol (see Table 5.1). Applying a two–tail t–test shows that the null hypothesis that the mean of the distribution is zero cannot be rejected (α = 0.1%). In other words, if one thinks of the observed contributors as resulting from some underlying random distribution, one would not be able to claim with statistical certainty that the distribution is skewed away from zero (in a negative direction in this case). We can also ask whether the net ∆E^(CO)_13 is a statistical anomaly. If y = Σ_{i=1}^{N} x_i, where the x_i are drawn independently from a random distribution with mean 0 and variance ⟨δx²⟩ = σ, then y should be a random variable with zero mean and a variance of ⟨δy²⟩ = Nσ. If ∆E^(CO)_13 were drawn from the random distribution for y, where N = 1393 and σ = 0.4356² (kcal/mol)², then the expected standard deviation of ∆E^(CO)_13 would be 16.26 kcal/mol. In such a case, the figure of -18.3 kcal/mol for ∆E^(CO)_13 would not be a clear statistical outlier, but could be the result of statistical fluctuations of an underlying random process.

After examining the statistical nature of the net value of ∆E^(CO), we look now in greater detail at the distribution of individual contributions. Specifically, we ask whether the distribution is gaussian and whether there are any statistical outliers to the distribution. Figure 2.15 displays the histogram of these contributors. A gaussian model is a natural one to consider, not only because a gaussian distribution would result from the mass averaging of random variables (the central limit theorem), but also because the fluctuations in the electrostatic gaps closely follow gaussian statistics. Of the various statistical tests of normality available, we use the “qnorm” test [101] because it is more informative than alternative tests (e.g., Shapiro–Wilk) that provide only a single test statistic. Consider a data set {x_1 . . . x_N}, ordered from smallest to greatest, with N data points, calculated mean (µ) and variance (σ). Let c(x) be the cumulative distribution of a normal distribution with the same mean, µ, and variance, σ:

\[
c(x) = \frac{1}{\sqrt{2\pi\sigma}} \int_{-\infty}^{x} e^{-(y-\mu)^{2}/2\sigma}\, dy \tag{5.1}
\]

and d(x) = c⁻¹(x), the inverse cumulative normal distribution function. The qnorm plot is the plot of the ordered pairs (x_i, y_i) for i = 1 . . . N, where:

\[
y_i = d\!\left(\frac{i}{N+1}\right). \tag{5.2}
\]

The degree to which the plot deviates from a straight line is a measure of deviation from normality for the data set.

Figure 5.1, the qnorm plot for contributors to ∆E^(CO)_13 in the S2:2:80 model, shows marked deviation from a straight line. There is a relatively straight, highly sloped segment centered roughly about x = 0, bordered on both sides by points ordered along lines with smaller slopes. This sigmoidal shape indicates that the outlying contributors are much larger in magnitude than would be expected from a gaussian distribution of mean µ and variance σ. Hence, the contributors to ∆E^(CO)_13 in the S2:2:80 model are not distributed in a gaussian manner. However, the straight segment about x = 0 suggests the possibility that although the entire distribution is not gaussian, perhaps the smaller contributors are normally distributed and the larger contributors are then outliers on top of a normal distribution. Moreover, it may be that these outliers are the charged amino acid residues, and that a totally neutralized charge model would exhibit contributors that are more normally distributed. Figure 5.2 is a qnorm plot of a truncated distribution, one in which the 196 contributors with magnitude greater than 0.25 kcal/mol are removed. The distribution is still sigmoidal, although the truncated distribution does seem to deviate less from normality than the full distribution. Figure 5.3 is the qnorm plot for all contributors to ∆E^(CO)_13 in the N2:2:80 model. Note that neutralizing all the charged residues still leaves the contributors with a non–gaussian distribution, though with a smaller variance.

Figure 5.1: Normal qnorm plot for contributors to ∆E^(CO)_13 in the S2:2:80 model. [Plot of ideal gaussian correspondents (kcal/mol) against electrostatic contributions (kcal/mol).]
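The construction of the qnorm points in Equations 5.1 and 5.2 can be sketched as follows. This is an illustrative Python version, not the statistical package used for the figures: the function name is hypothetical, the sample (N − 1) variance is an assumption (the text says only “calculated variance”), and the short data list is made up to mimic a distribution with a few large contributors.

```python
from statistics import NormalDist, mean, variance


def qnorm_points(data):
    """Return the ordered pairs (x_i, y_i) of Equation 5.2 for a data set."""
    xs = sorted(data)
    n = len(xs)
    mu = mean(xs)
    sigma = variance(xs, mu)  # sample variance; "sigma" is a variance, as in the text
    reference = NormalDist(mu, sigma ** 0.5)  # normal with the same mean and variance
    # y_i = d(i / (N + 1)), where d is the inverse cumulative normal distribution.
    return [(x, reference.inv_cdf(i / (n + 1))) for i, x in enumerate(xs, start=1)]


# Made-up contributions (kcal/mol): many small values plus two large ones.
example = [-6.0, -0.05, -0.02, 0.0, 0.01, 0.03, 0.04, 2.5]
for x, y in qnorm_points(example):
    print(f"{x:8.3f} {y:8.3f}")
```

For gaussian data the points fall near the line y = x. In the made-up example, the most negative contribution lies well beyond its ideal gaussian correspondent, which is the behavior the text describes for the large contributors in Figure 5.1.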

Because the distributions of residual contributors to ∆E (CO) deviate from normality in having more large contributors than expected for a normal distribution, we

CHAPTER 5. STATISTICAL NATURE . . .

150

0.40

ideal gaussian correspondents (kcal/mol)

0.30

0.20

0.10

0.00

−0.10

−0.20

−0.30

−0.40 −0.40

−0.30

−0.20 −0.10 0.00 0.10 0.20 electrostatic contributions (kcal/mol)

0.30

0.40

(CO)

Figure 5.2: Normal qnorm plot for a truncated distribution of contributors to ∆E13 in the S2:2:80 model

CHAPTER 5. STATISTICAL NATURE . . .

151

ideal gaussian correspondents (kcal/mol)

1.0

0.5

0.0

−0.5

−1.0 −2.0

−1.0 0.0 1.0 electrostatic contributions (kcal/mol)

(CO)

Figure 5.3: Normal qnorm plot for contributors to ∆E13

2.0

in the N2:2:80 model

CHAPTER 5. STATISTICAL NATURE . . .

152

consider a symmetric exponential distribution as possibly more accurate: eλx , x 0. (5.3) 2λ A symmetric exponential distribution approaches 0 more slowly as x → ±∞ than =

does the normal distribution. Hence, such a distribution might characterize the large number of contributors that lie far away from the mean. To test this hypothesis, (CO)

we applied the exponential qnorm test to the distribution of contributors to ∆E13

in the S2:2:80 model (Figure 5.4) and to N2:2:80 (Figure 5.5). The graphs show that the exponential qnorm plots have the same qualitative features as the normal qnorm plots. We conclude, therefore, that the exponential distribution is also an inadequate description of the contributors to $\Delta E^{(\mathrm{CO})}$ because it, like the normal distribution, approaches 0 too quickly.

Figure 5.4: Exponential qnorm plot for contributors to $\Delta E_{13}^{(\mathrm{CO})}$ in the S2:2:80 model. (Horizontal axis: electrostatic contributions, kcal/mol; vertical axis: ideal exponential correspondents, kcal/mol.)

Figure 5.5: Exponential qnorm plot for contributors to $\Delta E_{13}^{(\mathrm{CO})}$ in the N2:2:80 model. (Horizontal axis: electrostatic contributions, kcal/mol; vertical axis: ideal exponential correspondents, kcal/mol.)

A related question is whether there are statistical outliers in the distribution of electrostatic contributors. As discussed in Section 2.3.4, the motivation for this question stems from a search for individual amino acid residues that may be significant to the function of the prc. Perhaps the statistical outliers are just such residues. In testing the normality of the distributions, we have already noted that the distribution has many more contributors of large magnitude than expected for a gaussian distribution of the same mean and variance.

Another way to think of outliers is to use a heuristic rule, the so–called “1.5–iqr” rule [101]. Let $I_3$ be the third quartile of the distribution, $I_1$ the first quartile, and $\Delta = 1.5(I_3 - I_1)$. By this rule, all points that are either greater than $I_3 + \Delta$ or less than $I_1 - \Delta$ are considered to be outliers of the distribution.¹ When this rule is applied to the distribution of contributors to $\Delta E_{13}^{(\mathrm{CO})}$ of S2:2:80, the 179 contributions larger than 0.1265 kcal/mol and the 190 contributions less than −0.1240 kcal/mol are identified as outliers.

¹ Applying the 1.5–iqr rule to a normal distribution classifies all points more than 2.7 standard deviations from the mean (0.7% of the points) as outliers.

In conclusion, we have attempted, in this section, to model the contributors to $\Delta E_{13}^{(\mathrm{CO})}$ statistically. We have found that, since the average contribution does not differ from zero in any statistically significant way, the net sum $\Delta E_{13}^{(\mathrm{CO})}$ can be thought of as a statistical fluctuation of a random distribution. Moreover, the distribution of contributions is flatter than a gaussian distribution, with more statistical outliers. Because the outliers number in the hundreds (rather than tens), identifying several amino acids out of the thousands based purely on the size of the contributions is not statistically legitimate. In order to build a good statistical model for the contributions to $\Delta E^{(\mathrm{CO})}$, one needs to move beyond a gaussian model to consider the effect of the actual geometry and distribution of charge in the prc.
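The 1.5–iqr rule described above can be sketched in a few lines. This is only an illustrative sketch: the function names are hypothetical, and the choice of the "inclusive" quartile convention is an assumption, since the text does not say which convention was used. The second function checks the footnote's statement about a normal distribution.

```python
from statistics import NormalDist, quantiles


def iqr_outliers(values, k=1.5):
    """Return (low outliers, high outliers, fences) under the k*IQR rule."""
    # Quartile convention ("inclusive") is an assumption of this sketch.
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    delta = k * (q3 - q1)
    fences = (q1 - delta, q3 + delta)
    low = [v for v in values if v < fences[0]]
    high = [v for v in values if v > fences[1]]
    return low, high, fences


def normal_fence(k=1.5):
    """Fence position (in standard deviations) and outlier fraction
    when the k*IQR rule is applied to an ideal normal distribution."""
    std_normal = NormalDist()
    q3 = std_normal.inv_cdf(0.75)       # third quartile; first quartile is -q3
    fence = q3 + k * (2.0 * q3)         # q3 + k * IQR
    fraction = 2.0 * (1.0 - std_normal.cdf(fence))
    return fence, fraction
```

For a normal distribution, `normal_fence()` gives a fence near 2.7 standard deviations and an outlier fraction near 0.7%, in line with the footnote.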

5.3 A multipolar analysis of the reaction center

5.3.1 General methodology

In Chapter 2, we calculate the contribution to the energy gap due to electrostatic interactions with the surrounding protein complex ($\Delta E^{(\mathrm{CO})}$). In general, calculating $\Delta E^{(\mathrm{CO})}$ (in our methodology) involves solving the pb equation. In the case of a uniform 2:2 dielectric boundary condition, calculating the potential at a point is tantamount to summing the coulombic potential from each of the partial charges on all the relevant atoms in the system. This calculation involves a complicated representation, composed of roughly 12620 different atoms, each with a partial charge. Perhaps there is a simpler method that renders similar accuracy.

Because the calculation of $\Delta E^{(\mathrm{CO})}$ involves external electrostatic potentials, one might be able to exploit a well known approximation, the multipole expansion. A multipole expansion is a representation of a charge distribution suitable for calculating the potentials at points lying outside of the charge distribution. The charge distribution is replaced with an infinite series of point multipoles (monopole, dipole, etc.) which produces a net electrostatic potential equal to that arising from the original charge distribution. An infinite multipole series expansion is not necessarily a simpler representation of the charge distribution than the original. A multipole expansion is useful when a low–order truncation of the series gives sufficiently accurate results, and a low–order truncation often does: as the distance between the charge and the reference point increases, the contributions of higher order multipoles decay faster than those of lower order multipoles.

So far we have discussed the calculation of potentials at points outside the region of charge. In our work, we are interested in electrostatic potentials at the chromophores, which are embedded inside the prc. Hence taking the entire prc to be the object of a single multipole expansion is inappropriate. Instead, we can break up the prc into smaller units, each of which can be expressed as a multipole expansion. A natural way to subdivide the system is into amino acid residues (and other chemical units such as the water molecules). Residues are a natural unit not only from a molecular biological point of view but also from the viewpoint of electrostatics. Residues are conveniently thought of as collections which either have integral charge (usually positive or negative unit electronic charge) or are neutral. Indeed, this property is exploited in the following analysis. We conceptualize each residue as a set of multipoles (monopoles, dipoles, etc.) positioned at the center of the residue.² In spirit, this method is similar to the Fast Multipole Method (fmm), although the motivation for fmm is computational speed, whereas our focus is finding a simplified representation.

² The center of the residue is calculated as the mean position of all the atoms in the residue; all the atoms are weighted equally.

As argued above, this distribution of multipoles becomes a simpler representation than the original only if a low–order truncated multipole expansion is adequately accurate. For instance, replacing every residue with its respective zeroth–order multipole (monopole) reduces the charge representation of the prc to 1403 monopoles (one for each of the amino acid residues and other chemical groups) from the partial charges of approximately 12620 atoms. This monopole expansion is a considerably less complex representation than the original. The original involves 12620 × (3 + 1) = 50480 numbers (three for the (x, y, z) coordinates and one for the charge). The monopole expansion involves 1403 × (3 + 1) = 5612 numbers. Likewise, an expansion that also includes the dipolar term (first order multipole) would require 1403 × (3 + 1 + 3) = 9821 numbers (about 20% of the original representation).

Of course, such simplified representations of the charge distribution result in discrepancies in calculated potentials. The pertinent issue, that of the size of the discrepancies, is explored below. Note that this type of multipole expansion is a controlled approximation. That is, the algorithm provides a way of attaining any desired level of accuracy. Moreover, one is not constrained to using the same level of multipole expansion for each residue when calculating a given potential; some regions require more detailed treatment than others.

Once we obtain a satisfactory reduced description, we can set about obtaining a deeper description. Below, we describe the idea of looking at physical distributions of the charges. Likewise, instead of looking directly at the partial charges themselves, we can look at the distribution of multipoles. The advantage is that there should be fewer multipoles to analyze in a sufficiently accurate, reduced description.
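The per–residue expansion described above can be illustrated with a minimal sketch for a uniform dielectric, truncated at the monopole (order 0) or dipole (order 1) term. The function names, the tuple layout `(x, y, z, q)`, and the conversion constant 332.06 kcal Å/(mol e²) are assumptions introduced for this illustration, not taken from the text; the expansion center is the unweighted mean atom position, as in the footnote.

```python
import math

# Conventional Coulomb conversion factor, kcal*Angstrom/(mol*e^2); an assumption here.
COULOMB_KCAL = 332.06


def centroid(atoms):
    """Unweighted mean position of atoms given as (x, y, z, q) tuples."""
    n = len(atoms)
    return tuple(sum(a[k] for a in atoms) / n for k in range(3))


def exact_potential(atoms, point, eps=2.0):
    """Sum of coulombic terms q/(eps*r) from every partial charge."""
    total = sum(q / math.dist((x, y, z), point) for x, y, z, q in atoms)
    return COULOMB_KCAL * total / eps


def multipole_potential(atoms, point, order=0, eps=2.0):
    """Potential from the residue's multipoles placed at its centroid.

    order=0 keeps the monopole only; order=1 adds the dipole term.
    """
    if order not in (0, 1):
        raise ValueError("this sketch implements orders 0 and 1 only")
    c = centroid(atoms)
    r_vec = [point[k] - c[k] for k in range(3)]
    r = math.sqrt(sum(v * v for v in r_vec))
    phi = sum(a[3] for a in atoms) / r                      # monopole: Q / R
    if order == 1:
        dipole = [sum(a[3] * (a[k] - c[k]) for a in atoms) for k in range(3)]
        phi += sum(dipole[k] * r_vec[k] for k in range(3)) / r**3   # p.R / R^3
    return COULOMB_KCAL * phi / eps
```

The per–residue discrepancy examined in the next section would then be `exact_potential(...) - multipole_potential(..., order=n)` evaluated at a chromophore position.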

5.3.2 Multipolar analysis as sensitivity analysis

A useful application of this “multipolar” analysis is determining sensitive dependencies for an electrostatic energy gap ($\Delta E^{(\mathrm{CO})}$). Following the procedure outlined above, in which we compare the exact potentials to those derived from multipole expansions of a given order, we focus on the individual residual contributors to $\Delta E^{(\mathrm{CO})}$. The premise for this study is that residual contributions that require a high order multipole expansion for an accurate assessment are those for which there is sensitive dependence. In other words, they are the residues whose details have an important effect on the potential and which, therefore, must be modeled with greater attention than required for most other residues.

We compare the electrostatic potentials in the S2:2 model calculated from the full charge distribution to those derived from multipole expansions of different order (0 = monopole, 1 = dipole, etc.) at PL, BL, and BM.³

³ By the position of a chromophore, we mean the geometrically averaged position of the four central nitrogens of the chromophore.

Figure 5.6 shows the discrepancy between the exact potential and the monopole expansion at BL for all the residual contributors. The net discrepancy (the sum of the discrepancies for all the residues) is −6.8 kcal/mol. Note, however, that the discrepancy from one particular residue is much larger than it is for the other residues. The discrepancy for the exceptional residue, TyrM208, is −8.8 kcal/mol, and therefore the net discrepancy without counting TyrM208 is 2.0 kcal/mol. Overall, the discrepancy from the monopolar expansion is relatively small; the standard deviation of the distribution (without counting TyrM208) is 0.18 kcal/mol. Figure 5.7 shows the next order approximation (an expansion to first order for the potential at BL). Notice that the standard deviation for the discrepancies (without including TyrM208) has decreased, falling to 0.04 kcal/mol. However, the discrepancy for TyrM208 remains large at −9.8 kcal/mol. Hence, although the discrepancies for the other residues decrease dramatically from the monopolar to the dipolar expansion, that for TyrM208 actually increases in magnitude.

Figure 5.8 is a graph of the discrepancies for the monopole approximation for PL. The cumulative discrepancy is 2.37 kcal/mol and the standard deviation of the discrepancies is 0.20 kcal/mol. However, with a maximum individual residual discrepancy of 1.57 kcal/mol and a minimum discrepancy of −1.75 kcal/mol, there are no dramatically large individual discrepancies as there are for BL. Figure 5.9 is the next order expansion. Although the standard deviation is now 0.05 kcal/mol, much smaller than that for the monopolar expansion, the cumulative discrepancy, at 4.4 kcal/mol, is actually larger than for the less accurate expansion. This result may be surprising but is easily explained. Even if the magnitude of individual residue discrepancies decreases, the net discrepancy does not necessarily decrease, since individual improvements do not scale uniformly. As the order of the approximation increases, the accuracy increases on average and certainly asymptotically.

Figure 5.10 is a plot of the discrepancies in the monopolar expansion for BM. This plot is more similar to that for PL than to that for BL. Hence, the behavior seen for TyrM208 and BL is not present for the m–side, symmetry–related partner of BL. Similarly, monopolar expansions on HL, PM, and HM demonstrate that the potentials at these chromophores are not sensitively dependent on any residue in the way that BL is on TyrM208.
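The bookkeeping behind the numbers quoted above (net discrepancy, the single most deviant residue, and the spread of the remainder) can be sketched as follows. The function name and dictionary layout are hypothetical, and the use of the population standard deviation is an assumption, since the text does not state which estimator was used.

```python
from statistics import pstdev


def summarize_discrepancies(per_residue):
    """Summarize per-residue (exact - expansion) discrepancies in kcal/mol.

    per_residue maps a residue label to its discrepancy.
    """
    # Residue with the largest discrepancy in absolute value.
    worst = max(per_residue, key=lambda name: abs(per_residue[name]))
    net = sum(per_residue.values())
    rest = [v for name, v in per_residue.items() if name != worst]
    return {
        "worst_residue": worst,
        "worst_value": per_residue[worst],
        "net": net,
        "net_without_worst": net - per_residue[worst],
        # Population standard deviation of all other residues (assumption).
        "std_without_worst": pstdev(rest),
    }
```

Applied to the monopole discrepancies at BL, such a summary would single out TyrM208 (−8.8 kcal/mol) and report the net discrepancy with and without it (−6.8 and 2.0 kcal/mol in the text).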

Figure 5.6: The difference between the exact potential and the monopole expansion at BL. (Panel title: Comparison between exact and monopole expansion, BL. Horizontal axis: residue number; vertical axis: difference in potential, exact − monopole, kcal/mol.)

Figure 5.7: The difference between the exact potential and the dipole expansion at BL. (Panel title: Comparison between exact and order 1 expansion, BL. Horizontal axis: residue number; vertical axis: difference in potential, exact − order 1, kcal/mol.)

Figure 5.8: The difference between the exact potential and the monopole expansion at PL. (Panel title: Comparison between exact and monopole expansion, PL. Horizontal axis: residue number; vertical axis: difference in potential, exact − monopole, kcal/mol.)

Figure 5.9: The difference between the exact potential and the dipole expansion at PL. (Panel title: Comparison between exact and order 1 expansion, PL. Horizontal axis: residue number; vertical axis: difference in potential, exact − order 1, kcal/mol.)

Figure 5.10: The difference between the exact potential and the monopolar expansion at BM. (Panel title: Comparison between exact and monopole expansion, BM. Horizontal axis: residue number; vertical axis: difference in potential, exact − monopole, kcal/mol.)

5.3.3 Conclusions for the multipolar analyses

In conclusion, for calculating the potential at every chromophore except BL, representing each residue purely by its monopole term is a reasonable approximation. Multipolar analysis rigorously identifies the unique situation of the immense sensitivity of the electrostatic potential at BL to the exact conformation of TyrM208. Other workers have previously identified this type of sensitivity but have not given a systematic method for looking for such sensitivities. Our method manages to identify TyrM208 and to show that it is a unique case. Moreover, we have a reasonably quantitative characterization of this sensitivity.

All of these analyses have been carried out in the 2:2 dielectric boundary condition. There is no one definitive way to extend this work to the boundary condition of greatest interest, the 2:2:80 model. The multipole expansion is based on the electrostatic potential's being expressed in terms of a 1/R coulombic potential. That analytic form is lost in the more complicated electrostatics of the inhomogeneous boundary conditions. A possible approach to working within the 2:2:80 model is to assume that, if we are calculating the potential at a particular chromophore, then over the spatial extent of the residue in question a coulombic form, scaled by an appropriate screening factor, can be used to describe the electrostatics on that scale. In that way, we would be able to recover a type of multipolar description (which would be the same description as for the 2:2 boundary condition except that each residue would carry along a scaling factor).
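One way to write down the scaled description suggested above, purely as a sketch under stated assumptions: the symbols $s_k$ (a screening factor for residue $k$ as seen from a given chromophore), $Q_k$ and $\mathbf{p}_k$ (the residue's monopole and dipole), $\mathbf{c}_k$ (its center), and $\epsilon = 2$ are notation introduced here for illustration and do not appear in the text.

$$
\phi_k^{2:2:80}(\mathbf{r}) \;\approx\; s_k\, \phi_k^{2:2}(\mathbf{r}),
\qquad
\phi_k^{2:2}(\mathbf{r}) \;=\; \frac{1}{\epsilon}\left[\frac{Q_k}{R_k} + \frac{\mathbf{p}_k \cdot \mathbf{R}_k}{R_k^{3}} + \cdots\right],
\qquad
\mathbf{R}_k = \mathbf{r} - \mathbf{c}_k .
$$

Under this assumption, setting $s_k = 1$ for every residue recovers the uniform 2:2 expansion, and how $s_k$ would be determined is left open, as in the text.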

5.4 Electrostatic correlates to L–M homology

5.4.1 Motivation

The issue of the l–m functional asymmetry has been discussed in various contexts throughout this dissertation. In Chapter 2, our calculations of the energetics and kinetics for electron transfer in the wt system show that et down the l branch is favored over that down the m branch in every charge and dielectric model. The issue of l–m asymmetry is also a major one in the chapters on the proposed dielectric asymmetry model of Steffen et al. and the mutation experiments of Heller et al. The essential quandary is to explain the presence of pronounced functional asymmetry in the prc in spite of structural near–symmetries. As we have discussed elsewhere, the ultimate explanation must boil down to aspects of the charge distribution in the prc. The mystery is not that the prc is perfectly symmetric and that there is nevertheless functional asymmetry (which would be impossible) but that there is substantial symmetry and it is not yet clear what actually breaks that symmetry. Sorting out how this mix of symmetry and asymmetry gives rise to the functional asymmetry is one of the major goals of research into the prc.

Although connecting the basic physical description of the prc to the function of the complex is an ultimate goal, there are various other levels of description employed in studying the prc, particularly in the context of explicating the l–m asymmetry. One example is the notion of evolutionary and sequence homology (similarities between proteins arising from common evolutionary ancestry). It has been conjectured that the l and m proteins evolved from a single protein, at which time the prc was indeed fully symmetric, at least on the level of having identical protein strands about the axis of C2 symmetry [14]. In the intervening time since the evolutionary split, l and m have followed their own respective evolutionary pathways, changing in different ways. That is to say, amino acid residues were mutated, and entire sections added and deleted, in the two protein strands in different ways.

Sequence alignments of l and m in various species (Rps. viridis, Rb. sphaeroides, and Rb. capsulatus) have been performed. These are reconstructions of the likely original relationship between the two proteins (within and among species), in which corresponding residues from the two proteins are matched, while those residues that have no corresponding partner are identified. In Rps. viridis, l and m share 73 identical amino acids, which comprise 27% of the 273 amino acids in l. Between Rb. sphaeroides and Rb. capsulatus (the most closely related pair among the three species), there is 78% and 76% homology for l and m respectively. The degree of homology between either Rb. sphaeroides or Rb. capsulatus and Rps. viridis is 59% for l and 50% for m [102]. The notion of homology is largely a qualitative concept used in structural biology.


So far, we have been relating physical models of the prc (which specify such things as the location and quantities of charges) to the electrostatics (by which we mean not only the net contribution to energy gaps ($\Delta E^{(\mathrm{CO})}$) but also the residual components). The question we want to address in this section is how to incorporate a third description, that of homology. There are two specific questions that can be tackled: how does the physical description relate to homology, and how does homology relate to the electrostatic contributions? We address the second issue in the remainder of this section.

We first look more closely at what exactly we mean by the relationship between electrostatic contributions and homology. If the evolutionary scenario described above is correct, then there was, at one point, physical symmetry between l and m. It is thought that for these ancient reaction centers, the primary et was equally likely down both branches [7]. For the sake of argument in this section, we postulate that the breaking of symmetry between l and m alone caused the breaking of symmetry in the primary electron transfer; specifically, that the change of structure led to changes in energetic contributions (in particular in $\Delta E^{(\mathrm{CO})}$) that in turn caused the favoring of transfer down the l branch. We know from Chapter 2 that the free energy of et is composed of several terms. Hence, the assumption that l and m are the basic electrostatic determinants of l–m energetic asymmetry can be true if other energetic terms contribute nothing to the asymmetry. (The sums of $\Delta E^{(\mathrm{CO})}$ grouped by net contributions of the protein strands show that this assumption does not seem to hold for the present prc. However, in the same way that l–m structural asymmetry does not necessarily imply that l and m were not descended from symmetric pairs, earlier reaction centers were possibly more symmetric with respect to other components, such as c and h.)

In such a hypothesized symmetric prc, the free energy of et would be exactly the same on the l and m branches (that is, not only would $\Delta G_{3\,3} = 0$ and, similarly, $\Delta G_{2\,2} = 0$, but also $\Delta E_{3\,3}^{(\mathrm{CO})}$ and $\Delta E_{2\,2}^{(\mathrm{CO})}$ would be 0). More specifically, not only should the electrostatic contributions of the l and m branches to these gaps be exactly opposite to one another because of this symmetry, but so should the contributions of corresponding residues. In other words,

$$
\Delta E_{3\,3}^{(\mathrm{CO})}(L\,) + \Delta E_{3\,3}^{(\mathrm{CO})}(M\,) = 0,
\qquad
\Delta E_{3\,3}^{(\mathrm{CO})}(L_i) + \Delta E_{3\,3}^{(\mathrm{CO})}(M_i) = 0
\tag{5.4}
$$

for all residues i, where $\Delta E_{3\,3}^{(\mathrm{CO})}(L\,)$ and $\Delta E_{3\,3}^{(\mathrm{CO})}(M\,)$ are the net electrostatic contributions to $\Delta E_{3\,3}^{(\mathrm{CO})}$ of the hypothesized symmetric l and m branches respectively, and $\Delta E_{3\,3}^{(\mathrm{CO})}(L_i)$ and $\Delta E_{3\,3}^{(\mathrm{CO})}(M_i)$ are the contributions of their ith residues.

Obviously, Equation 5.4 represents idealized relationships in a hypothesized perfectly symmetric prc. In this picture, the prc is no longer structurally or functionally symmetric because of mutations that have occurred since the time of perfect symmetry. The net effect of these mutations has been to move the system away from this perfect symmetry to a regime that might be hypothesized to be well optimized for its function. Yet we still expect to see this history of symmetry reflected in the prc. Indeed, there is enough symmetry left in the two strands to suggest common evolutionary ancestry.

The evolutionary process has been an interplay between genetic drift and natural selection. The prc has been shaped by a background sea of random mutations. Ones that promote reproductive fitness have a greater chance of being retained, while ones that cause a fatal loss of important function are selected against. Presumably, we would not see such fatally deleterious mutations in the present prc. In general, however, many, if not most, mutations have had more subtle, perhaps neutral, influences. Most of them have also contributed to the breaking of symmetry in the prc (since the original system was presumably symmetric).

In this section, we want to test this scenario, searching for concrete signatures of such a picture. Specifically, we probe for correlations between “structure” and “function” with regard to symmetry or broken symmetry. By structure we mean that related to homology, and by function, we mean that related to the electrostatics of the prc. To discern some of the evolutionary history of the two related proteins l and m, we will make use of sequence alignments: the pairing up of residues thought to correspond to one another. Sequence alignment is often a tricky process because there have been different deletions and insertions of residues in the proteins being compared. (If l and m are descended from the same original protein, then there have certainly been differing modifications for the two proteins, since they have different numbers of residues.) Figure 5.11 summarizes the sequence alignment for l and m. With this alignment, we have a way to characterize homology. In the following section, we develop a method for examining statistically whether there is any significant correlation between the alignment of residues on the l and m strands and their electrostatic contributions to symmetry breaking or making.

(numbering: R.vir. L) 1 10 20 30
=========== === == ======== ======
R.vir. L   ALLSFERKYRVRGGTLIGGDLFDFWVGPYFVGFFGVSA
R.sph. L   ALLSFERKYRVPGGTLVGGNLFDFWVGPFYVGFFGVAT
R.caps. L  ALLSFERKYRVPGGTLIGGSLFDFWVGPFYVGFFGVTT
| | | || | | |
R.vir. M   ADYQTIYTQIQARGPHITVSGEWGDNDRVGKPFYSYWL--GKIGDAQIGPIYLGASGIAA
R.sph. M   AEYQNIFSQVQVRGPADLGMTEDVNLANRSGVGPFSTL-LGWFGNAQLGPIYLGSLGVLS
R.caps. M  AEYQNFFNQVQVAGAPEMGLKEDVDTFERTPAGMFNIL--GWMGNAQIGPIYLGIAGTVS
= == = = = = = = = ====== =
1 10 20 30 40 50 (numbering: R.vir. M)

Figure 5.11: Sequence alignment for l and m. This figure is an excerpt of the sequence alignment for l and m for three species (Rps. viridis, Rb. sphaeroides, and Rb. capsulatus) given in Figure 2 of Michel et al. [103]. Specifically, residues 1–38 of l and residues 1–58 of m are shown. Residue numbering is based on the sequence for Rps. viridis. For each of l and m, an '=' marks sequence conservation for a residue. For example, residues 1–11 are conserved in l and residues 1, 3, and 4 are conserved in m. Furthermore, l–m equivalence (the corresponding pair of residues for l and m in Rps. viridis match) is indicated by '|'. For example, residue #16 on l and its partner (residue #38 on m) are both leucine.
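As a small illustration of how l–m equivalence is read off an alignment, the sketch below recounts the identical pairs in the Rps. viridis rows of the Figure 5.11 excerpt. The function name is hypothetical; the offset of 22 (l residue 1 aligned with m residue 23) is taken from the alignment segments, and the "--" in the m row is the gap opposite l residues 17 and 18.

```python
# Rps. viridis excerpt from Figure 5.11 (l residues 1-38, m residues 1-58).
L_VIR = "ALLSFERKYRVRGGTLIGGDLFDFWVGPYFVGFFGVSA"
M_VIR_GAPPED = "ADYQTIYTQIQARGPHITVSGEWGDNDRVGKPFYSYWL--GKIGDAQIGPIYLGASGIAA"

# l residue 1 aligns with m residue 23, so pad the l row with 22 gap characters.
L_VIR_GAPPED = "-" * 22 + L_VIR


def identical_pairs(l_gapped, m_gapped):
    """Return (l residue number, m residue number, amino acid) for each
    aligned column in which both rows carry the same residue."""
    if len(l_gapped) != len(m_gapped):
        raise ValueError("gapped rows must have equal length")
    hits = []
    l_num = m_num = 0
    for a, b in zip(l_gapped, m_gapped):
        if a != "-":
            l_num += 1
        if b != "-":
            m_num += 1
        if a != "-" and a == b:
            hits.append((l_num, m_num, a))
    return hits


pairs = identical_pairs(L_VIR_GAPPED, M_VIR_GAPPED)
```

On this excerpt the first identical pair found is residue 16 on l with residue 38 on m (both leucine), the example cited in the caption.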

5.4.2 Theory and method

In this section, we develop and apply quantitative methods to examine whether there are correlations between electrostatic symmetry/asymmetry and the evolutionary history of l and m. In Chapter 2, we present calculations of electrostatic contributions of residues to the free energies measuring l–m symmetry (e.g., $\Delta E_{3\,3}^{(\mathrm{CO})}$). In this section, we use the following procedure: 1) organize the residual contributions from the l and m branches into pairs of corresponding residues through a sequence alignment; 2) calculate, for each pair i, a “cancellation factor” $C_i$, a measure aimed at quantitating the degree of electrostatic cancellation between the two residues of the pair; and finally, 3) sort $C_i$ into the pair categories and types and look for statistical correlations between the value of $C_i$ and the categories of the pairs.

To determine the corresponding residues, we used a sequence alignment for Rps. viridis given by Michel et al. [103] and described in Table 5.2. In Rps. viridis, l comprises 273 residues, and m, 323 residues. Because 5 residues on l have no corresponding partner on m, there are 328 “pairs.” Sixty of these are actually unmatched single residues (55 of them are unmatched m residues and 5 are unmatched residues on l). In addition to providing alignments between l and m within species, Michel et al. present inter–species alignments among three species (Rps. viridis, Rb. sphaeroides, and Rb. capsulatus), information we use below.

With the sequence alignment in hand, we devise a parameter to quantitate the degree to which the electrostatic contributions of corresponding pairs cancel each other out (thereby maintaining l–m symmetry) or reinforce each other (thereby breaking l–m symmetry). Let $(L_{a_i}, M_{a_i})$ be the electrostatic contributions to an energy gap that would have l–m symmetry in the perfectly C2–symmetric system (e.g., $\Delta E_{3\,3}^{(\mathrm{CO})}$). The “cancellation factor” $C_i$ of the corresponding pair $(L_{a_i}, M_{a_i})$ is defined as

$$
C_i = 1 - \frac{|L_{a_i} + M_{a_i}|}{2 \max(|L_{a_i}|, |M_{a_i}|)}
\tag{5.5}
$$

(when $L_{a_i}$ and $M_{a_i}$ are not both zero); $C_i$ is undefined when both are zero. The factor $C_i$ can vary between 0 and 1. When $L_{a_i} = -M_{a_i}$ (the perfectly symmetric case), $C_i = 1$. When $L_{a_i} = M_{a_i}$ (the case of “perfect reinforcement”), $C_i = 0$. When only one of the partners exists, $C_i = 0.5$.

The last preparation to make before the statistical calculations is forming categories for the residue pairs. Table 5.3 summarizes the five criteria used: 1) whether the l partner exists for the pair, 2) whether the m partner exists for the pair, 3)

Table 5.2: Alignment of l and m in Rps. viridisᵃ

Pair number (i)   m start   m endᵇ   l start   l endᵇ
1–22              1         22       -         -
23–38             23        38       1         16
39–71             39        71       19        51
72                72        72       -         -
73–77             73        77       52        56
78–81             78        81       58        61
82–88             82        88       -         -
89–102            89        102      62        75
103               103       103      -         -
104               104       104      76        76
105–230           105       230      78        203
231–237           231       237      -         -
238–289           238       289      204       255
290–306           290       306      257       273
307–323           307       323      -         -
324–325           -         -        17        18
326               -         -        57        57
327               -         -        77        77
328               -         -        256       256

ᵃ This table is based on Figure 2 of Michel et al. [103]. There are 328 pairs because m, the longer of the two proteins, is composed of 323 residues and there are five residues on l that have no corresponding partner on m. The numbering for the pair numbers is based primarily on the numbering for m.

ᵇ The numbers for start and end indicate the first and last residue of segments in l and m that align with one another. A '-' indicates that there is no corresponding residue on the given protein.

whether the identity of the l partner is the same as the identity of the m residue (“l–m equivalence”), 4) whether there is sequence conservation among the three species for the l residue, and 5) whether there is such sequence conservation for the m residue. In theory, these five criteria can give rise to 32 ($2^5$) categories, but not every combination of possibilities exists. For instance, in our scheme, at least one partner must exist, and if only one exists, the issue of l–m equivalence is irrelevant.

We also use larger categories (what we here call “types”) than the ones just specified. Refer to Table 5.4 for a summary of the residue pair types. The first type of alignment pair is the case in which a residue on one branch has no correspondent on the other branch; one can conceptualize the existing residue as having been inserted into the sequence (or a residue on the other side as having been deleted). In this asymmetric situation, where one of the electrostatic contributions is exactly zero, $C_i = 1/2$.

Type #2 is the one in which either the l partner or the m partner is conserved but in which the two residues are different. A conserved residue is one that has retained its identity throughout evolutionary history. In this study, we consider a residue to be conserved if it is the same among the three species for which we have a mutual alignment. The presence of sequence conservation often implies that the function of the protein depends sensitively on the exact identity of the residue in question, so that mutating the residue might cause a drastic alteration in function. Because the pair consists of different residues and because there is strong preference for the retention of the identity of at least one of them, we hypothesize that the electrostatics of the pair would also be strongly asymmetric. In other words, $C_i$ would tend to approach 0.

Pair type #3 is that in which the pair shares a common residue identity but in which neither partner is conserved. It is hard to say what exactly to expect in this case. The shared residue identity suggests that there would likely be a fair amount of electrostatic symmetry. However, the fact that the l and m partners are not conserved could suggest that this symmetry is only coincidental and that strong electrostatic symmetry should not necessarily be expected. Through a similar argument, we can expect that the fourth case, in which the two partners are both identical and conserved, should display exceptionally strong electrostatic symmetry,

Table 5.3: Categorization of residue pairs

cat. #ᵃ   Nᵇ   l?ᶜ   m?ᶜ   l = m   l cons.ᵈ   m cons.ᵈ   mean C_iᵉ   std. dev. C_iᵉ
8         41   n     y     n       n          n          0.5         0
9         14   n     y     n       n          y          0.5         0
16        4    y     n     n       n          n          0.5         0
18        1    y     n     n       y          n          0.5         0
24        65   y     y     n       n          n          0.69        0.27
25        34   y     y     n       n          y          0.70        0.25
26        65   y     y     n       y          n          0.64        0.25
27        29   y     y     n       y          y          0.69        0.22
28        9    y     y     y       n          n          0.77        0.22
29        13   y     y     y       n          y          0.78        0.18
30        7    y     y     y       y          n          0.84        0.10
31        46   y     y     y       y          y          0.81        0.17

ᵃ The categories are numbered 0–31, one for each of the 32 possible combinations of the five criteria. Only the categories which contain any residue pairs are listed.

ᵇ N is the number of residue pairs in the category.

ᶜ l? and m? indicate whether the l or m residue exists, respectively.

ᵈ l cons. and m cons. indicate whether there is sequence conservation for the l or m residue, respectively.

ᵉ These values are calculated for the case in which $L_{a_i}$ and $M_{a_i}$ are the residual electrostatic contributors to $\Delta E_{3\,3}^{(\mathrm{CO})}$ in S2:2:80, organized into pairs according to Table 5.2. $C_i$ is defined by Eq. 5.5.

Table 5.4: Alignment pair types

Pair type   Type                                    Symmetry?                                  Expected value for C_iᵃ
1           l or m insertion regions                asymmetry present                          1/2
2           l or m conserved (but l ≠ m)            strong asymmetry expected                  C_i towards 0
3           l = m (but no conservation)             fairly strong symmetry expected            C_i towards 1 or not much pattern
4           l = m (with l and m conservation)       very strong symmetry expected              C_i towards 1
5           everything else (l and m not            the effects of many mutations, most of     C_i “randomly” distributed
            conserved and l ≠ m)                    which we can guess are “neutral”

ᵃ $C_i$ is defined by Eq. 5.5.

Table 5.5: C_i calculated for pair typesᵃ

Pair typeᵇ        Nᶜ    meanᵈ   std. dev.ᵈ
1 (8,9,16,18)     60    0.5     0
2 (25,26,27)      128   0.67    0.24
3 (28)            9     0.77    0.28
4 (29,30,31)      66    0.81    0.17
5 (24)            65    0.69    0.27

ᵃ $C_i$ is defined by Eq. 5.5.

ᵇ The pair type number is followed by the pair categories which together comprise the type, marked off in parentheses. The pair types are defined in Table 5.4, while the pair categories are defined in Table 5.3.

ᶜ Number of residue pairs belonging to the type.

ᵈ The mean and standard deviation for $C_i$ of all the residue pairs of the given type.

so that $C_i$ should approach 1. Finally, we group all the other pairs into type #5, that is, the pairs in which neither the l partner nor the m partner is conserved and in which the two partners are not the same. We hypothesize that because there is no conservation, no strong selective pressure has kept these pairs from the randomizing effect of many mutations. Consequently, we would expect the electrostatics of such a pair to have no significant preference for conserving or breaking l–m symmetry but to follow a resultant “random” distribution.
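The cancellation factor of Eq. 5.5 and the category and type bookkeeping of Tables 5.3 to 5.5 can be sketched as follows. The function names are hypothetical; the category number is read as a five–bit binary number built from the five criteria in the order listed (an interpretation that reproduces the category numbers in Table 5.3), and the type assignment follows the groupings given in Table 5.5.

```python
def cancellation_factor(l_contrib, m_contrib):
    """Eq. 5.5; returns None when both contributions are zero (undefined)."""
    biggest = max(abs(l_contrib), abs(m_contrib))
    if biggest == 0:
        return None
    return 1.0 - abs(l_contrib + m_contrib) / (2.0 * biggest)


def category_number(l_exists, m_exists, l_equals_m, l_conserved, m_conserved):
    """Encode the five yes/no criteria as a number from 0 to 31."""
    number = 0
    for flag in (l_exists, m_exists, l_equals_m, l_conserved, m_conserved):
        number = 2 * number + int(bool(flag))
    return number


def pair_type(category):
    """Map a category number to pair types 1-5, following Table 5.5."""
    l_exists = bool(category & 16)
    m_exists = bool(category & 8)
    same_residue = bool(category & 4)
    any_conserved = bool(category & 3)
    if not (l_exists and m_exists):
        return 1                      # unmatched residue (insertion)
    if same_residue:
        return 4 if any_conserved else 3
    return 2 if any_conserved else 5
```

For an unmatched residue the missing partner's contribution is taken as zero, which gives a cancellation factor of 0.5, as stated in the text.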

5.4.3 Results and discussion

In this section, we present calculations in which Lai and Mai are the residual contributions to $\Delta E^{(CO)}_{3\,3}$ in the S2:2:80 model. The cumulative distribution for the resultant Ci is presented in Figure 5.12. We then classify the 328 pairs among the categories and types we have constructed. Sixty of the pairs are of type #1; that is, these pairs comprise single unmatched partners and therefore Ci = 0.5. The cumulative distribution of Ci with pairs of type #1 removed is shown in Figure 5.13. For each of the categories and types, we calculate the mean and standard deviation for Ci of aligned residue pairs. Table 5.3 displays these statistics for the categories (as well as the number of residues in each category); Table 5.5 lists the results for the pair types. Figure 5.14 displays the distribution of Ci for all of the residue pairs, organized by pair categories.
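The cumulative distributions of Figures 5.12 and 5.13 can be built directly from a list of Ci values. The following is a minimal Python sketch of that construction. The function name `cumulative_distribution`, the pair flags, and the sample values are hypothetical illustrations and are not data from the S2:2:80 calculation.

```python
def cumulative_distribution(values):
    """Return points (x, y), where x is the fraction of values <= y."""
    ordered = sorted(values)
    n = len(ordered)
    points = []
    for rank, y in enumerate(ordered, start=1):
        if points and points[-1][1] == y:
            # Tied values share one point carrying the largest fraction.
            points[-1] = (rank / n, y)
        else:
            points.append((rank / n, y))
    return points


# Hypothetical (Ci, has_both_partners) entries; an unmatched partner gives Ci = 0.5.
sample_pairs = [(0.5, False), (0.2, True), (0.5, False), (0.9, True)]

# Analogue of Figure 5.12: every pair is included.
all_pairs = cumulative_distribution([c for c, _ in sample_pairs])

# Analogue of Figure 5.13: unmatched pairs are dropped before accumulating.
matched_only = cumulative_distribution([c for c, both in sample_pairs if both])

print(all_pairs)
print(matched_only)
```

Filtering on an explicit flag rather than on the value 0.5 keeps a matched pair that happens to have Ci = 0.5 in the second distribution.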

Figure 5.12: Cumulative distribution for Ci. The cancellation factor Ci is calculated for $\Delta E^{(CO)}_{3\,3}$ in S2:2:80 (Eq. 5.5). Each point (x, y) on this curve marks the fraction x of the total number of residue pairs whose Ci is less than or equal to y. (Plot axes: C versus proportion, 328 observations.)

Figure 5.13: Cumulative distribution for Ci (without unpaired residues). The cancellation factor Ci is calculated for $\Delta E^{(CO)}_{3\,3}$ in S2:2:80 (Eq. 5.5). Each point (x, y) on this curve marks the fraction x of the total number of residue pairs whose Ci is less than or equal to y, once the 60 residue pairs whose Ci = 0.5 are removed from consideration. (Plot axes: C versus proportion, 268 observations.)

Figure 5.14: Ci sorted by pair category. The cancellation factor Ci is calculated for $\Delta E^{(CO)}_{3\,3}$ in S2:2:80 (Eq. 5.5). Ci for the 268 pairs (in which there are actually residues on both the l and m sides) is plotted versus the pair category of the pair. Pair categories are defined in Table 5.3. (Plot axes: C versus pair category number.)

We focus our attention on the pair types and look for the trends hypothesized in Section 5.4.2 and Table 5.4. We specifically determine whether there are statistically significant differences in the distributions of Ci for the various pair types and, if so, whether they actually follow any expected trends.

The results do follow, at least roughly, the predicted trends. For instance, the mean Ci for type #2 pairs (0.67), expected to approach zero, is smaller than the mean Ci of type #4 pairs (0.81), which is predicted to tend to 1. However, such differences may be deceptive: the key issue is whether these differences are of any statistical significance. There is a distribution of Ci for each pair type. We consider the following statistical question: do the observed distributions suggest that, if these points were drawn from underlying random distributions, they would be different random distributions? The specific related question that we actually use is this: given two sets of observations, can we conclude, at some confidence level, that the means of the two underlying distributions are different? For the following comparisons of sample means, we use a two–tailed t–test at a confidence level of 99% [101].

In determining whether there is any correlation between the presence of pairs with conserved residue(s) and the value of the cancellation factor, Ci, we pose and answer the following statistical questions:

• Is the mean Ci for types 2 and 5 significantly different? In both cases, there is no l–m equivalence, but the former has at least one conserved partner while the latter has no conserved residues. The null hypothesis that they have the same mean cannot be rejected.
• Is the mean Ci for types 3 and 4 significantly different? In both cases, there is l–m equivalence, but the latter has at least one conserved partner while the former has no conserved residues. The null hypothesis that they have the same mean cannot be rejected.
• Is the mean Ci for the pool of types 2 and 4 different from Ci for a pool of type 3 and 5 residues? The former pool has at least one conserved partner, while the latter has none. The null hypothesis that they have the same mean cannot be rejected.

To ascertain electrostatic correlations with pairs of identical corresponding residues, we tackle similar statistical questions:

• Is the mean Ci for types 3 and 5 significantly different? In both cases, there is no conservation, but the former comprises pairs of identical residues while the latter has no l–m equivalence. The null hypothesis that they have the same mean can be rejected.
• Is the mean Ci for types 2 and 4 significantly different? In both cases, there is conservation of at least one partner, but the former has no l–m equivalence while the latter comprises pairs of identical residues. The null hypothesis that they have the same mean can be rejected.
• Is the mean Ci for the pool of types 3 and 4 different from Ci for a pool of type 2 and 5 residues? The former pool has l–m equivalence, while the latter does not. The null hypothesis that they have the same mean can be rejected.

From these statistical tests, we can draw various conclusions. The first set of tests indicates that the value of Ci, the electrostatic cancellation factor, is not discernibly affected by the presence of sequence conservation in aligned residue pairs. This conclusion holds whether the other key factor, l–m equivalence for the residue pair, is present or absent. In contrast, the second group of statistical tests demonstrates that l–m equivalence in the pair (whether the partners are the same amino acids) is strongly correlated with the value of Ci. Specifically, identical aligned partners definitely give rise to stronger electrostatic cancellation than different aligned partners.

In Section 5.4.2 and Table 5.4, we conjecture that pair type #5 residues (ones which have partners that are neither equivalent nor conserved) would be characterized by a somewhat “random” (or scattered) distribution for Ci. From comparing the distribution of type #5 residues (comprising category #24 pairs) to the distribution of other categories in Figure 5.14, we see that the distribution does cover a relatively broad range.
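The mechanics of such a comparison of means can be sketched from summary statistics alone. The Python sketch below applies a two-tailed t-test to the rounded means, standard deviations, and counts of Table 5.5 for two of the comparisons above. It rests on several assumptions: the unequal-variance (Welch) form of the test is chosen here for illustration and is not necessarily the variant of reference [101]; all function names are hypothetical; and because the inputs are rounded table entries rather than the underlying Ci values, it illustrates the procedure rather than reproducing the original tests.

```python
import math


def welch_t(mean1, sd1, n1, mean2, sd2, n2):
    """Welch's t statistic and degrees of freedom from summary statistics."""
    v1, v2 = sd1 ** 2 / n1, sd2 ** 2 / n2
    t = (mean1 - mean2) / math.sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df


def t_density(x, df):
    """Probability density of Student's t distribution."""
    log_norm = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_norm - (df + 1) / 2 * math.log1p(x * x / df))


def two_tailed_p(t, df, steps=2000):
    """Two-tailed p-value by Simpson integration of the density on [0, |t|]."""
    upper = abs(t)
    if upper == 0.0:
        return 1.0
    h = upper / steps  # steps must be even for Simpson's rule
    total = t_density(0.0, df) + t_density(upper, df)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * t_density(k * h, df)
    return max(0.0, 1.0 - 2.0 * total * h / 3.0)


# (mean, std. dev., N) per pair type, as rounded in Table 5.5.
TABLE_5_5 = {2: (0.67, 0.24, 128), 4: (0.81, 0.17, 66), 5: (0.69, 0.27, 65)}


def compare(type_a, type_b, alpha=0.01):
    """Return (t, p, reject) for the null hypothesis of equal means."""
    m1, s1, n1 = TABLE_5_5[type_a]
    m2, s2, n2 = TABLE_5_5[type_b]
    t, df = welch_t(m1, s1, n1, m2, s2, n2)
    p = two_tailed_p(t, df)
    return t, p, p < alpha


for pair in [(2, 5), (2, 4)]:
    t, p, reject = compare(*pair)
    print(pair, f"t = {t:.2f}, p = {p:.2g}, reject at 99%: {reject}")
```

With these inputs, the comparison of types 2 and 5 does not reject the null hypothesis, while the comparison of types 2 and 4 does, in line with the corresponding bullets above.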

5.4.4 Discussion and conclusion

In retrospect, it is not surprising to find a strong differentiation between the value of Ci for identical residue partners and Ci for partners which are different amino acids. Different amino acids are less likely to cancel each other electrostatically. As argued in Section 5.3, describing a residue by its net charge alone (the lowest order multipole) provides a generally adequate zeroth order approximation for calculating ∆E(CO). When residual partners are the same amino acid, they both have the same charge. Hence, in most cases, Ci for such pairs should be greater than 0.5; exceptions will arise if higher order moments are important or if the residues are not rotated roughly 180° about the C2 axis from each other. When residual partners are different amino acids, many of these pairs have different net charges. In such cases, Ci is then likely to be less than 0.5. This effect is clearly shown in Figure 5.14. Categories 24–27 are those residue pairs which are different; categories 28–31 are those with identical pairs. Notice the difference in the numbers of Ci less than 0.5 in the two halves of the plot.

Interestingly, we do not find any statistically discernible difference in the distribution of Ci due to sequence conservation. This result points to a number of possible explanations, each of which requires further examination. We might interpret this negative result as evidence against the entire evolutionary scenario painted in Section 5.4.2. Sequence conservation might point to functional significance that has little to do with electrostatics (for instance, conserved residues may be important to proper peptide folding or as scaffolding of chromophores). Hence, although some of the conserved residue pairs might actually be crucial to the electrostatics of the prc, their role may be obscured by the others that do not play such a role. Another possibility is that electrostatic symmetry or asymmetry is more apparent at the level of larger units. That is, instead of looking at residues, one needs to look at larger units (such as polypeptide helix units, etc.). Finally, different correlations might emerge in different energy gaps and different molecular models. Calculating Ci for $\Delta E^{(CO)}_{2\,2}$ could shed some light on whether electrostatic symmetry breaking is different at the level of the bacteriochlorophylls than at the bacteriopheophytins. Calculating Ci for the fully neutralized model might more sensitively pick out the role played by conservation and l–m equivalence. Differentiation in Ci due simply to having the same net charge would not be a consideration in the neutral model (since every residue is neutral in that model).


Chapter 6

Concluding Comments

In this chapter, we present a summary of our work (the methodology, results, and conclusions), develop some overall perspective, and point to possible future work. The essential thrust of our work has been to investigate the primary electron transfer in the Rhodopseudomonas viridis photosynthetic reaction center through computer simulation techniques based on a physical picture derived from Marcus theory. We have consistently pursued two goals throughout the dissertation. First, we have calculated various properties of the prc, which we then compared to experimental measurements. Second, we have attempted to distill the underlying physics of our computer models. We have met with some success in both goals. Ideally, we would like to be able to account quantitatively for the structure–function relationships of the prc. However, we have discovered that uncertainties in our knowledge of key issues and limitations in our computational techniques remain barriers to such an understanding. Nevertheless, this dissertation introduces some novel approaches in the study of the prc, particularly with regard to explicating the basic physics of computer models for the prc.

6.1 Detailed summary of our findings

The dissertation comprises three main parts. First, we have calculated diabatic free energy surfaces for the wt Rps. viridis prc, exploring the role played by variation in two parameters: the charge states of ionizable amino groups and the treatment of environmental dielectric response. Second, we have applied these models to compare calculated effective dielectric constants with those derived by Boxer, Steffen et al. and to simulate the mutant prc systems of Holten, Heller et al. Third, we have examined whether simplified or statistical models can be used effectively to model the prc and to uncover relationships between the physical and structural biological descriptions.

After laying out the fundamental physical theory involved (Marcus theory for et) and computational techniques used in Chapter 1, we constructed nine models for the prc: the combinations of our three treatments of the dielectric environment (2:2, 2:2:80, and 2:80) and the three treatments of the charge states of the ionizable amino groups (standard, partially neutralized, and fully neutralized). For each model, we calculated three of the four components (∆E(CO), E(RF), and λ) that comprise the free energies of electron transfer (∆G). ∆E(0) is constrained to yield the experimental value of ∆G13. We focused primarily on the 2:2:80 boundary condition as the most physically realistic of the three boundary conditions used.

Since ∆G13 is constructed to have the experimental value, we used other criteria to determine the credibility of the models. We found that both the S2:2:80 and P2:2:80 models yield values for ∆E(0) that are within error estimates of the tz calculations of the vacuum energies. Moreover, kinetics calculated for primary et in both these models showed l–m asymmetry in accordance with experiment. In contrast, the N2:2:80 model has unrealistically small values for ∆E(0) and implies an unrealistically high population for m side et. Hence, it seems that the N2:2:80 model is characterized by excessive charge neutralization.
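To illustrate how a free energy of reaction ∆G and a reorganization energy λ translate into a rate within the Marcus picture, the following minimal Python sketch evaluates the textbook activation free energy (∆G + λ)²/(4λ) for parabolic diabatic surfaces. It is a sketch under stated assumptions: the electronic-coupling prefactor is omitted, the function names are hypothetical, and the numerical values are illustrative placeholders rather than results of the models summarized here.

```python
import math

K_B_EV = 8.617333e-5  # Boltzmann constant in eV/K


def marcus_barrier(delta_g, lam):
    """Activation free energy (eV) for parabolic diabatic surfaces."""
    return (delta_g + lam) ** 2 / (4.0 * lam)


def boltzmann_factor(delta_g, lam, temperature=300.0):
    """exp(-barrier / kT); the electronic-coupling prefactor is omitted."""
    return math.exp(-marcus_barrier(delta_g, lam) / (K_B_EV * temperature))


# Placeholder values in eV (assumptions for illustration only).
lam = 0.25
downhill = boltzmann_factor(-0.25, lam)  # activationless case, -delta_g == lam
uphill = boltzmann_factor(+0.25, lam)    # uphill step of the same magnitude

print(f"downhill factor: {downhill:.3g}, uphill factor: {uphill:.3g}")
```

With these placeholder values, the uphill step is suppressed by several orders of magnitude relative to the activationless case, which is the sense in which a large positive free energy for an intermediate state argues against sequential transfer through it.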
Our calculations provide evidence that either using the standard charge states or neutralizing some amino acids is proper for molecular modeling of the prc. In both the S2:2:80 and P2:2:80 models, ∆G12 is large and positive, lending support to a superexchange, rather than two–step, mechanism for primary et.

A “qualitative picture” of the prc provides a simple partial explanation for many of the trends observed in the models. This qualitative picture is essentially a conception of the distribution of charges in the prc as a continuum. Of course, because the charges do not actually form a continuum, we need a more detailed model to explain trends not understood within the qualitative picture. Looking at individual contributors proves to be an effective way to understand the physics of the models and also to identify possibly significant amino acid residues. With different dielectric screening, the range of significant contributors changes quite a bit. There are a number of what might be called larger residual contributors. However, these gaps may be the results of a collective effect. Although our statistical examination of electrostatic residual contributors indicates that there is a large number of “large” contributors, experiment is needed to determine whether there are a few or many such key contributors.

In our molecular dynamics of the prc, we found waters floating free of the complex in the partially neutralized and fully neutralized models. These “loose” waters may indicate that certain residues should remain charged. ∆G12 is large and positive in the S2:2:80 and P2:2:80 models. Part of the reason is the orientation of TyrM208, but even using an opposite orientation still leaves models favoring a superexchange mechanism. Various amino acids which were found to be important in $\Delta E^{(CO)}_{13}$ and $\Delta E^{(CO)}_{3\,3}$ are suitable possible targets for site-directed mutagenesis.

We have also provided detailed comparisons of our calculations to previous calculations. The work of Marchi et al. (mgcn) is essentially the S2:2 model without taking into account E(RF), the reaction field component. As such, it probably underestimates the amount of environmental dielectric response in the prc. The S2:2:80 model agrees (more or less) with the work of Gunner et al., except that they do not treat explicitly λ, the reorganization energy, nor do they examine explicitly the possibility of different charge states for the ionizable amino groups. We have expended the greatest effort in drawing comparisons to the calculations of Warshel, Parson, et al. (see Chapter 2 and Appendix A for details). Their pdld methodology, which uses a grid of polarizable dipoles to simulate environmental dielectric response, differs from our combination of molecular dynamics and pb calculations. A consistent theme in their work is what we have termed the Warshel neutralization ansatz: an accurate model for the dielectric response of a protein complex must have enough dielectric response to render the model virtually insensitive to the actual charge states of amino acids. We do not find such an ansatz to be borne out in our calculations. The most important difference between our modeling and that of Warshel, Parson, et al. remains the amount of dielectric response contained in the modeling. The models used by Warshel, Parson, et al. routinely contain much greater dielectric screening than that used in our three models.

In the second part of the dissertation, we adapt our modeling of the wt prc to see whether we can account for two different sets of experimental findings. In Chapter 3, we consider an experiment by Steffen, Boxer et al., in which their estimated effective dielectric constants on l chromophores are larger than on the m chromophores. Boxer et al. have suggested that this differential dielectric response is responsible for the favoring of l–side primary et over m–side transfer. From electric fields calculated from md and pb runs aimed at simulating the experiment of Steffen et al., we have calculated effective dielectric constants. However, we do not find strong differentiation between the l and m dielectric constants. In Chapter 4, we study two mutants of the prc described by Heller et al.: the β mutant and the Heller double mutant. Heller et al. estimated the kinetics for the wt and these two mutant systems; most strikingly, they found evidence for et down the m branch in the Heller double mutant. To study the mutants using our methodology, we estimated diabatic free energy surfaces for the mutants. The diabatic free energy surfaces obtained parallel qualitatively much of the observed behavior of the mutants, but do not provide a compelling explanation for all the experimental facts.

In the third and final part of the dissertation (Chapter 5), we look at applying statistical methods and coarse–graining techniques to identify large–scale regularities in the structure–function of the prc. First, we studied the distribution of residual contributors to ∆E(CO) as a statistical distribution.
We determined that these distributions (for the standard charge models and the fully neutralized models) are characterized by a substantial number of outliers. Moreover, they are not distributed in either a Gaussian or exponential fashion. Second, we looked at using low order multipole expansions of charges in a residue to calculate ∆E(CO). We found that such expansions are quite accurate for calculating the electric potential at all the chromophores, with the exception of BL, which depends strongly on the detailed distribution of charge in TyrM208. Hence, multipole expansions provide a method of testing for sensitivity to details in charge distributions. Third, we searched for any statistically significant correlation between the degree of electrostatic symmetry breaking in $\Delta E^{(CO)}_{3\,3}$ of homologous l–m pairs and whether the residues of the pair are conserved residues. We found that there was no such correlation, which is somewhat surprising.
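The multipole test summarized above can be illustrated with a small numerical comparison: the exact Coulomb potential of a set of point charges against its truncated expansion. The Python sketch below keeps only the monopole and dipole terms, which is an assumption about the truncation order made for illustration. The function names, the charges, and the positions are hypothetical, and units are chosen so that 1/(4πε0) = 1.

```python
import math


def exact_potential(charges, point):
    """Coulomb potential of point charges, in units where 1/(4*pi*eps0) = 1."""
    return sum(q / math.dist(pos, point) for q, pos in charges)


def multipole_potential(charges, point, center):
    """Monopole plus dipole approximation, expanded about `center`."""
    r_vec = [p - c for p, c in zip(point, center)]
    r = math.sqrt(sum(x * x for x in r_vec))
    monopole = sum(q for q, _ in charges)
    dipole = [sum(q * (pos[k] - center[k]) for q, pos in charges)
              for k in range(3)]
    return monopole / r + sum(d * x for d, x in zip(dipole, r_vec)) / r ** 3


def relative_error(charges, point, center):
    """Relative deviation of the truncated expansion from the exact potential."""
    exact = exact_potential(charges, point)
    return abs(multipole_potential(charges, point, center) - exact) / abs(exact)


# Hypothetical "residue": (charge, position) pairs in arbitrary units.
residue = [(0.6, (0.3, 0.0, 0.0)), (-0.6, (-0.3, 0.0, 0.0)), (1.0, (0.0, 0.2, 0.0))]
center = (0.0, 0.0, 0.0)

err_far = relative_error(residue, (10.0, 5.0, 0.0), center)
err_near = relative_error(residue, (1.0, 0.5, 0.0), center)
print(f"relative error far: {err_far:.1e}, near: {err_near:.1e}")
```

The truncated expansion degrades as the observation point approaches the charges, so a large discrepancy at a given site flags sensitivity to the detailed charge distribution of a nearby residue.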

6.2 Overall reflections on our work

In this section, we return to the two overarching themes of the dissertation. Let us consider first the work of quantitatively calculating functional properties of the prc to be compared to experimental measurement. Overall, we can see that the methods at our disposal for calculating the kinetics and energetics of primary et in the prc still have rough edges. Results are quantitatively sensitive to the two important variables of charge states of amino groups and the treatment of dielectric response. The work of this dissertation has been focused primarily on showing rigorously the degree of such dependence and on suggesting the most likely values of these variables. Hence, alternative methods will be needed to determine these issues. For the charge states of ionizable groups, efforts at determining them via pKa shift calculations are under way [104, 75], even though there is a great amount of controversy surrounding the pertinent methodology. Indeed, one of the outstanding questions in the calculation of pKa shifts is the same as the one that divides our work from that of Warshel, Parson, et al.: how much dielectric response is required for accurate modeling of the electrostatic properties of a protein complex such as the prc? In some instances, pKa shift calculations performed with large dielectric constants (on the order of 20) have agreed better with experimental results than those performed with lower dielectric constants [75, 105]. Some workers have taken this finding to imply that proper dielectric modeling of proteins should therefore use this high dielectric constant, but others have argued that this result could easily be the result of failings in other approximations in the methodology for pKa shift calculations [75]. It is unfortunate but not surprising that our methodology does not provide compelling explanations for either the Steffen/Boxer work or the Heller mutants.


Given that there are still basic outstanding questions remaining for the modeling of the wt system for quantities at which the methodology excels (free energies, electrostatic potentials), accurate calculations that involve significant mutations are probably a bit beyond the limits of current techniques. It is possible that the low effective dielectric constants calculated for the Steffen/Boxer experiments indicate the need for more dielectric response, in accordance with Warshel, Parson, et al. Again, this is an issue that requires more experimental or computational elaboration. The limited success we have had in reproducing experimental measurements points to the need for further work in the quantitative fine–tuning of the computational methodology.

Although developing a quantitatively accurate computer model of the prc is certainly a worthwhile endeavor, one can actually overestimate its importance. Even if one developed an atomic level model that fully reproduces the experimentally measured properties of the prc, one would still not necessarily understand the fundamental reasons for the success of the model. The atomic level models that we and other workers have used are incredibly complex systems in their own right. The ability of such models to reproduce experimental results can arise from various scenarios. One possibility is that the models (in their basic construction and in their parameterization) capture all of the physics germane to the problem at hand, thus permitting quantitatively accurate calculations. Another possibility, however, is that quantitative reproduction of experimental values is a fortuitous mapping of many parameters (which characterize all these models) to few experimental numbers. Likewise, the failure of a given model to accurately reproduce experimental measurements does not imply the utter failure of that model. Certain questions arise in this situation: Is the fundamental physics of the model flawed? Are the particular parameters inaccurate? How sensitive is the model to variations in the physics (and the parameterization)? Which, out of the hundreds, if not thousands, of parameters in the model, are the ones that determine the key properties of the models?

We have addressed some of these issues as we pursued the second goal of the thesis: the explication of the underlying physics of our models. In many ways, this second theme is the primary novelty and achievement of the dissertation. We have examined the overall physics of changing the dielectric modeling and charge states of ionizable amino groups. We showed that a simplified electrostatic representation of the prc in terms of multipolar expansions is still quite accurate, and thus may be a step toward still simpler models that would unveil the essential workings of the prc. We have addressed the question of distinguishing between amino acids which sensitively affect the overall properties of the prc and those residues which do not. Finally, we have tied the biology of the prc to the physics of the prc in searching for correlations between homology and electrostatics. Although we did not find the suspected correlations, we believe that the fingerprints of the prc as an historical, evolutionary, biological entity should somehow be reflected in the physics of the prc. Although our work constitutes only the first steps at uncovering the “essential physics” of the reaction center, we believe that our work is significant for simply raising and tackling issues which have been largely ignored until now.

6.3 A few concrete suggestions for future work

Most of the work in this thesis has focused on the prc of one particular species, Rps. viridis. High resolution structures are available for other species, in particular Rb. sphaeroides. Although there are profound similarities between the prc of Rps. viridis and that of Rb. sphaeroides, it would be worthwhile to apply to Rb. sphaeroides the calculational techniques we have applied to Rps. viridis. The most obvious rationales for such an effort are to see how well these techniques would work in another species and to take advantage of the fact that much experimental data are extracted from Rb. sphaeroides rather than Rps. viridis. However, we think that the most interesting reason for doing such calculations is that they would allow for a continuation of the themes developed in Chapter 5. The primary aim of the statistical modelling discussed in that chapter is to distill the “essentials” of the reaction center which are responsible for certain key functional properties: the strong l–m asymmetry and the unit quantum efficiency of the primary et.

Applying the md and pb methodology to another prc (such as that of Rb. sphaeroides) would open a number of new possibilities. First, we would be able to see to what extent reduced physical descriptions (such as the multipolar distributions calculated in Section 5.3) for the various prc are similar. Second, we would be able to compare the identity of large electrostatic contributors between species: which ones are the same, and which ones are conserved residues? Third, we would be able to perform a more detailed study of l–m homology (as was done for Rps. viridis in Section 5.4). With the calculation of ∆E(CO) for Rb. sphaeroides in hand, one could then compare corresponding homologous contributors and their electrostatics directly.


Bibliography

[1] Lawlor, D. W. Photosynthesis: Molecular, Physiological and Environmental Processes. Longman Scientific & Technical, Essex, England, second edition, (1993).
[2] Walz, T. and Ghosh, R. Journal of Molecular Biology 265, 107–111 (1997).
[3] Deisenhofer, J., Epp, O., Sinning, I., and Michel, H. Journal of Molecular Biology 246(3), 429–457 February (1995).
[4] Ermler, U., Fritzsch, G., Buchanan, S. K., and Michel, H. Structure (London) 2(10), 925–936 (1994).
[5] Ermler, U., Michel, H., and Schiffer, M. Journal of Bioenergetics and Biomembranes 26(1), 5–15 (1994).
[6] Stowell, M. H. B., McPhillips, T., Rees, D., Soltis, S., Abresch, E., and Feher, G. Science (in press) (1997).
[7] Bixon, M., Fajer, J., Feher, G., Freed, J. H., Gamliel, D., Hoff, A. J., Levanon, H., Mobius, K., Nechushtai, R., Norris, J. R., Scherz, A., Sessler, J. L., and Stehlik, D. Israel Journal of Chemistry 32(4), U369–518 (1992).
[8] Deisenhofer, J. and Michel, H. In The Photosynthetic Reaction Center, Volume II, Deisenhofer, J. and Norris, J. R., editors, chapter 3, 541–558. Academic Press, Inc., San Diego (1993).
[9] Yeates, T. O., Komiya, H., Rees, D., Allen, J. P., and Feher, G. Proceedings of the National Academy of Science 84, 6438–6442 September (1987).


[10] Komiya, H., Yeates, T. O., Rees, D. C., Allen, J. P., and Feher, G. Proceedings of the National Academy of Science 85, 9012–9016 (1988).
[11] Deisenhofer, J. and Michel, H. The EMBO Journal 8(8), 2149–70 (1989).
[12] Deisenhofer, J., Epp, O., Miki, K., Huber, R., and Michel, H. Nature 318, 618–624 (1985).
[13] Allen, J. P., Feher, G., Yeates, T. O., Komiya, H., and Rees, D. C. Proceedings of the National Academy of Science 84, 5730–5734 (1987).
[14] Lockhart, P. J., Steel, M. A., and Larkum, A. W. D. FEBS Letters 385(3), 193–196 (1996).
[15] Boxer, S. G. In The Photosynthetic Reaction Center, Volume II, Deisenhofer, J. and Norris, J. R., editors, 179–221. Academic Press, Inc, San Diego (1993).
[16] Kirmaier, C. and Holten, D. In The Photosynthetic Reaction Center, Volume II, Deisenhofer, J. and Norris, J. R., editors, chapter 3, 49–70. Academic Press, Inc., San Diego (1993).
[17] Gunner, M. R. Current Topics in Bioenergetics 16, 319–367 (1991).
[18] Fleming, G. R., Martin, J. L., and Breton, J. Nature 333, 190–192 (1988).
[19] Du, M., Rosenthal, S. J., Xie, X. L., DiMagno, T. J., Schmidt, M., Hanson, D. K., Schiffer, M., Norris, J. R., and Fleming, G. R. Proceedings of the National Academy of Science 89, 8517–8521 (1992).
[20] Peloquin, J., Williams, J., X., L., G., A. R., Taguchi, A., Allen, J. P., and Woodbury, N. W. Biochemistry 33(26), 8089–8100 (1994).
[21] Woodbury, N. W. and Parson, W. W. Biochimica et Biophysica Acta 767(2), 345–361 (1984).
[22] Goldstein, R. A., Takiff, L., and Boxer, S. G. Biochimica et Biophysica Acta 934, 253–263 (1988).


[23] Woodbury, N. W. and Parson, W. W. Biochimica et Biophysica Acta 850(2), 197–210 (1986).
[24] Schmidt, S., Arlt, T., Hamm, P., Huber, H., Nägele, T., Wachtveitl, J., Meyer, M., Scheer, H., and Zinth, W. Chemical Physics Letters 223(1–2), 116–120 (1994).
[25] Arlt, T., Bibikova, M., Penzkofer, H., Oesterhelt, D., and Zinth, W. Journal of Physical Chemistry 100(29), 12060–12065 (1996).
[26] Marchi, M., Gehlen, J. N., Chandler, D., and Newton, M. Journal of the American Chemical Society 115(10), 4178–4190 (1993).
[27] Gunner, M. R., Nicholls, A., and Honig, B. Journal of Physical Chemistry 100(10), 4277–4291 (1996).
[28] Parson, W. W., Chu, Z.-T., and Warshel, A. Biochimica et Biophysica Acta 1017(3), 251–72 June (1990).
[29] Warshel, A., Chu, Z.-T., and Parson, W. W. Journal of Photochemistry and Photobiology A–Chemistry 82(1–3), 123–128 (1994).
[30] Alden, R. G., Parson, W. W., Chu, Z. T., and Warshel, A. Journal of the American Chemical Society 117, 12284–12298 (1995).
[31] Kellogg, E. C., Kolaczkowski, S., Wasielewski, M. R., and Tiede, D. M. Photosynthesis Research 22, 47–59 (1989).
[32] Marcus, R. A. and Sutin, N. Biochimica et Biophysica Acta 811, 265–322 (1985).
[33] Moser, C., Keske, J. M., Warncke, K., Farid, R. S., and L., D. P. Nature 355(6363), 796–802 (1992).
[34] Parson, W. W. and Warshel, A. In The Photosynthetic Reaction Center, Volume II, Deisenhofer, J. and Norris, J. R., editors, chapter 3, 23–47. Academic Press, Inc., San Diego (1993).


[35] Yeates, T. O., Komiya, H., Chirino, A., Rees, D. C., Allen, J. P., and Feher, G. Proceedings of the National Academy of Science 85, 7993–7997 (1988).
[36] DiMagno, T. J. and Norris, J. R. In The Photosynthetic Reaction Center Volume II, Deisenhofer, J. and Norris, J. R., editors, 105–127. Academic Press, Inc., San Diego (1993).
[37] Woodbury, N. W., Lin, S., Lin, X., Peloquin, J. M., Taguchi, A. K. W., Williams, J. C., and Allen, J. P. Chemical Physics 197(3), 405–421 (1995).
[38] Vos, M. H., Rappaport, F., Lambry, J.-C., Breton, J., and Martin, J.-L. In The Photosynthetic Bacterial Reaction Center II, Breton, J. and Verméglio, A., editors, 237–243. Plenum Press, New York (1992).
[39] Vos, M. H., Rappaport, F., Lambry, J.-C., Breton, J., and Martin, J.-L. Nature 363, 320–325 (1993).
[40] Woodbury, N. W., Peloquin, J. M., Alden, R. G., Lin, X., Lin, S., Taguchi, A. K., Williams, J. C., and Allen, J. P. Biochemistry 33(26), 8101–8112 (1994).
[41] Ogrodnik, A., Eberl, U., Heckmann, R., Kappl, M., Feick, R., and Michel-Beyerle, M. E. Journal of Physical Chemistry 95, 2036–2041 (1991).
[42] Lockhart, D. J., Hammes, S. L., Franzen, S., and Boxer, S. G. Journal of Physical Chemistry 95, 2217–2226 (1991).
[43] Holzapfel, W., Finkele, U., Kaiser, W., Oesterhelt, D., Scheer, H., Stilz, H., and Zinth, W. Proceedings of the National Academy of Science 87(13), 5168–5172 (1990).
[44] Zinth, W. and Kaiser, W. In Photosynthetic Reaction Center Volume II, Deisenhofer, J. and Norris, J. R., editors, chapter 4, 71–86. Academic Press, San Diego (1993).
[45] Holzwarth, A. R. and Müller, M. G. Biochemistry 35(36), 11820–11831 (1996).


[46] Kirmaier, C., Laporte, L., Schenck, C. C., and Holten, D. Journal of Physical Chemistry 99(21), 8903–8909 (1995).
[47] Kirmaier, C., Laporte, L., Schenck, C. C., and Holten, D. Journal of Physical Chemistry 99(21), 8910–8917 (1995).
[48] Laporte, L., Kirmaier, C., Schenck, C. C., and Holten, D. Chemical Physics 197(3), 225–237 August (1995).
[49] Moser, C. C., Sension, R. J., Szarka, A. Z., Repinec, S. T., Hochstrasser, R. M., and Dutton, P. L. Chemical Physics 197(3), 343–354 August (1995).
[50] Plato, M., Möbius, K., Michel-Beyerle, M. E., Bixon, M., and Jortner, J. Journal of the American Chemical Society 110, 7279–7285 (1988).
[51] Richards, F. M. Methods in Enzymology 115, 440–464 (1985).
[52] McDowell, L. M., Gaul, D., Kirmaier, C., Holten, D., and Schenck, C. C. Biochemistry 30(34), 8315–22 (1991).
[53] Lockhart, D. J., Kirmaier, C., Holten, D., and Boxer, S. G. Journal of Physical Chemistry 94, 6987–6995 (1990).
[54] Steffen, M. A., Lao, K., and Boxer, S. G. Science 264, 810–816 (1994).
[55] Bylina, E. J., Kirmaier, C., McDowell, L., Holten, D., and Youvan, D. C. Nature 336(6195), 182–184 (1988).
[56] Nagarajan, V., Parson, W. W., Davis, D., and Schenck, C. C. Biochemistry 32(46), 12324–36 (1993).
[57] Chan, C.-K., Chen, L. X.-Q., DiMagno, T., Hanson, D., Nance, S., Schiffer, M., Norris, J., and Fleming, G. Chemical Physics Letters 176(3–4), 366–372 January (1991).
[58] Mattioli, T. A., Gray, K. A., Lutz, M., Oesterhelt, D., and Robert, B. Biochemistry 30(6), 1715–1722 (1991).


[59] Finkele, U., Lauterwasser, C., Zinth, W., Gray, K. A., and Oesterhelt, D. Biochemistry 29(37), 8517–21 (1990).
[60] Nagarajan, V., Parson, W. W., Gaul, D., and Schenck, C. Proceedings of the National Academy of Sciences of the United States of America 87(20), 7888–7892 October (1990).
[61] Vandervos, R., Franken, E. M., Sexton, S. J., Shochat, S., Gast, P., Hore, P. J., and Hoff, A. J. Biochimica et Biophysica Acta 1230(1–2), 51–61 (1995).
[62] Shochat, S., Arlt, T., Francke, C., Gast, P., Vannoort, P. I., Otte, S. C. M., Schelvis, H. P. M., Schmidt, S., Vijgenboom, E., Vrieze, J., Zinth, W., and Hoff, A. J. Photosynthesis Research 40(1), 55–66 (1994).
[63] Heller, B. A., Holten, D., and Kirmaier, C. Science 269(5226), 940–945 August (1995).
[64] Taguchi, A. K. W., Stocker, J. W., Alden, R. G., Causgrove, T. P., Peloquin, J. M., Boxer, S. G., and Woodbury, N. W. Biochemistry 31(42), 10345–10355 (1992).
[65] Lin, S., Xiao, W., Eastman, J., Taguchi, A., and Woodbury, N. Biochemistry 35(10), 3187–3196 (1996).
[66] Chapman, S. K. and Mount, A. R. Natural Product Reports 12(2), 93–100 (1995).
[67] Gehlen, J. N. The Effect of High and Low Frequency Polarization Modes on the Kinetics of Electron Transfer. PhD thesis, University of California at Berkeley, (1995).
[68] Ross, S. M. Introduction to probability models. Academic Press, Boston, 5th edition, (1993).
[69] Brooks, III, C. L. Current Opinion in Structural Biology 5, 211–215 (1995).


[70] van Gunsteren, W. F. and Mark, A. E. European Journal of Biochemistry 204, 947–961 (1992).
[71] Allen, M. P. and Tildesley, D. J. Computer Simulation of Liquids. Oxford University Press, New York, (1987).
[72] Brooks, III, C. L., Karplus, M., and Pettitt, B. M. Proteins: A Theoretical perspective of Dynamics, Structure, and Thermodynamics. Wiley-Interscience, New York, (1988).
[73] Gilson, M. K. Current Opinion in Structural Biology 5, 216–223 (1995).
[74] Honig, B. and Nicholls, A. Science 268, 1144–1148 May (1995).
[75] Antosiewicz, J., McCammon, J. A., and Gilson, M. K. Biochemistry 35(24), 7819–7833 (1996).
[76] Harvey, S. C. Proteins: Structure, Function, and Genetics 5, 78–92 (1989).
[77] Warshel, A. and Russell, S. T. Quarterly Reviews of Biophysics 17(3), 283–422 (1984).
[78] Honig, B., Sharp, K., and Yang, A.-S. Journal of Physical Chemistry 97, 1101–1109 (1993).
[79] Sharp, K. A. and Honig, B. Annual Review of Biophysics and Biophysical Chemistry 19, 301–332 (1990).
[80] Warwicker, J. Journal of Molecular Biology 236, 887–903 (1994).
[81] Brooks, B. R., Bruccoleri, R. E., Olafson, B. O., States, D. J., Swaminathan, S., and Karplus, M. Journal of Computational Chemistry 4, 187–217 (1983).
[82] Treutlein, H., Schulten, K., Deisenhofer, J., Michel, H., Brunger, A., and Karplus, M. In The Photosynthetic Bacterial Reaction Center: Structure and Dynamics, Breton, J. and Verméglio, A., editors, 139–150. Plenum Press, London (1988).


[83] Treutlein, H., Schulten, K., Niedermeier, C., Deisenhofer, J., Michel, H., and DeVault, D. In The Photosynthetic Bacterial Reaction Center: Structure and Dynamics, Breton, J. and Verméglio, A., editors, 369–377. Plenum Press, London (1988).
[84] Thompson, M. and Zerner, M. Journal of the American Chemical Society 113(22), 8210–15 Oct (1991).
[85] Gilson, M. K. and Honig, B. H. Proteins: Structure, Function, and Genetics 3, 32–52 (1988).
[86] Gilson, M. K. and Honig, B. H. Nature 330(6143), 84–86 (1987).
[87] Gilson, M. K. and Honig, B. Proteins: Structure, Function, and Genetics 4, 7–18 (1988).
[88] Alden, R. G., Parson, W. W., Chu, Z. T., and Warshel, A. Journal of Physical Chemistry 100(41), 16761–16770 (1996).
[89] Alden, R. G., Parson, W. W., Chu, Z. T., and Warshel, A. In The reaction center of photosynthetic bacteria: structure and dynamics, Michel-Beyerle, M. E., editor, 105–116 (Springer-Verlag, New York, 1996).
[90] Palaniappan, V., Schenck, C. C., and Bocian, D. F. Journal of Physical Chemistry 99(46), 17049–17058 (1995).
[91] Palaniappan, V. and Bocian, D. F. Biochemistry 34(35), 11106–11116 (1995).
[92] Wachtveitl, J., Farchaus, J. W., Das, R., Lutz, M., Robert, B., and Mattioli, T. A. Biochemistry 32(47), 12875–86 (1993).
[93] Murchison, H. A., Alden, R. G., Allen, J. P., Peloquin, J. M., Taguchi, A. K., Woodbury, N. W., and Williams, J. C. Biochemistry 32(13), 3498–3505 Mar (1993).
[94] Williams, J. C., Alden, R. G., Murchison, H. A., Peloquin, J. M., Woodbury, N. W., and Allen, J. P. Biochemistry 31(45), 11029–37 November (1992).


[95] Chirino, A., Lous, E., Huber, M., Allen, J., Schenck, C., Paddock, M., Feher, G., and Rees, D. Biochemistry 33(15), 4584–4593 (1994).
[96] Debus, R. J., Feher, G., and Okamura, M. Y. Biochemistry 25, 2276–2287 (1986).
[97] Gehlen, J., Marchi, M., and Chandler, D. Science 263(5146), 499–502 January (1994).
[98] Lao, K. Q., Franzen, S., Stanley, R. J., Lambright, D. G., and Boxer, S. G. Journal of Physical Chemistry 97(50), 13165–13171 (1993).
[99] Kirmaier, C., Gaul, D., DeBay, R., Holten, D., and Schenck, C. C. Science 251, 922–927 February (1991).
[100] Heller, B. A., Holten, D., and Kirmaier, C. Biochemistry 35, 15418–15427 (1996).
[101] Moore, D. S. and McCabe, G. P. Introduction to the Practice of Statistics. W. H. Freeman and Company, New York, second edition, (1993).
[102] Williams, J. C., Steiner, L. A., and Feher, G. Proteins 1(4), 312–25 December (1986).
[103] Michel, H., Weyer, K. A., Gruenberg, H., Dunger, I., Oesterhelt, D., and Lottspeich, F. The EMBO Journal 5(6), 1149–1158 (1986).
[104] Beroza, P. and Fredkin, D. Journal of Computational Chemistry 17(10), 1229–44 July (1996).
[105] Antosiewicz, J., McCammon, J. A., and Gilson, M. K. Journal of Molecular Biology 238(3), 415–436 (1994).


Appendix A

A Critical Summary of Some Papers by Warshel and Parson

In this appendix, we focus on four papers by Warshel, Parson, et al. concerning the prc: Parson et al. (1990) [28], Warshel et al. (1994) [29], Alden et al. (1995) [30], and Alden et al. (1996) [89]. Their work is based primarily on two simulation techniques described in previous work by Warshel et al. (such as reference [77]): pdld (the Protein Dipole Langevin Dipole technique) and fep/md (free energy perturbation/molecular dynamics). We summarize their methodology and results and highlight some of the underlying assumptions in these papers. Many of these assumptions, we believe, are at the heart of the differences between their calculations for the prc and the work presented in this dissertation.

A.1 Methodology, results, and consistent themes

The pdld methodology is used in varying forms in all the papers except Warshel et al. (1994). In Parson et al. (1990), the free energy of transfer is broken up in the following way, where the terms are defined below:

$$\Delta G = \alpha + \Delta V_{QQ} + \Delta V_{Q\mu} + \Delta V_{\mathrm{ind}} + \Delta V_{\mathrm{H_2O}} + \Delta V_{\mathrm{bulk}} - \lambda. \tag{A.1}$$

In Alden et al. (1995) and Alden et al. (1996), $\Delta G$ includes two additional terms, $\Delta V_{\mathrm{memb}}$ and $\Delta V_{\mathrm{ions}}$, and $\alpha$ is renamed $\Delta E_{\mathrm{gas}}$. Moreover, although the reorganization energy term ($\lambda$) is calculated explicitly, it is no longer included separately because it was thought to be included implicitly in the pdld methodology. To summarize, the new equation for $\Delta G$ is:

$$\Delta G = \Delta E_{\mathrm{gas}} + \Delta V_{QQ} + \Delta V_{Q\mu} + \Delta V_{\mathrm{ind}} + \Delta V_{\mathrm{H_2O}} + \Delta V_{\mathrm{memb}} + \Delta V_{\mathrm{ions}} + \Delta V_{\mathrm{bulk}}. \tag{A.2}$$

In contrast to the approach that we have taken, in which the core system always comprises the six chromophores as well as the four imidazole ligands, Warshel, Parson, et al. use the donor (D) and acceptor (A) chromophores as the core system in calculations of $\Delta G_{DA}$ and calculate perturbations on that system. To calculate $\Delta G_{13}$, for instance, $\alpha$ (or $\Delta E_{\mathrm{gas}}$) is the in vacuo energy of transferring an electron from SP to $H_L$ at infinite separation between the chromophores. $\Delta V_{QQ}$ is the direct Coulombic energy between the chromophores as they are found in the prc; in other words, it is the energy to bring the two chromophores (still in vacuo) together from infinite separation. The remaining terms are the energies of interaction between the chromophores and the protein and surrounding environment. $\Delta V_{Q\mu}$ is the change in energy of direct electrostatic interaction between the chromophores and the charges of the atoms resolved in the crystal structure. $\Delta V_{\mathrm{ind}}$ is the energy of interaction of the chromophores with induced dipoles in the surrounding medium; $\Delta V_{\mathrm{H_2O}}$, the interaction with "mobile water," solvent not resolved in the X-ray structure; $\Delta V_{\mathrm{memb}}$, interactions with the membrane surrounding the prc; $\Delta V_{\mathrm{ions}}$, interactions with electrolytes around the prc or membrane; and $\Delta V_{\mathrm{bulk}}$, the dielectric response of solvent outside the region accounted for in the other terms. The reorganization energy, $\lambda$, is calculated by fep/md.

The authors use electrochemical cycles to determine $\Delta E_{\mathrm{gas}}$ and pdld calculations to calculate the solvation terms. Dielectric response of the protein is represented by point polarizable dipoles on the protein atoms. Dielectric response from the membrane and water is represented by point polarizable dipoles placed on a grid. An origin for the simulation is chosen, and all the atoms in the crystal structure that fall within a certain radius of that origin are represented explicitly. The exact details of the simulation are varied, often to test different hypotheses.


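Table A.1: Free energies of transfer (a). All values are in kcal/mol.

| Transition | $\Delta E_{\mathrm{gas}}$ | $\Delta V_{QQ}$ | $\Delta V_{Q\mu}$ | $\Delta V_{\mathrm{ind}}$ | $\Delta V_{\mathrm{bulk}}$ | $\lambda$ | $\Delta G$ |
|---|---|---|---|---|---|---|---|
| $P \to P^+B_L^-$ (b) | 78.1 | -29.9 | -11.5 | -10.6 | 1.2 | 4 | 23.30 |
| $P^+B_L^- \to P^+H_L^-$ (b) | -6.2 | -0.3 | 14.2 | -8.3 | 1 | 4 | -2.50 |
| $P^+B_L^- \to P^+B_M^-$ (b) | 0.0 | -0.04 | 15.8 | -8.2 | -2.9 | 0 | 4.66 |
| $P^+H_L^- \to P^+H_M^-$ (c) | 0.0 | 0.08 | 2.3 | 1.1 | -0.5 | 0 | 2.98 |
| $P^* \to P^+B_L^-$ (d) | | | | | | | -5.7 |
| $P^* \to P^+H_L^-$ (d) | | | | | | | -7.3 |
| $P^* \to P^+B_M^-$ (d) | | | | | | | -1.04 |
| $P^* \to P^+H_M^-$ (d) | | | | | | | -4.32 |

(a) These calculations are from Parson et al. (1990) [28].
(b) A cut-off radius of 19 Å was used for the central region. Treatment 1 of the imidazole ligands was used for these numbers.
(c) A cut-off radius of 17 Å was used for the central region.
(d) Calculated using energies from transfer from P and $\Delta G_{01} = 29$ kcal/mol.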
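The bookkeeping behind Table A.1 can be checked with a few lines of Python. The sketch below is an illustration only: it sums the tabulated terms in the form of Equation A.1 (the table lists no separate water column, so none is included) and applies footnote (d) to move from transfer out of P to transfer out of P*. The names `ROWS`, `delta_g`, and `residuals` are hypothetical helpers, not anything from Parson et al. The row for $P^+B_L^- \to P^+H_L^-$ is left out because its entries, as they appear in this copy, sum to -3.6 rather than the listed -2.50.

```python
# Terms of Equation A.1 as listed in Table A.1 (kcal/mol).
# Tuple order: (dE_gas, dV_QQ, dV_Qmu, dV_ind, dV_bulk, lambda, listed dG)
ROWS = {
    "P -> P+BL-":     (78.1, -29.9, -11.5, -10.6,  1.2, 4.0, 23.30),
    "P+BL- -> P+BM-": ( 0.0, -0.04,  15.8,  -8.2, -2.9, 0.0,  4.66),
    "P+HL- -> P+HM-": ( 0.0,  0.08,   2.3,   1.1, -0.5, 0.0,  2.98),
}


def delta_g(e_gas, v_qq, v_qmu, v_ind, v_bulk, lam):
    """Sum the tabulated terms; lambda enters with a minus sign as in Equation A.1."""
    return e_gas + v_qq + v_qmu + v_ind + v_bulk - lam


# Difference between the recomputed sum and the listed total for each row.
residuals = {name: delta_g(*row[:6]) - row[6] for name, row in ROWS.items()}

# Footnote (d): free energies from P* follow by subtracting dG_01 = 29 kcal/mol.
DG_01 = 29.0
DG_13_LISTED = -7.3  # P* -> P+HL-, taken directly from the table

dg_12 = ROWS["P -> P+BL-"][6] - DG_01              # P* -> P+BL-
dg_12_m = dg_12 + ROWS["P+BL- -> P+BM-"][6]        # P* -> P+BM-
dg_13_m = DG_13_LISTED + ROWS["P+HL- -> P+HM-"][6]  # P* -> P+HM-

for name, res in residuals.items():
    print(f"{name:16s} residual = {res:+.2f} kcal/mol")
print(f"P* -> P+BL- = {dg_12:.2f}, P* -> P+BM- = {dg_12_m:.2f}, P* -> P+HM- = {dg_13_m:.2f}")
```

For the three rows included, the recomputed sums agree with the listed totals, and the derived P* values match the last block of the table.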


In Parson et al. (1990), the authors present calculations for all the free energies of interest for the primary transfer. They examined alternative treatments of certain parameters (such as the size of the internal region and the treatment of the imidazole ligands). A typical calculation is given in Table A.1. Because their results did not depend very sensitively on the exact model used, the typical calculation presented is representative of all their results. Their basic conclusions about the free energies (derived primarily from averaging calculations from different models) are that $\Delta G_{12} = -4$ kcal/mol and $\Delta G_{13} = -7$ kcal/mol along the l branch, and that the corresponding m-branch values are 2 kcal/mol for $P^* \to P^+B_M^-$ (because $P^+B_M^-$ was calculated to be 6 kcal/mol above $P^+B_L^-$) and $-4$ kcal/mol for $P^* \to P^+H_M^-$ (because $P^+H_M^-$ was found to be 3 kcal/mol above $P^+H_L^-$). Parson et al. estimated the uncertainties in their calculations to be 2.5 kcal/mol. TyrM208 was determined to be important in lowering the energy of $P^+B_L^-$; substituting TyrM208 with a phenylalanine (F) caused the energy of $P^+B_L^-$ to increase by 5 kcal/mol.

The papers that followed focused on calculating $\Delta G_{23}$, although Alden et al. (1995) explored the effects of various parameters on their calculation of $\Delta G_{13}$. The basic argument is that the calculation of $\Delta G_{23}$ is more accurate than the calculation of other free energies. Their calculated values for $\Delta G_{13}$ and $\Delta G_{23}$ remained basically the same as in Parson et al. (1990): $\Delta G_{13}$ is $-6$ to $-7$ kcal/mol and $\Delta G_{23}$ is approximately $-3$ kcal/mol (with an uncertainty of several kcal/mol).

Several constant themes run throughout these papers. The key question that surrounds the calculation of $\Delta G_{32}$ is whether it is relatively small (around 3 to 5 kcal/mol, as Warshel and Parson have maintained) or large (on the order of 20 kcal/mol, as was calculated in mgcn, for example). Warshel and Parson argue that the primary errors in the methodology of mgcn are the neglect of the Born self-energies and insufficient dielectric screening. They argue that a small $\Delta G_{32}$ would consistently result from a computational model with appropriate amounts of dielectric screening. In their view, a model with sufficient screening is one consistent with a physical picture embodied in what we call the "Warshel neutralization ansatz." In Parson et al. (1990), they claim that the screening of the charged residues is so great that it is tantamount to the charges not being there in the first place. Residues are charged only in regions of high screening, which neutralizes the effect of the charge. If the surroundings did not screen strongly, the residue would not be charged in the first place. The calculations of Parson et al. (1990) were performed with all of the ionizable amino acids in their neutral configuration (the non-heme iron was also set to neutral charge). They claimed that since the effective dielectric constant of the protein environment must be on the order of 10, charging amino acids would have very little effect on the various free energies calculated.

In Warshel et al. (1994), the authors set out to demonstrate the neutralization ansatz explicitly in the context of their fep/md calculations. They calculated $\Delta G_{32}$ in four models: models in which the amino acids are in either their ionized or neutralized state and in which the solvent surrounding the protein either contained or did not contain "mobile waters." Warshel et al. claim that these water molecules, which are not resolved in the X-ray structure, cause a great amount of dielectric screening. To account for this screening, in addition to the crystallographic water, water molecules were inserted in any available space inside and around the protein–membrane system. They found that the model with the charged amino acids but without the mobile water yields a $\Delta G_{32}$ on the order of 20 kcal/mol, whereas $\Delta G_{32}$ in the other three models is on the order of 5 kcal/mol. They concluded from this calculation that a large $\Delta G_{32}$ derives from inadequate dielectric screening (the problem they impute to mgcn) and that $\Delta G_{32}$ becomes small and insensitive to the actual charge states of the protein complex once adequate screening is included.

Alden et al. (1995) examined more carefully the sensitivity of the calculations of $\Delta G_{32}$ and $\Delta G_{13}$ to model parameters. They treated all atoms within a radius of 32 Å of the midpoint between SP and $H_L$ explicitly. A sphere of radius 43.5 Å is inscribed about this midpoint.
Space is partitioned into five regions: Region 1 is that occupied by the explicitly treated atoms lying within a membrane region. Region 1′ encompasses the remaining explicitly treated atoms (outside the membrane region). Region 2 is the membrane region not occupied by the atoms within the 43.5 Å sphere. Region 3 is the part of the sphere outside the membrane region besides Region 1′. Region 4 is the space outside the 43.5 Å sphere. (See Figure A.1.) Three thicknesses for the membrane were used: 0 Å (no membrane), 25 Å, and 40 Å.

[Figure A.1: schematic of the dielectric regions, labeled 1, 1′, 2, 3, and 4.]

Figure A.1: The dielectric regions for Alden et al. (1995). A sphere of radius 43.5 Å is inscribed about the midpoint between SP and $H_L$. Space is partitioned into five regions: Region 1 is that occupied by the explicitly treated atoms lying within a membrane region. Region 1′ encompasses the remaining explicitly treated atoms (outside the membrane region). Region 2 is the membrane region not occupied by the atoms within the 43.5 Å sphere. Region 3 is the part of the sphere outside the membrane region besides Region 1′. Region 4 is the space outside the 43.5 Å sphere. Three thicknesses for the membrane were used: 0 Å (no membrane), 25 Å, and 40 Å.

Dielectric response was treated with polarizable point dipoles on the atoms (for the electronic polarizability) and on the grid (for the environmental dielectric response). Each dipole is characterized by a polarizability constant $K$ that corresponds to a bulk dielectric constant $\epsilon$ through the Clausius–Mossotti equation, $K = 3(\epsilon - 1)\nu_i / (4\pi(\epsilon + 2))$, where $\nu_i$ is a volume element. Non-hydrogen atoms were assigned a $K$ corresponding to $\epsilon = 2.2$, while hydrogen atoms were set to $\epsilon = 1$. Alternative treatments for Regions 1, 1′, 2, and 3 were considered. The $K$ for grid points in Regions 1 and 2 were set to the same values, corresponding to either $\epsilon = 2$ or $\epsilon = 4$. Because mobile water is thought to permeate Regions 1′ and 3, the grid dipoles in these regions were assigned $K = 0.256\,\nu_i$, a value determined to best fit the solvation energy of monovalent ions. (Curiously, this value of $K$ corresponds to a non-physical (negative) $\epsilon$ in the Clausius–Mossotti equation: $\epsilon = 80$ corresponds to $K = 0.23\,\nu_i$, and $\lim_{\epsilon \to \infty} K = 0.2387\,\nu_i$. Presumably, the $K$ used for water represents a dielectric response stronger than that of the $\epsilon = 80$ traditionally used and has no direct correspondence to a dielectric constant in continuum dielectric models.) Finally, to explore the presence of various high-dielectric material (including mobile solvent) within Regions 1 and 2, they assigned to grid points in internal cavities large enough to fit water a $K$ corresponding to $\epsilon = 1$, $\epsilon = 4$, or $K = 0.256\,\nu_i$ (representing water). The effect of mobile ions was also included.

Through these variations, Alden et al. basically concluded: "Calculations that do not include the membrane or solvent [i.e., mgcn] are shown to give unstable results that cannot be used to draw conclusions about the energies of the radical–pair states." Specifically, they report that $\Delta G_{23}$ from fep/md calculations varies by about 10 kcal/mol when only the X-ray waters are included in the simulations. According to Alden et al., this deep sensitivity is indicative of problems in the modeling, problems that are solved by an increased amount of solvation.
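The remark about the water polarizability can be reproduced from the Clausius–Mossotti relation quoted above. The following sketch is illustrative only; the function names `k_per_volume` and `eps_from_k` are our own, and $K$ is expressed per unit volume element (that is, as $K/\nu_i$).

```python
import math


def k_per_volume(eps):
    """K / nu_i from the Clausius-Mossotti relation K = 3 (eps - 1) nu_i / (4 pi (eps + 2))."""
    return 3.0 * (eps - 1.0) / (4.0 * math.pi * (eps + 2.0))


def eps_from_k(k_over_nu):
    """Invert the relation: with x = 4 pi K / (3 nu_i), eps = (1 + 2 x) / (1 - x)."""
    x = 4.0 * math.pi * k_over_nu / 3.0
    return (1.0 + 2.0 * x) / (1.0 - x)


# Upper bound on K / nu_i, reached as eps -> infinity.
K_LIMIT = 3.0 / (4.0 * math.pi)

print(f"eps = 2.2  ->  K/nu = {k_per_volume(2.2):.4f}")
print(f"eps = 80   ->  K/nu = {k_per_volume(80.0):.4f}")
print(f"eps -> inf ->  K/nu = {K_LIMIT:.4f}")
print(f"K/nu = 0.256 -> eps = {eps_from_k(0.256):.1f}")
```

Because $K/\nu_i = 0.256$ exceeds the limiting value $3/(4\pi)$, the inversion returns a negative $\epsilon$, which is the non-physical result noted in the text.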
They also reiterated their ongoing argument that the proper accounting of the dielectric response of the protein makes the model relatively insensitive to various details of the modeling. Specifically, they found that their pdld calculations of $\Delta G_{32}$ do not depend sensitively on the width of the membrane, the dielectric properties of Regions 1 and 2, or the dielectric treatment of the cavities. In a subtle contrast to Parson et al. (1990) and Warshel et al. (1994), which purported that the charge states of ionizable residues are basically irrelevant to the calculation of $\Delta G_{23}$, Alden et al. do find that $\Delta G_{23}$ is somewhat sensitive to the charge states of several amino acids very close to SP or $H_L$. However, they nevertheless maintain that $\Delta G_{32}$ is independent of all ionizable groups other than the two or three close residues.

A.2 Interpreting the calculations of Warshel, Parson, et al.

We turn now to a critical examination of the work of Warshel, Parson, et al. in light of the work presented in this dissertation. We first describe the commonalities between our models and those of Warshel, Parson, et al., then highlight the differences, and finally outline the next steps that should be taken to resolve these differences.

Warshel, Parson, et al. have repeatedly pointed out that the modeling of mgcn neglects the Born self-energy (or reaction field) term and that using a uniform dielectric constant of 2 inadequately accounts for the dielectric screening of the medium surrounding the protein, thus overestimating electrostatic contributions far away from the chromophores. We have included the reaction field term in our calculations here. Moreover, the differences between the dielectric response of our 2:2:80 models and our 2:2 models do show that 2:2:80 effectively screens out Coulombic interactions of residues far away from the chromophores. In these two ways, our work here agrees with that of Warshel, Parson, et al.

However, in other fundamental ways, the results and conclusions of our modeling (which does account for environmental dielectric screening and the reaction field terms) disagree with the results and conclusions of Warshel, Parson, et al. First, we have found that our calculated values of all the $\Delta G$, including $\Delta G_{13}$ and $\Delta G_{32}$, are quite sensitive to the charge states of ionizable amino acid residues in all our dielectric models. Even with models that we have good reason to believe have adequate dielectric response, we find that the Warshel neutralization ansatz is not borne out. Second, we find $\Delta G_{32}$ to be consistently much larger than the figure of 3 kcal/mol determined by Warshel, Parson, et al. It is still not possible to definitively uncover the reasons for these two major differences, primarily because the differences in methodology do not allow for a straightforward comparison. Below, we examine a number of possible reasons for the discrepancies, including differences in the estimates of gas-phase energies and the differences due to TyrM208. However, we argue that the primary difference is that the models of Warshel, Parson, et al. incorporate vastly more dielectric screening than any of the models that we have considered in detail. We show that the constant insistence by Warshel, Parson, et al. on the neutralization ansatz requires a very strong dielectric screening, perhaps more than is experimentally justified.

A.2.1 $\Delta E_{\mathrm{gas}}$ and $\Delta E^{(0)}$

Alden et al. (1996) argue that their calculations of $\Delta G_{13}$ are consistent with the Thompson–Zerner (tz) calculations for $\Delta E^{(0)}_{13}$. They asked what $\Delta E^{(0)}_{13}$ would be in their formalism. They start with a $\Delta E_{\mathrm{gas}}$ (between SP and $H_L$) of 83 kcal/mol, the average of 77.6 and 89.3 kcal/mol from two specific calculations. Then, they subtracted $\Delta V_{QQ} = 19$ kcal/mol to bring the two chromophores to their respective positions in the prc; the two chromophores are separated by approximately 18 Å. Finally, they subtracted 8.9 kcal/mol, the Coulombic interaction of the two chromophores with the four imidazole ligands and the other chromophores ($B_L$, $B_M$, and $H_M$). Therefore, their estimate of the free energy of $P \to P^+H_L^-$ in vacuo ($\Delta E^{(0)}_{03}$) is $83 - 19 - 8.9 = 55$ kcal/mol, and their estimate of $\Delta E^{(0)}_{13} = \Delta E^{(0)}_{03} - 29 = 26$ kcal/mol. The tz estimate of $\Delta E^{(0)}_{13}$ is 23.4 kcal/mol. However, Thompson et al. used an estimate of 33 kcal/mol for $\Delta E^{(0)}_{01}$, instead of 29 kcal/mol. Hence, according to Alden et al., the tz calculation of $\Delta E^{(0)}_{13}$ is consistent with the estimates of Warshel, Parson, et al. for the gas-phase energies, and therefore the differences between the calculations of Warshel, Parson, et al. and mgcn must lie elsewhere. To verify this estimate, we calculated the energy of interaction of $P^+H_L^-$ with the four imidazole ligands and the other chromophores in the 2:2 boundary condition to be $-9.2$ kcal/mol. This number is consistent with the $-8.9$ kcal/mol estimated by Alden et al. Therefore, it seems that their conclusion is probably correct.
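The bookkeeping in this argument is short enough to write out. The first three lines below only restate numbers quoted above. The fourth line, which repeats the subtraction with the 33 kcal/mol excitation energy used by Thompson et al., and the fifth, a unit point-charge Coulomb estimate at 18 Å using the conversion factor 332 kcal Å/mol, are added illustrations under those stated assumptions rather than figures taken from either paper.

```latex
\begin{align*}
\Delta E_{\mathrm{gas}} &\approx \tfrac{1}{2}\,(77.6 + 89.3) = 83.45 \approx 83\ \text{kcal/mol} \\
\Delta E^{(0)}_{03} &= 83 - 19 - 8.9 = 55.1 \approx 55\ \text{kcal/mol} \\
\Delta E^{(0)}_{13} &= 55 - 29 = 26\ \text{kcal/mol}
    \qquad (\Delta E^{(0)}_{01} = 29\ \text{kcal/mol}) \\
\Delta E^{(0)}_{13} &= 55 - 33 = 22\ \text{kcal/mol}
    \qquad (\Delta E^{(0)}_{01} = 33\ \text{kcal/mol, as in tz}) \\
|\Delta V_{QQ}| &\approx \frac{332\ \text{kcal\,\AA/mol}}{18\ \text{\AA}} \approx 18\ \text{kcal/mol}
\end{align*}
```

With the larger excitation energy the estimate falls to about 22 kcal/mol, within about 1.5 kcal/mol of the tz value of 23.4 kcal/mol, and the point-charge estimate is close to the 19 kcal/mol used for $\Delta V_{QQ}$.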

A.2.2 TyrM208 and the size of $\Delta G_{32}$

Section 2.3.5 offers a detailed discussion of $\Delta G_{12}$ in our models. The basic conclusion is that $\Delta G_{12}$ can vary by approximately 6 kcal/mol depending on the orientation of the hydroxyl group of TyrM208. In all of our models, this hydroxyl group is oriented so as to make the most positive contribution to $\Delta G_{12}$. This result contrasts strongly with the work of Warshel, Parson, et al., which consistently shows TyrM208 lowering $\Delta G_{12}$. The orientation of the hydroxyl group in our models is primarily due to electrostatic interactions with $B_L$ and $P_L$, which orient the hydrogen away from $B_L$. Parson et al. (1990) wrote that minimizing the electrostatic interaction between the hydroxyl group of TyrM208 and the surrounding residues would cause the hydrogen to point towards $B_L$. Figure 2.27 does show that our calculations are consistent with that conclusion. However, we find that once the electrostatic interactions with the chromophores are included (as they are in the md simulations), the opposite orientation becomes more favorable (see Figure 2.25). Specifically, the dominant contributors to this shift are $B_L$ and $P_L$. Surprisingly, Alden et al. [88] also conclude that $B_L$ is key in determining the energy of the dipole, but in the opposite direction.

A.2.3 Modeling of the dielectric environment

In this section, we argue that the key issue of dispute between the modeling of Warshel, Parson, et al. and our work (as well as that of Gunner et al. [27]) is the strength of the dielectric screening in the models. It is difficult to compare the results of Warshel, Parson, et al. to ours. Because we have partitioned $\Delta G$ differently, we can compare the total $\Delta G$ but not its constituents. Nevertheless, a number of results strongly imply that the models considered by Warshel, Parson, et al. to have proper dielectric screening consistently have greater screening than the three dielectric models that we have considered in detail.

First, Alden et al. (1995) report effective dielectric constants for various contributors to $\Delta G_{32}$. Specifically, for each residue, they calculated the ratio of the change in the residue's electrostatic contribution in vacuo through ionization ($\Delta\Delta V_{Q\mu}$) to the change in the electrostatic contribution in the model ($\Delta\Delta G$):

$$\epsilon_{\mathrm{eff}} = \frac{\Delta\Delta V_{Q\mu}}{\Delta\Delta G} \tag{A.3}$$

Although there is no direct equivalent in our results, we calculated a quantity for each residue that should be roughly comparable. Let $V_{S1:1}$ denote its contribution to $\Delta E^{(CO)}_{32}$ in vacuo for the standard charge model; $V_{S2:2:80}$, the contribution to $\Delta E^{(CO)}_{32}$ in the S2:2:80 model; and $V_{N2:2:80}$, the contribution to $\Delta E^{(CO)}_{32}$ in the N2:2:80 model. We then defined an effective dielectric constant as

$$\epsilon_{\mathrm{eff}} = \frac{V_{S1:1}}{V_{S2:2:80} - V_{N2:2:80}} \tag{A.4}$$

(More properly, the numerator should be $V_{S1:1} - V_{N1:1}$, where $V_{N1:1}$ would be defined as the contribution of the residue to $\Delta E^{(CO)}_{32}$ in the neutral charge model in vacuo. However, it is fairly accurate to approximate $V_{N1:1}$ as zero for this rough calculation.) These calculations are recorded in Table A.2.

Table A.2: Comparing effective dielectric constants for Alden et al. (1995) [30] to those for the 2:2:80 models, from $\Delta G_{32}$. Energies are in kcal/mol.

| Residue | Alden $\Delta\Delta V_{Q\mu}$ | Alden $\Delta\Delta G$ | Alden $\epsilon_{\mathrm{eff}}$ (a) | $V_{S1:1}$ | $V_{S2:2:80}$ | $V_{N2:2:80}$ | $V_{S2:2:80} - V_{N2:2:80}$ | 2:2:80 $\epsilon_{\mathrm{eff}}$ (b) |
|---|---|---|---|---|---|---|---|---|
| Asp L155 | 11.85 | 3.67 | 3.2 | 12.56 | 4.98 | -0.49 | 5.47 | 2.3 |
| Arg L103 | 8.57 | 1.25 | 6.9 | 9.1 | 4.07 | 0.54 | 3.53 | 2.4 |
| Asp L60 | 5.59 | 0.55 | 10 | 7.2 | 2.60 | 0.34 | 2.26 | 3.2 |
| Glu M232 | -5.14 | -0.22 | 23 | -5.38 | -2.15 | 0.19 | -2.34 | 2.3 |
| Asp H36 | -5.01 | 0.42 | 12 | -4.86 | -1.97 | -0.19 | -1.78 | 2.7 |
| Asp H33 | 4.66 | 0.04 | > 40 | 4.7 | 1.97 | 0.01 | 1.96 | 2.4 |
| Arg L135 | -5.29 | -1.71 | 3.1 | -5.26 | -1.90 | -0.20 | -1.70 | 3.1 |
| Glu L106 | -4.79 | -0.06 | > 40 | -5.16 | -1.77 | 0.36 | -2.13 | 2.4 |

(a) Equation A.3
(b) Equation A.4

As a starting point, it is reassuring that $\Delta\Delta V_{Q\mu}$ more or less matches $V_{S1:1}$. Hence, differences due to the atomic positions or the actual charge distributions for the proteins matter little. When we look at the effective dielectric constants, we notice that those from Alden et al. vary from 3.1 to > 40, whereas those for the 2:2:80 model vary from 2.3 to 3.2. Other than $\epsilon_{\mathrm{eff}}$ for Arg L135, which happens to be 3.1 for both Alden et al. and the 2:2:80 models, $\epsilon_{\mathrm{eff}}$ is larger in the pdld calculations. Moreover, for only three residues is $\Delta\Delta G$ larger in magnitude than 1 kcal/mol. In contrast, the effect of charging each of the listed residues in the 2:2:80 models is greater than 1 kcal/mol in magnitude.

A caution is in order: this comparison is suggestive but not rigorous. The physical process modeled by Alden et al. is the charging of a single residue with all the other residues neutralized. Our calculation is for charging all the residues at the same time. We would expect that charging only a single residue would give rise to more screening, since the protein complex would be reorienting to screen out only a single charged residue and not a number of other residues. Hence, the contrast suggested by this comparison is probably not as great as it appears. However, Alden et al. (1995) do report $\Delta G_{23}$ for the cases of either neutralizing or charging all the residues: in the fully charged case, $\Delta G_{23}$ is about $-5$ kcal/mol, whereas in the fully neutral case, $\Delta G_{23}$ is about $-1$ kcal/mol. The difference between these two cases amounts to only 4 kcal/mol. For the 2:2:80 boundary condition, $\Delta G_{23}$ for the S and N models are $-29.2$ and $-16.2$ kcal/mol, respectively. Charging up all the amino acids causes a 13 kcal/mol shift in $\Delta G_{23}$, indicating that there is much less screening in the 2:2:80 models than in the pdld calculations. Hence, the calculated effective dielectric constants are suggestive of the differences in screening involved.

Does the fact that all the $\epsilon_{\mathrm{eff}}$ reported in Table A.2 for the 2:2:80 models are between 2 and 4 indicate that the entire prc behaves like a relatively uniform low-dielectric medium? To answer this question, we calculate for all the ionizable groups $\epsilon'_{\mathrm{eff}}$, an approximation to $\epsilon_{\mathrm{eff}}$ that assumes $V_{N2:2:80} \ll V_{S2:2:80}$:

$$\epsilon'_{\mathrm{eff}} = \frac{V_{S1:1}}{V_{S2:2:80}}. \tag{A.5}$$
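As a rough numerical companion to Equations A.3 through A.5, the sketch below recomputes the three effective dielectric constants from the entries of Table A.2. It is an illustration only: the names `TABLE_A2`, `eff_pdld`, `eff_pb`, `eff_prime`, and `results` are hypothetical, and the code reports magnitudes, which is an assumption of this sketch made because Table A.2 lists only positive values.

```python
# (residue, Alden ddV_Qmu, Alden ddG, V_S1:1, V_S2:2:80, V_N2:2:80), all in kcal/mol,
# transcribed from Table A.2.
TABLE_A2 = [
    ("Asp L155", 11.85,  3.67, 12.56,  4.98, -0.49),
    ("Arg L103",  8.57,  1.25,  9.10,  4.07,  0.54),
    ("Asp L60",   5.59,  0.55,  7.20,  2.60,  0.34),
    ("Glu M232", -5.14, -0.22, -5.38, -2.15,  0.19),
    ("Asp H36",  -5.01,  0.42, -4.86, -1.97, -0.19),
    ("Asp H33",   4.66,  0.04,  4.70,  1.97,  0.01),
    ("Arg L135", -5.29, -1.71, -5.26, -1.90, -0.20),
    ("Glu L106", -4.79, -0.06, -5.16, -1.77,  0.36),
]


def eff_pdld(ddv_qmu, ddg):
    """Equation A.3 (magnitude): vacuum change over in-model change."""
    return abs(ddv_qmu / ddg)


def eff_pb(v_s11, v_s, v_n):
    """Equation A.4 (magnitude): vacuum contribution over charged-minus-neutral contribution."""
    return abs(v_s11 / (v_s - v_n))


def eff_prime(v_s11, v_s):
    """Equation A.5 (magnitude): as A.4, but neglecting the neutral-model contribution."""
    return abs(v_s11 / v_s)


results = {
    name: (eff_pdld(ddv, ddg), eff_pb(v11, vs, vn), eff_prime(v11, vs))
    for name, ddv, ddg, v11, vs, vn in TABLE_A2
}

# Residues whose charging changes the energy by more than 1 kcal/mol in each approach.
n_pdld_above_1 = sum(1 for row in TABLE_A2 if abs(row[2]) > 1.0)
n_pb_above_1 = sum(1 for row in TABLE_A2 if abs(row[4] - row[5]) > 1.0)

for name, (a3, a4, a5) in results.items():
    print(f"{name:9s}  A.3: {a3:6.1f}   A.4: {a4:4.1f}   A.5: {a5:4.1f}")
print(n_pdld_above_1, "of 8 residues exceed 1 kcal/mol in pdld;", n_pb_above_1, "of 8 in 2:2:80")
```

For Arg L103 the inputs as listed give about 2.6 under Equation A.4, against the tabulated 2.4; the other seven rows reproduce the last column of Table A.2 to one decimal place.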

The values of $\epsilon'_{\mathrm{eff}}$ from Equation A.5 (plotted in Figure A.2) demonstrate that tens of residues have an effective dielectric constant under 5, but many residues are highly screened, some by up to a factor of 40. In other words, the two dielectric constants used to represent the regions of the 2:2:80 boundary condition give rise to a wide range of effective screening, some as dramatic as that found in pdld. The important contrast is that the residues that remain relatively unscreened (and therefore contribute significantly to $\Delta G_{23}$) number in the tens in the 2:2:80 model, whereas in pdld they number fewer than five. These calculations clearly show that the pdld methodology has stronger screening than the 2:2:80 model.

Is there any corresponding model within the pb formalism that is comparable to the pdld methodology? To answer this question, we calculated $\epsilon'_{\mathrm{eff}}$ for the 2:80 model and for two dielectric models that we did not explore in great depth in the rest of the dissertation: 1) 4:4:80, which is the 2:2:80 dielectric model with a dielectric constant of 4 instead of 2 in the membrane and protein region, and 2) a "high dielectric boundary" (hdb) dielectric boundary condition, which is akin to the 2:2:80 model, except that the membrane is only 20 Å thick, situated in $-20 < z < 0$ Å. The hdb model crudely mimics the effect of bringing more mobile waters close to the central chromophores. The results for specific amino acids are tabulated in Table A.3.

Several trends are clear. First, the $\epsilon'_{\mathrm{eff}}$ for none of the models directly parallels the $\epsilon_{\mathrm{eff}}$ seen by Alden et al. (for the pdld method). Second, $\epsilon'_{\mathrm{eff}}$ for the 4:4:80 model is very close to twice that for the 2:2:80 model, at least for the tabulated residues. Third, the 2:80 model has some vague similarities to pdld in that its $\epsilon'_{\mathrm{eff}}$ ranges from low screening to high screening (3.8 to 20). Finally, the uniformly high $\epsilon'_{\mathrm{eff}}$ for hdb (20–32) indicates that the hdb model has (in general) stronger dielectric response than the pdld model. Hence, we can say that the strength of the dielectric response of the pdld model is generally greater than that of the 2:2:80 and 4:4:80 models, somewhat greater than that of 2:80, but less than that of hdb.
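The trends just described can be read off the tabulated values directly. The sketch below is illustrative only; it transcribes the entries of Table A.3 into a hypothetical dictionary `TABLE_A3` and computes the 4:4:80 to 2:2:80 ratio and the ranges for the 2:80 and hdb models. The pdld entries listed as "> 40" are stored as `None`.

```python
# residue: (2:2:80, 2:80, 4:4:80, hdb, pdld); pdld is None where the table lists "> 40".
TABLE_A3 = {
    "Asp L155": (2.5,  3.9, 4.9, 32.0, 3.2),
    "Arg L103": (2.2,  4.8, 4.4, 27.0, 6.9),
    "Asp L60":  (2.8, 20.0, 5.4, 27.0, 10.0),
    "Glu M232": (2.5,  4.4, 4.9, 24.0, 23.0),
    "Arg H36":  (2.5,  5.3, 4.8, 21.0, 12.0),
    "Arg H33":  (2.4,  9.2, 4.7, 20.0, None),
    "Arg L135": (2.8,  3.8, 5.4, 25.0, 3.1),
    "Glu L106": (2.9,  9.2, 5.6, 25.0, None),
}

# Ratio of the 4:4:80 value to the 2:2:80 value for each residue.
ratios_4_to_2 = {name: row[2] / row[0] for name, row in TABLE_A3.items()}

# Range of the effective dielectric constant across residues for two of the models.
range_2_80 = (min(r[1] for r in TABLE_A3.values()), max(r[1] for r in TABLE_A3.values()))
range_hdb = (min(r[3] for r in TABLE_A3.values()), max(r[3] for r in TABLE_A3.values()))

# Residues for which hdb screens more strongly than the finite pdld value.
hdb_exceeds_pdld = [
    name for name, row in TABLE_A3.items() if row[4] is not None and row[3] > row[4]
]

for name, ratio in ratios_4_to_2.items():
    print(f"{name:9s}  4:4:80 / 2:2:80 = {ratio:.2f}")
print("2:80 range:", range_2_80, " hdb range:", range_hdb)
print("hdb > pdld for:", ", ".join(hdb_exceeds_pdld))
```

The ratios all fall between about 1.9 and 2.0, which is the sense in which doubling the interior dielectric constant roughly doubles the effective screening for these residues.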
(Of course, part of the difficulty in comparing various models is that effective dielectric constants vary spatially in different ways for each model; hence, one often cannot say that one model has a uniformly stronger dielectric response than another.) Because the hdb model is the only one of our dielectric models that is more strongly screening (at least for central residues) than the pdld model, it is instructive to study it in a bit more depth. Figure A.3 displays the contributions of individual residues to ∆E32^(CO). (See Section 2.3.3 for a description of this type of plot.) Figure A.4 shows ε_eff for all the charged groups for the hdb dielectric model. Because ε_eff

Figure A.2: ε_eff for the 2:2:80 boundary condition. ε_eff (Equation A.5) for all the charged residues is plotted vs. the z coordinate (Section 2.2.2) of the residue. [Plot: “Effective dielectric constant for charged groups, standard charge model (2:2:80) (from ∆G32)”; vertical axis: effective dielectric constant, 0 to 50; horizontal axis: z (Å), −100 to 100.]

Table A.3: ε_eff (a) for various dielectric models

Residue     2:2:80   2:80   4:4:80 (b)   hdb (b)   ε_eff (pdld) (c)
Asp L155    2.5      3.9    4.9          32        3.2
Arg L103    2.2      4.8    4.4          27        6.9
Asp L60     2.8      20     5.4          27        10
Glu M232    2.5      4.4    4.9          24        23
Arg H36     2.5      5.3    4.8          21        12
Arg H33     2.4      9.2    4.7          20        > 40
Arg L135    2.8      3.8    5.4          25        3.1
Glu L106    2.9      9.2    5.6          25        > 40

(a) Equation A.5
(b) The 4:4:80 model is the 2:2:80 dielectric model with a dielectric constant of 4 instead of 2 in the membrane and protein region. The “high dielectric boundary” (hdb) condition is akin to the 2:2:80 model, except that the membrane is only 20 Å thick, situated in −20 < z < 0 Å.
(c) Equation A.3

is at least 8 (and more typically from 10 to 35) for all the charged residues, ∆E32^(CO) becomes quite insensitive to the charge state of an ionizable group. This insensitivity is in direct contrast to the sensitivity of calculated free energies to charge states in the 2:2, 2:2:80, and 2:80 dielectric models studied in detail in this dissertation, but it parallels the insensitivity displayed by pdld.

Throughout their papers, Warshel, Parson, et al. have continually stressed two principles: 1) that insensitivity of results to most particular details in the models suggests (but does not necessarily imply) the validity of these results (the “insensitivity principle”) and 2) that proper dielectric modeling should render the results mostly insensitive to the charge states of ionizable residues (the “neutralization ansatz”) [77, 28, 29]. They have shown that their results largely follow these two principles. Determining the sensitivity of models to key parameters is certainly an important exercise in establishing the reliability of the models. However, a sensitive dependence on model parameters may sometimes reflect an accurate treatment of the physics.

The 2:2:80 dielectric model used in conjunction with md should account for the nuclear polarization of the atoms, the electronic polarizability of the prc, the dielectric response of the membrane, and finally, the dielectric response of the aqueous environment surrounding the protein/membrane complex. However, in contrast to pdld, results from the 2:2:80 model are quite dependent on the charge states of the potentially ionizable amino groups. The reason for this discrepancy is basically that the dielectric screening in the pdld methodology is so strong as to almost completely screen the charged groups (akin to what happens in the hdb model). In Alden (1995), Warshel, Parson, et al. show that their calculation of ∆G32 is largely insensitive to changing the dielectric treatment of various regions.
Curiously, one parameter they did not change was the modeling of the dielectric constant for grid points in Regions 1’ and 3, which they fixed at the K value corresponding to water. Changing the dielectric constant for these regions would be a more interesting test of the sensitivities in the pdld methodology. With consistently high dielectric screening in regions 1 and 3, it is not surprising that the model results become basically insensitive to the other model parameters.
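The link between strong screening and insensitivity to charge states can be made concrete with a small numerical sketch. It assumes that ε_eff acts as a simple divisor on a residue's unscreened Coulomb contribution (our reading of how an effective dielectric constant is used here; Equation A.5 itself is not reproduced in this passage). The unscreened value of 10 kcal/mol is purely hypothetical and chosen only for illustration, while the ε_eff values are those tabulated for Asp L155 in Table A.3.

```python
def screened_contribution(unscreened_kcal, eps_eff):
    """Assumed relation: screened contribution = unscreened / eps_eff."""
    if eps_eff <= 0:
        raise ValueError("eps_eff must be positive")
    return unscreened_kcal / eps_eff


# Hypothetical unscreened (vacuum Coulomb) contribution, for illustration only.
UNSCREENED = 10.0  # kcal/mol

# eps_eff for Asp L155 in each pb-based model, taken from Table A.3.
EPS_EFF_ASP_L155 = {"2:2:80": 2.5, "2:80": 3.9, "4:4:80": 4.9, "hdb": 32}

# Under the assumption above, neutralizing the residue removes its screened
# contribution, so this is the size of the resulting shift in each model.
shifts = {
    model: screened_contribution(UNSCREENED, eps)
    for model, eps in EPS_EFF_ASP_L155.items()
}
for model, shift in shifts.items():
    print(f"{model:7s} shift on neutralizing the residue: {shift:.2f} kcal/mol")
```

Under these assumptions the same residue shifts the result by 4 kcal/mol in the 2:2:80 model but by only about 0.3 kcal/mol in the hdb model, which is the sense in which a strongly screening model renders the charge state nearly irrelevant.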

Figure A.3: Individual contributors to ∆E32^(CO) in the S (hdb) model. For each residue in the S hdb model this figure displays the residue’s contribution to ∆E32^(CO) vs. its z-position. The inset graphs for each of the panels display the cumulative ∆E32^(CO) (defined in Section 2.3.4) plotted vs. z. (The coordinate system is defined in Section 2.2.2.) [Plot: “Individual Contributors to ∆G32 for the standard charge model, high screening dielectric boundary condition”; vertical axis: electrostatic contribution (kcal/mol), −2 to 4; horizontal axis: z-coordinate (Å), −60 to 60.]

Figure A.4: ε_eff for the hdb boundary condition. ε_eff (Equation A.5) for all the charged residues is plotted vs. the z coordinate (Section 2.2.2) of the residue. [Plot: “Effective dielectric constant for charged groups, standard charge model (high screening) (from ∆G32)”; vertical axis: effective dielectric constant, 0 to 50; horizontal axis: z (Å), −100 to 100.]

We think that this situation is roughly akin to the hdb condition, in which the charge state of the residues becomes virtually irrelevant because of the high screening.

Now that we have established that Warshel, Parson, et al. have included much greater dielectric screening in the pdld methodology than that found in the three main models of the dissertation (2:2, 2:2:80, and 2:80), the question remains as to how much dielectric response is appropriate. The 2:2:80 model aims to account for the nuclear response of the prc, the net dielectric response of the aqueous environment surrounding the prc–membrane complex, the electronic response of the prc, and, to some extent, the membrane dielectric response. This, in theory, encompasses all the dielectric response of the system, unless there is the mobile solvent or water that Warshel, Parson, et al. invoke in Warshel (1994). The experimental evidence for such mobile solvent is ambiguous at this point. Warshel, Parson, et al. have pointed out that recent calculations of pKa shifts using the pb formalism have worked best with dielectric constants on the order of 20 and argue that a higher dielectric constant should therefore be used. However, Antosiewicz et al. [75] counter that the use of ε = 20 for pKa shift calculations does not imply that such a high ε should be used in all calculations; there are many other reasons why ε = 20 works so well in this one context. As Gunner et al. [27] and Warshel (1994) have noted, the key way to resolve this issue of dielectric modeling will be site-directed mutagenesis (combined with sophisticated computer mutagenesis modeling).
