Simulation Modalities for Molecular Dynamics in Immunology
Ulrich Omasits∗, Martin Neumann†, Othmar Steinhauser‡, Hannes Stockinger§, Rene Kobler¶, Rudolf Karch∗, and Wolfgang Schreiner∗
Abstract. The human immune system detects invasion from outside in a particular way: small fragments (peptides) of invading bacteria, viruses, or toxins are presented by specialized molecules (MHC) for recognition by the immune system. Crucial for this detection is proper binding between the peptide (epitope) and the MHC molecule, i.e. formation of a pMHC complex (Figure 1a). These complexes are relevant not only for pathogen detection but also for the development of vaccines. Vaccine research tries to find epitopes which do not cause the illness but still bind to MHC and stimulate an immune reaction (immunization). In the present project we aim to support pharmaceutical vaccine development by computer simulation: prospective epitope candidates are computationally screened for their binding properties in molecular dynamics studies that use the computing power and grid storage of the Austrian Grid [1] infrastructure. Future cooperation with the pharmaceutical industry and clinical studies can be envisaged.
∗ Core Unit for Medical Statistics and Informatics, Medical University of Vienna, Vienna, Austria
† Institute of Experimental Physics, University of Vienna, Vienna, Austria
‡ Institute of Biomolecular Structural Chemistry, University of Vienna, Vienna, Austria
§ Department of Molecular Immunology, Centre of Biomolecular Medicine and Pharmacology, Medical University of Vienna, Vienna, Austria
¶ GUP, Johannes Kepler University Linz, Linz, Austria, email: [email protected]
1. Introduction
1.1. Computational methods for immunology
Using molecular dynamics (MD), we intend to simulate the immunologically relevant processes at an atomic level of detail. MD is the technique of numerically solving Newton's equations of motion
$$ m_i \frac{d^2 \vec{r}_i}{dt^2} = \vec{F}_i, \qquad i = 1, \ldots, j $$
for an assembly of j particles (atoms) at successive, closely spaced, discrete time steps ∆t of the order of a femtosecond ($10^{-15}$ s), where the total force on each atom depends on the positions and properties of all other particles in the system. Owing notably to hardware improvements over the last decades, molecular dynamics applied to biological macromolecules has become a rapidly developing field of science. Several MD packages are available, both commercial ones and those in the public domain.
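To make the idea of stepping Newton's equations forward in discrete time steps concrete, the following is a minimal sketch for a single particle in one dimension. It uses the velocity Verlet scheme, which is chosen here only for illustration and is not necessarily the integrator used by the MD packages discussed in this paper. The harmonic test force, the unit mass and the function names are all assumptions made for the example.

```python
import math


def velocity_verlet(x, v, force, mass, dt, n_steps):
    """Integrate m * d^2x/dt^2 = F(x) for one particle in one dimension.

    Returns the list of (position, velocity) pairs, including the start.
    """
    a = force(x) / mass
    trajectory = [(x, v)]
    for _ in range(n_steps):
        # Advance the position using the current velocity and acceleration.
        x = x + v * dt + 0.5 * a * dt * dt
        # Re-evaluate the force at the new position.
        a_new = force(x) / mass
        # Advance the velocity with the mean of old and new acceleration.
        v = v + 0.5 * (a + a_new) * dt
        a = a_new
        trajectory.append((x, v))
    return trajectory


def harmonic_force(x, k=1.0):
    """Illustrative test force F = -k x (an assumption, not a real force field)."""
    return -k * x


def total_energy(x, v, mass=1.0, k=1.0):
    """Kinetic plus potential energy of the harmonic test system."""
    return 0.5 * mass * v * v + 0.5 * k * x * x


if __name__ == "__main__":
    traj = velocity_verlet(1.0, 0.0, harmonic_force, mass=1.0, dt=0.01, n_steps=1000)
    x_end, v_end = traj[-1]
    print("final position:", x_end, "analytic:", math.cos(10.0))
    print("energy drift:", total_energy(x_end, v_end) - total_energy(*traj[0]))
```

In a real MD code the same update is applied to every atom in three dimensions, and the force evaluation, which depends on all other particles, dominates the cost of each step.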
Figure 1. (a) Cartoon representation of a pMHC complex (m9 peptide complexed to HLA B*2705 [2]). The peptide binding domain (α1- and α2-domain, white), the α3-domain (grey) and the associated β-2-microglobulin (black) are represented as "cartoons". The bound peptide is shown as "licorice". (b) A molecular dynamics simulation box. A few water molecules are carved out at the front of the cube in order to make the pMHC complex visible.
1.2. Embedding the system
The behavior of complex molecules depends largely on their environment. To obtain a manageable representation of the natural pMHC complex, a tradeoff between computational cost and simulation accuracy has to be made, especially concerning the amount of solvation water included in the simulation and the boundary conditions. Including a sufficient number of additional explicit solvent molecules slows down the simulation but provides a more realistic representation of the environment. To avoid an abrupt border with a vacuum, periodic boundary conditions (PBC) repeat the simulation box periodically along all three spatial axes, creating a continuous system. However, imposing periodicity leads to artifacts, since all atoms now interact with their periodic images. These artifacts can be minimized by using larger solvation shells and an appropriate water model (Figure 1b).
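The effect of periodic boundary conditions on distances can be illustrated with the minimum image convention, in which each atom interacts with the nearest periodic copy of every other atom. The sketch below assumes a cubic box of edge length `box_length`; the function names are hypothetical and are not taken from any MD package.

```python
import math


def minimum_image(delta, box_length):
    """Map a displacement vector onto its nearest periodic image (cubic box)."""
    return [d - box_length * round(d / box_length) for d in delta]


def wrap_into_box(position, box_length):
    """Fold a position back into the primary box [0, box_length)."""
    return [p - box_length * math.floor(p / box_length) for p in position]


def periodic_distance(r_a, r_b, box_length):
    """Distance between two points under the minimum image convention."""
    delta = minimum_image([b - a for a, b in zip(r_a, r_b)], box_length)
    return math.sqrt(sum(d * d for d in delta))


if __name__ == "__main__":
    # Two atoms near opposite faces of a box of edge length 10 (arbitrary units).
    print(periodic_distance([0.5, 0.0, 0.0], [9.5, 0.0, 0.0], 10.0))
    print(wrap_into_box([-0.5, 10.5, 3.0], 10.0))
```

Two atoms near opposite faces of the box are therefore close neighbours through the boundary. This is also why the solvation shell must be thick enough to keep the solute from interacting with its own periodic images.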
2. Methods
2.1. Hardware - Austrian Grid
Since the project was carried out within the framework of the Austrian Grid, making use of the computational power available in this environment was crucial. To find the most suitable software tool for the Austrian Grid infrastructure, we benchmarked [ref Kobler] three of the many available MD packages, NAMD2 [4], CHARMM [6], and GROMACS [5], on a 4x16-processor SGI Altix 350 system and on the workstation cluster "Hydra" (8 x 2 Athlon MP processors). GROMACS appeared to be the most suitable tool because of its easy installation, good documentation, and a special, very fast procedure for treating water molecules. Its high scalar performance outweighed the fact that (on our system) GROMACS scales efficiently only up to 16 nodes.
Parallelization of MD, which is essentially an O(N^2) problem, is done by particle decomposition, where each of the P processors is allocated N/P particles. All production runs were performed on the SGI Altix 350 system. A typical run of 2 ns of a simplified pMHC system surrounded by a 20 Å water shell (157,890 atoms) required about 800 CPU hours altogether.
2.2. Structure preparation
An MD simulation requires an atomistic structure as input, which can be taken from crystal structures deposited in the Protein Data Bank [8]. To save computational time, only the essential parts of the proteins are considered in the simulation, i.e. the antigen-binding α1 and α2 domains of the MHC molecule. Protons have to be added, and termini and side chains are assigned their typical charge states (pH 7). The complexes are centered in cubic boxes of different sizes, which are initially filled with an equilibrium configuration of bulk water. The water models used are SPC and TIP4P, both recommended for use in biomolecular systems [3]. To neutralize the system, an appropriate number of Na+ counterions is added.
2.3. Simulation methods and parametrization
GROMACS supports different force fields. The GROMACS force field is recommended for use with SPC-solvated systems, while the OPLS AA/L force field is recommended for use with TIP4P-solvated systems. After the initial coordinate setup described above, the system's potential energy is first minimized by a steepest descent procedure in order to allow the water molecules to adjust to the presence of the pMHC complex. The coordinates thus obtained serve as the starting structure for the subsequent MD simulations. Periodic boundaries are employed, and the system is heated linearly from 0 K to 300 K during the first 40 ps and held at 300 K for the final 10 ps of this phase. Subsequent production runs use pressure coupling to 1 bar, the Particle Mesh Ewald (PME) [7] method for long-range electrostatic interactions, and an integration step of 2 fs.
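The heating schedule described above (a linear ramp from 0 K to 300 K over 40 ps, then a hold at 300 K) can be written as a simple function of simulation time. The following is only an illustrative sketch of that schedule. The function name and its parameters are hypothetical, and it does not reproduce how GROMACS itself is configured to apply the ramp.

```python
def target_temperature(t_ps, t_start=0.0, t_end=300.0, ramp_ps=40.0):
    """Target temperature in kelvin at time t_ps (picoseconds).

    Linear ramp from t_start to t_end over ramp_ps, constant afterwards.
    """
    if t_ps < 0.0:
        raise ValueError("time must be non-negative")
    if t_ps >= ramp_ps:
        return t_end
    return t_start + (t_end - t_start) * t_ps / ramp_ps


if __name__ == "__main__":
    # Sample the schedule over the 50 ps heating phase (40 ps ramp + 10 ps hold).
    for t in (0.0, 10.0, 20.0, 40.0, 50.0):
        print(f"{t:5.1f} ps -> {target_temperature(t):6.1f} K")
```

Heating gradually rather than starting directly at 300 K gives the minimized structure time to relax as kinetic energy is added.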
Total simulation times ranged from 1 ns to 16 ns. For subsequent analysis, configurations are saved in trajectory files every 2 ps for long runs (16 ns total) and every 0.5 ps for short runs (1 ns total).
2.4. Analysis
The trajectories of all simulation runs are evaluated using the extensive analysis modules provided by the GROMACS package.
3. Results
We performed simulations of two different epitopes bound to the same MHC molecule. One is a natural epitope which binds tightly to the MHC; the other is an artificially modeled epitope which binds rather weakly to the MHC and actually detaches during simulations. We use this system to study the methods and the impact of different simplifications and limitations. The main goal is to simulate the system as realistically as possible in order to provide reliable computer simulations for vaccine design.
3.1. Impact of solvation shell thickness
Simulations of the artificial epitope indicate that larger solvation shells make it easier for the epitope to detach from the MHC. Within a large solvation shell (20 Å or even 40 Å), detachment occurs during a 2 ns simulation. The epitope also detached in a 10 Å water environment, but this took about 7 ns of simulation time.
3.2. Impact of force field and water model
Another interesting observation is that the use of different force fields and water models can change simulation results drastically. With the more polar SPC water model (and the corresponding GROMACS force field), the detachment of the artificial epitope described above occurred. Simulations of the same system with the rather inert TIP4P water model (and the corresponding OPLS AA/L force field) resulted in a surprisingly stable pMHC complex.
3.3. Impact of simulation length
When the binding epitope was simulated with the polar SPC water model, the epitope remained tightly anchored to the MHC's binding cleft. On extending the simulation to 16 ns, we found that the epitope adopted several different states at unpredictable points in time. This observation is of major relevance for pharmaceutical development: it is important to evaluate all major configurational states of the epitope, which can only be achieved by sampling over sufficiently long simulations.
References
[1] The Austrian Grid Initiative, http://www.austriangrid.at.
[2] M. Hülsmeyer, R.C. Hillig, A. Volz, M. Rühl, W. Schröder, W. Saenger, A. Ziegler, and B. Uchanska-Ziegler, "HLA-B27 subtypes differentially associated with disease exhibit subtle structural alterations" (2002), J.Biol.Chem., 277, 47844-47853, 49.
[3] J. Zielkiewicz, "Structural properties of water: comparison of the SPC, SPCE, TIP4P, and TIP5P models of water" (2005), J.Chem.Phys., 123, 1-6, 10.
[4] L. Kale, R. Skeel, M. Bhandarkar, R. Brunner, A. Gursoy, N. Krawetz, J. Phillips, A. Shinozaki, K. Varadarajan, and K. Schulten, "NAMD2: Greater scalability for parallel molecular dynamics" (1999), J.Comp.Phys., 151, 283-312, 1.
[5] H.J.C. Berendsen, D. van der Spoel, and R. van Drunen, "GROMACS: A message-passing parallel molecular dynamics implementation" (1995), Computer Physics Communications, 91, 43-56.
[6] B.R. Brooks, R.E. Bruccoleri, B.D. Olafson, D.J. States, S. Swaminathan, and M. Karplus, "CHARMM - a program for macromolecular energy, minimization, and dynamics calculations" (1983), J.Comp.Chem., 4, 187-217, 2.
[7] T. Darden, D. York, and L. Pedersen, "Particle mesh Ewald: An N.log(N) method for Ewald sums in large systems" (1993), J.Chem.Phys., 98, 10089-10092, 12.
[8] H.M. Berman, J. Westbrook, Z. Feng, G. Gilliland, T.N. Bhat, H. Weissig, I.N. Shindyalov, and P.E. Bourne, "The Protein Data Bank" (2000), Nucleic Acids Research, 28, 235-242, 1.