Homology Model Tutorial

In this tutorial, we will introduce MOE's Protein Modeling applications. The tutorial has been broken down into the following steps:

● Performing a Homology Search
● Building a Homology Model
● Comparing the Homology Model with the X-ray Structure
● Evaluating the Homology Model

These MOE applications will be used:

Application               Functions
PDB Search                Searches the protein family database for protein structures homologous to a query sequence.
Protein Align/Superpose   Aligns protein structures and/or sequences based on sequence and 3D features.
Homology Model            Builds homology models for a sequence from the structures of one or more aligned proteins.
Protein Geometry          Calculates features of protein structure and geometry useful in quality and error analysis.
Conventions Used in this Tutorial

MOE            MOE Window
MOE | RHS      Right button bar of the MOE Window
MOE | Footer   MOE footer button bar
SE             Sequence Editor
SE | Footer    Sequence Editor footer bar
DBV            Database Viewer
In this tutorial, we will assume that you are using a three-button mouse. For information on using a two-button mouse, see Using the Mouse.

As an example, we will build a model for PDB entry 4MUZ, which is a 1.39 Å X-ray structure. Because it is a real PDB structure, it is present in the non-redundant chain database in $MOE/project/pdb.mdb, but we will ignore it for the purposes of this exercise and use other homologous structures to build the model.
Performing a Homology Search

1. Close the current system using MOE | File | Close

2. Open the file containing the protein sequence that we will model. The sequence for this protein is in the MOE sample directory: MOE | File | Open | $MOE/sample/mol/4muz.fst

3. We will now use the PDB Search application to search for a suitable template for homology modeling. The search is performed on a database of protein structures and sequences that have been clustered into families. Open the panel with MOE | Protein | Search | PDB

Select chain #1 from the Load Chain option list at the top right of the panel to load the 4MUZ.A sequence into the search area and press Search. After starting the search, delete chain 1 in the Sequence Editor by right-clicking on Chain 1 and selecting Delete; we will be loading an aligned version of the query after the search.

A two-stage search strategy is used. First, a fast scan is performed to create an initial list of candidates. Second, the list is narrowed by HMMER, which uses Hidden Markov Models (HMMs) to determine family membership. The results of the HMMER calculations are shown as the search progresses. Families are then reported in the Results list. (See Homology Searching for more details.)
In this example, the search identifies OMP decarboxylase as a homolog of the target sequence. Each item in the Results section of the panel represents one or more PDB structures, possibly augmented by sequence-only entries from the UniRef50 database, clustered into a family and aligned by a sequence- and structure-based alignment protocol. The displayed code represents the PDB structure within the family that has the highest similarity score to the target sequence.

4. Double-click on PDB_4MUZ.A in the Results area, or select the result and press the Load Alignment... button. The PDB Search: Load Alignment panel appears, displaying the pre-aligned family, with the chains sorted by sequence similarity to the query sequence. Note that the PDB structure PDB_4MUZ.A, which was displayed in the Search results area, is one of 12 PDB structures in this family.
5. Select the Load Query Sequence checkbox and press Load All to load these chains into MOE. In the Sequence Editor, you now see the query sequence (chain 1) as well as all of the chains from the homologous protein family. Once again, delete chain 2 (4MUZ.A), which appears under the query, before proceeding. You may now close the PDB Search panel.

Use the buttons on SE | Footer to:

● annotate the sequences with the secondary structure;
● display the sequences with single-letter codes;
● wrap the sequences to fit into the window.

Then scroll down to the aligned portion of the sequences.
Tip! The colored bars above the sequences reflect the secondary structure of the atoms in those residues having associated atomic coordinates. Chain 1 is a sequence-only chain, and therefore has no such bars.
Building a Homology Model

In order to build a homology model, we will align the target sequence to the homologous structures, decide which chain is to be the template for the model, and then build the model.

1. We will use the Protein Similarity Monitor to compare the sequences. Open the panel with: SE | Alignment | Similarity
The sequence identity matrix is non-symmetric. The percentage sequence identity in each cell is the number of identical residues between the two chains, divided by the number of amino acids in the chain corresponding to the cell column. The matrix shows that the 4LUJ.A and 2CZE.A chains are much closer to the query sequence than the other eight chains.

2. Select chain 5 with the mouse, and then hold the Shift key and select chain 12. Then hold the Ctrl key and select chain 3. Then press the Delete key and click OK to delete these chains.
3. In this case, we are most interested in the residues that differ between the sequences, so we will highlight the residues in the Sequence Editor that are not conserved across all of the chains. Click SE | Footer | Consensus and select the Similarity: * for Conserved Residues check box.
4. Color the residues by the RMS deviation between the atoms using the Residue RMSD button from the Sequence Editor footer: SE | Footer | Residues
Tip! Residues with no atoms, like in the sequence-only query chain, are not colored with the RMSD mode.
Review the aligned sequences by scrolling down in the Sequence Editor. Most of the residues in the template chains are colored green, showing that they have similar conformations to each other. No additional superposition is required, since the structures in each PDB family have already been prepared in the database.

Tip! It is good practice to check the positions of the mismatched residues in the structure. A template structure with a lower overall sequence identity may be the better choice if more of its conserved residues are near a ligand binding site, or where the residues pack into the middle of the protein. Mismatches can be accommodated more easily in surface loops, which are more flexible and have less steric hindrance.
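The column-normalized identity calculation described for the Protein Similarity Monitor can be illustrated with a small Python sketch. This is not MOE code: the function name `identity_matrix`, the gap character, and the toy alignment are all assumptions made for illustration. It only shows why the matrix is non-symmetric when the chains have different lengths.

```python
def identity_matrix(aligned, gap="-"):
    """Percent identity for every pair of aligned chains.

    Cell [row][col] = identical aligned residues / residues in the column chain.
    (Illustrative helper, not part of MOE.)
    """
    names = list(aligned)
    # Number of real residues (gaps excluded) in each chain.
    lengths = {n: sum(c != gap for c in aligned[n]) for n in names}
    matrix = {}
    for row in names:
        matrix[row] = {}
        for col in names:
            # Count alignment columns where both chains hold the same residue.
            matches = sum(
                a == b and a != gap
                for a, b in zip(aligned[row], aligned[col])
            )
            matrix[row][col] = 100.0 * matches / lengths[col] if lengths[col] else 0.0
    return matrix


# Hypothetical two-chain alignment: the query is two residues shorter.
alignment = {
    "query":    "MKTAYIAK--",
    "template": "MKTAYLAKQR",
}
m = identity_matrix(alignment)
print(m["query"]["template"], m["template"]["query"])
```

Here the same seven identical residues give a different percentage depending on which chain supplies the denominator, which is the asymmetry noted above.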
5. Build the homology model using 4LUJ.A as the template structure. Open Homology Model with SE | Protein | Homology Model
6. There is only one sequence loaded in MOE with no associated atoms (Chain 1); it is therefore specified by default as the sequence to be modeled. The Template will default to the first chain in the system that has atomic coordinates: Chain 2 (4LUJ.A). Select Load Final Model in MOE near the top of the Homology Model panel. All other options can be left at their default values.

7. In the Homology Model panel press OK. In the output database, promodel.mdb, you can view the results as they are being calculated.

8. Once the calculation has started, clear the system by selecting MOE | RHS | Close. The built model will appear in the MOE window when the calculation completes.

Notes:

● The currently loaded forcefield is displayed in the MOE | Footer, which can be used to access the Potential Setup panel. The Amber10:EHT forcefield with a Reaction Field (R-Field) should be used by default.
● The various models will be written to the database specified in the Output Database field. The default filename is promodel.mdb.
● If the Load Final Model in MOE option is left unchecked, the final structure can still be read in explicitly from the output database.
● The C-terminal & N-terminal Outgap Modeling option is turned off by default. When outgaps are ignored, the amino acids of the target sequence that extend beyond (either before or after) the template sequence in the alignment are clipped off before the model is built.
● By default, ten independent intermediate models will be built. These different homology models result from the permutational selection of different loop candidates and sidechain rotamers.
● The intermediate model that scores best according to the chosen scoring function is chosen as the final model, subject to optional further energy minimization.
● Once the calculations have finished, the output database will include the ten intermediate models and a refined final model (entry eleven).

To get a better view of the molecules in the database, you can enlarge the molecular cells in the Database Viewer. Position the cursor over any one of the cells in the first column (the mol field), press the left mouse button and drag down and to the right to enlarge the molecular drawings. Click the middle mouse button over one of these and drag the mouse to rotate the molecule.
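The effect of ignoring outgaps, as described in the notes above, can be sketched in Python. This is only an illustration of the clipping idea and not MOE's implementation: the function `clip_outgaps`, the gap character, and the example alignment are hypothetical.

```python
def clip_outgaps(target, template, gap="-"):
    """Trim alignment columns that lie before the first or after the last
    template residue. Internal gaps are left untouched.
    (Illustrative helper, not part of MOE.)
    """
    if len(target) != len(template):
        raise ValueError("aligned sequences must have equal length")
    # Alignment columns in which the template actually has a residue.
    cols = [i for i, c in enumerate(template) if c != gap]
    if not cols:
        return "", ""
    start, end = cols[0], cols[-1] + 1
    return target[start:end], template[start:end]


# Hypothetical alignment: the target overhangs the template at both termini.
target   = "MSGKTAYIAKQRLE"
template = "---KTAY-AKQR--"
clipped_target, clipped_template = clip_outgaps(target, template)
print(clipped_target)
print(clipped_template)
```

In this toy case the three N-terminal and two C-terminal target residues are dropped, while the single-residue insertion inside the alignment is kept, since internal insertions are handled by loop building rather than by clipping.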
Comparing the Homology Model with the X-ray Structure

1. When the calculation finishes, the final, refined model will be loaded in the MOE window.

Note: MOE uses stochastic algorithms in its homology modeling, and results may differ slightly each time the model is constructed. Therefore, your results and the results shown here may not be identical.

Open the file containing the X-ray structure for 4MUZ: SE | File | Open | $MOE/sample/mol/4muz.moe.gz

The 4MUZ structure contains two nearly identical copies of the main chain, as well as other chains containing ligand and solvent molecules.

2. To compare the homology model with the X-ray structure, we will use Protein Align/Superpose: SE | Align/Superpose

Press Align followed by Superpose.
3. In the Sequence Editor, click the - button to the left of the Sequence Editor ruler to hide the chains with solvent molecules. Open the SE | Footer | Consensus panel, select the Plot: RMSD check box, and deselect the Similarity: * for Conserved Residues check box.
4. Turn on the Selection Synchronize option, so that selecting residues in the Sequence Editor will select the atoms in those residues in the MOE window, and selecting atoms will select the residues. SE | Select | Synchronize
5. Hide the atoms, center the view point and show the protein backbone ribbon using the chain color, using SE | Footer | Atoms and SE | Footer | Ribbon to render the final model.

In the rendered image, the second PDB chain (4MUZ.B) has been hidden in the System Manager to simplify the view. The homology model is shown in orange. The X-ray structure is magenta.
6. Scroll down the Sequence Editor. The RMSD plot shows that the model is reasonably close to the X-ray structure for most of the residues. The major differences are near the gaps between the query sequence and the template structure.
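For readers who want to see what an RMSD plot summarizes, the following Python sketch computes per-residue deviations and the overall RMSD between two sets of matched coordinates. It assumes the structures are already superposed and that each residue is represented by one point (for example, its alpha carbon); the function names and the toy coordinates are illustrative and are not taken from MOE.

```python
import math


def per_residue_deviation(model, reference):
    """Distance between matched points of two already-superposed structures."""
    if len(model) != len(reference):
        raise ValueError("coordinate lists must have the same length")
    return [math.dist(a, b) for a, b in zip(model, reference)]


def rmsd(model, reference):
    """Root-mean-square deviation over all matched points."""
    devs = per_residue_deviation(model, reference)
    return math.sqrt(sum(d * d for d in devs) / len(devs))


# Toy coordinates in angstroms (hypothetical): only the last residue differs.
model_ca = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
xray_ca = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 3.0, 4.0)]

print(per_residue_deviation(model_ca, xray_ca))
print(rmsd(model_ca, xray_ca))
```

The per-residue list is the quantity that localizes problems, for instance near alignment gaps, whereas the single overall value can hide a few large local deviations.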
Evaluating the Homology Model

It is good practice to examine the geometry of homology models for unusual or unreasonable features. Serious problems, particularly ones that cluster in one or a few areas of the model, can suggest either a problem with the choice of template or a problem with the alignment of the target sequence to the template.

1. In the Sequence Editor, select chain 1.

2. Investigate the geometric quality of the model using Protein Geometry from SE | Protein | Geometry | Phi-Psi Plot
3. From Residues, choose Selected Chains
The user interface of the Protein Geometry application presents data in the form of plots, such as the Ramachandran (phi-psi) plot, and as tabular lists. Click on Display: Data, select the outlier(s) in the list, and press Select Atoms at the bottom of the panel. With the atoms of the outlier residue selected, you may perform operations such as a fine-grained energy minimization in an attempt to obtain more typical phi and psi angles.
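The phi and psi angles shown in a Ramachandran plot are backbone dihedral angles: phi is defined by the atoms C(i-1), N(i), CA(i), C(i), and psi by N(i), CA(i), C(i), N(i+1). The Python sketch below shows one common way to compute a dihedral angle from four points. It is a generic geometric calculation with made-up coordinates, not the code used by the Protein Geometry application.

```python
import math


def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _cross(a, b):
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )


def dihedral(p0, p1, p2, p3):
    """Signed dihedral angle in degrees about the p1-p2 bond."""
    b1, b2, b3 = _sub(p1, p0), _sub(p2, p1), _sub(p3, p2)
    n1 = _cross(b1, b2)  # normal of the plane through p0, p1, p2
    n2 = _cross(b2, b3)  # normal of the plane through p1, p2, p3
    x = _dot(n1, n2)
    y = math.sqrt(_dot(b2, b2)) * _dot(b1, n2)
    return math.degrees(math.atan2(y, x))


# Toy points (hypothetical): eclipsed, anti, and a quarter turn.
cis = dihedral((0, 1, 0), (0, 0, 0), (1, 0, 0), (1, 1, 0))
trans = dihedral((0, 1, 0), (0, 0, 0), (1, 0, 0), (1, -1, 0))
quarter = dihedral((0, 1, 0), (0, 0, 0), (1, 0, 0), (1, 0, 1))
print(cis, trans, quarter)
```

To obtain phi for a residue, the four points would be the coordinates of C(i-1), N(i), CA(i), and C(i) in that order; psi uses N(i), CA(i), C(i), and N(i+1).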
Further Work

To improve a homology model:

● Check to see if new protein structures have been released which are similar to the target sequence.
● Adjust the position of gaps in the sequence alignment between the query sequence and the template structure, and build models.
● Consider the flexibility of the protein. Residues in surface loops and at the chain termini are likely to be much more flexible than residues in a helix or strand secondary structure, in the core of a domain.
● Explore loop conformations with the Loop Modeler.
Summary

In this tutorial, we have shown how to use the protein modeling tools in MOE, searching with a protein amino acid sequence for a structure from a similar protein. MOE-Align was used to refine the alignment between the sequence and the template. A homology model was built using MOE-Homology, and the geometry of the model was checked using the Protein Geometry application.
See Also

Homology Searching
Sequence and Structure Alignment
Building Homology Models
Protein Geometry
Potential Energy Selection and Configuration
MOE Window
Database Viewer Window
Sequence Editor Window
MOE Table of Contents

Copyright © 1997–2017 Chemical Computing Group Inc.