SHORT TECHNICAL REPORTS
insilico.mutagenesis: a primer selection tool designed for sequence scanning applications used in directed evolution experiments
Ulrich Krauss and Thorsten Eggert
Heinrich-Heine-Universität Düsseldorf, Jülich, Germany
BioTechniques 39:679-682 (November 2005) doi 10.2144/000112013
Several primer prediction programs have been developed for a variety of applications. However, none of these tools allows the prediction of a large set of primers for whole gene site-directed mutagenesis experiments using the megaprimer method. We report a novel primer prediction tool (insilico.mutagenesis), accessible at www.insilico.uni-duesseldorf.de, developed for high-throughput mutagenesis as used in directed evolution or structure-function dependency projects, which involve the successive mutagenesis of a large number of amino acid positions (e.g., in whole gene saturation or gene scanning mutagenesis experiments). Furthermore, the program is suitable for all site-directed (saturation) mutagenesis approaches, such as saturation mutagenesis of promoter sequences and other types of untranslated intergenic regions. In anticipation of downstream cloning steps, the primer design tool also includes a restriction site control feature that alerts the user if unwanted restriction sites have been introduced within the mutagenesis primer. The use of our tool promises to speed up the process of site-directed mutagenesis, as it allows a large set of primers to be predicted instantly.
INTRODUCTION
The design of PCR primers (i.e., single-stranded DNA oligonucleotides complementary to a target sequence) has become an essential procedure in molecular biology for a variety of applications (e.g., gene amplification, sequencing, and site-directed mutagenesis). Thus, numerous primer prediction and analysis programs have been developed (1–4). However, none of the programs currently available facilitates the design of primers for whole gene site saturation and sequence-scanning mutagenesis experiments using the megaprimer PCR method (5). The manual design of oligonucleotide PCR primers for these approaches is a laborious task, due to the large set of mutagenesis primers to be designed. In order to reduce the working time and to standardize the design procedure, we have developed a web-based primer prediction program termed insilico.mutagenesis. The program takes nucleotide sequences of target regions (open reading frames or intergenic regions
like promoter sequences) as input and predicts mutagenesis primers that can be directly used in a megaprimer PCR-based mutagenesis approach. The insilico.mutagenesis tool is written entirely in Perl (6), uses MySQL tables for easy data storage, and has an HTML-based user interface. The tool can be accessed via the insilico web site (www.insilico.uni-duesseldorf.de).
Directed evolution has proven to be a successful strategy to improve enzyme properties such as specific activities, substrate specificities, thermostabilities, or enantioselectivities (7–10). In most cases, single base mutations are introduced by means of error-prone PCR (epPCR) in a random manner. However, epPCR only results in a limited number of amino acid exchanges; therefore, only a small part of the total sequence space is accessible to mutagenesis. As a consequence, alternative techniques must be applied to generate a first generation library of high diversity (11). Complete saturation mutagenesis, also referred to as Gene Site Saturation Mutagenesis™
(GSSM™), is a novel technology for rapid in vitro evolution of proteins that can be used to circumvent this problem (12–14). Here, all possible base triplets are introduced at a given codon position, thereby resulting in the formation of a library containing all 20 amino acid exchanges at the target position. This is achieved at the genetic level by using degenerate mutagenesis primers. Subsequent use of in vitro PCR amplification generates a library of genes possessing all codon variations required for complete saturation of the original gene. DeSantis and coworkers applied this technique to generate a highly enantioselective nitrilase (13). Furthermore, the technique of complete saturation mutagenesis has been used in our institute to generate a variant Bacillus subtilis lipase A (BSLA) showing improved enantioselectivity toward different model substrates (14,15).
Sequence-scanning mutagenesis techniques, like alanine- or tryptophan-scanning mutagenesis, can be applied to investigate the functional role of specific amino acid residues with respect to catalytic mechanism, substrate binding, or signal transduction (16,17). Both complete saturation mutagenesis and scanning-mutagenesis techniques require the sequential saturation/substitution of numerous amino acid residues, depending on the size of the target protein or the region to be investigated. For the complete saturation of a regular protein consisting of 300 amino acids, 300 single codon exchanges (i.e., 300 megaprimer PCRs) must be performed. One step in such a challenging approach that is easily amenable to automation without the necessity of expensive robotic equipment is the primer design using a personal computer. Therefore, we developed the program insilico.mutagenesis to automate the prediction of oligonucleotides that can be used directly in a megaprimer PCR approach.
REQUIRED INPUT
A schematic overview of the data processing by insilico.mutagenesis is given in Figure 1.
First, the program requires the input of a target nucleotide sequence, including flanking vector sequences (plain sequence), to which we will refer as the vectorA-template. It is not necessary to include the complete vector sequence; about 40 bp up- and downstream of the gene of interest is enough to enable primer design for whole gene saturation or scanning mutagenesis. Second, a unique sequence identifier (sequence name) must be provided for data processing purposes. Third, the mutagenesis codon must be selected from a pull-down menu, taking into account the codon usage of the desired expression host. Next, the program requires the input of the start and stop position of the target gene (or intergenic region) within the overall sequence. Also, the region that should be mutated must be specified. In practice, an overly long megaprimer might be inefficiently elongated by the polymerase in the second round of PCR (probably due to the formation of secondary structures), so it has proven practical to design mutagenesis primers such that the megaprimer does not exceed a certain length (5). Therefore, the input of a so-called oligo-switch position is necessary, as explained in more detail later.
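The six inputs described above can be collected in a single record and checked for consistency before any primer is designed. The following Python sketch is only an illustration of that idea and is not taken from the tool, which is written in Perl. The class name, the field names, the 1-based position convention, the exclusion of the stop codon from the gene range, and the use of codon numbers for the region to mutate and the oligo-switch position are all assumptions.

```python
from dataclasses import dataclass

IUPAC_DNA = set("ACGTRYSWKMBDHVN")  # standard IUPAC nucleotide codes


@dataclass(frozen=True)
class MutagenesisInput:
    """Hypothetical container for the six inputs listed in the text."""
    sequence: str           # (1) target sequence incl. flanking vector sequence
    identifier: str         # (2) unique sequence identifier
    mutagenesis_codon: str  # (3) target mutagenesis codon, e.g. "NNS"
    gene_start: int         # (4) 1-based position of the first base of codon 1
    gene_stop: int          # (4) 1-based position of the last coding base
    mutate_from: int        # (5) first codon of the region to mutate
    mutate_to: int          # (5) last codon of the region to mutate
    oligo_switch: int       # (6) codon at which primer orientation switches

    @property
    def codon_count(self) -> int:
        return (self.gene_stop - self.gene_start + 1) // 3

    def validate(self) -> None:
        """Raise ValueError if the inputs contradict each other."""
        if not self.identifier:
            raise ValueError("a sequence identifier is required")
        if not 1 <= self.gene_start <= self.gene_stop <= len(self.sequence):
            raise ValueError("gene start/stop must lie within the sequence")
        if (self.gene_stop - self.gene_start + 1) % 3:
            raise ValueError("gene length is not a multiple of three")
        codon = self.mutagenesis_codon.upper()
        if len(codon) != 3 or not set(codon) <= IUPAC_DNA:
            raise ValueError("mutagenesis codon must be three IUPAC letters")
        if not 1 <= self.mutate_from <= self.mutate_to <= self.codon_count:
            raise ValueError("region to mutate must lie within the gene")
        if not self.mutate_from <= self.oligo_switch <= self.mutate_to:
            raise ValueError("oligo-switch must lie within the region to mutate")


# Made-up 543-bp coding region with 40 bp of flanking sequence on each side.
seq = "a" * 40 + "gct" * 181 + "taa" + "c" * 40
job = MutagenesisInput(seq, "example", "NNS", 41, 583, 1, 181, 92)
job.validate()
print(job.codon_count)
```

A 543-bp coding region corresponds to 181 codons, which is the size of the lipase gene used as an example in this report.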
[Figure 1, panel A: screenshot of the data input user interface, not reproduced.]
[Figure 1, panel B: flowchart.
Mutagenesis Primer Prediction
Input: (1) Nucleotide sequence of the gene (or region) of interest including flanking vector sequence; (2) Unique sequence identifier; (3) Target mutagenesis codon; (4) Start and stop position of the gene (or region) within this sequence; (5) Region to mutate; (6) Oligo-switch position.
Processing: generate successive mutant sequences for the defined region (i = start to stop); compare i with the oligo-switch position and generate a forward or a reverse primer; iterate primer length, (1) test for G/C clamp, (2) calculate primer Tm.
Output: primer information in browser window, as downloadable sequence and Microsoft Excel file.
Restriction Analysis
Input: restriction enzymes intended to be used for cloning purposes.
Output: number of recognition sites for the chosen restriction enzymes within the primer.]
Figure 1. User interface and flowchart of data processing of the insilico.mutagenesis primer prediction tool. (A) Screenshot of the tool’s data input user interface. (B) Flowchart of the tool’s data processing and data generation algorithm. Tm, melting temperature.
PROGRAM ALGORITHM AND DATA PROCESSING
As one example, we designed all mutagenesis primers for the complete saturation mutagenesis of a B. subtilis lipase (14). The gene was 543 bp in length, consequently having 181 coding triplets and a TAA stop codon. The first 90 codon exchanges (corresponding to 270 bp) are achieved by the design of reverse mutagenesis primers (as shown in Figure 2B), which are used together with a vectorA-specific forward primer. Accordingly, the last 91 codon exchanges are introduced using a forward mutagenesis primer together with a vectorA-specific reverse primer. As a consequence, the amplified megaprimers do not exceed 273 bp in size. Therefore, these DNA fragments are well suited for the second PCR of megaprimer mutagenesis (5,18). The position at which the “switch” from a reverse to a forward mutagenesis primer occurs is referred to as the oligo-switch position. Usually, as in our lipase example, the position halfway along the gene of interest is used.
Using this input (Figures 1 and 2), the program generates mutant nucleotide sequences with successive single codon exchanges within the region the user has defined. For each of the mutant sequences, a mutagenesis primer is predicted either as a forward or reverse primer, depending on the position of the desired mutation with respect to the oligo-switch position. The generated mutagenesis primers are checked for the ability to form so-called GC clamps at the 3′ end, since Watson-Crick bonds between G and C facilitate the initiation of complementary strand formation by the polymerase at the 3′ end of the hybridized primer (19). If no GC clamp can be formed due to the lack of G or C bases at the 3′ end of the primer, the program extends the oligonucleotide until a G or C is found at its 3′ end. The maximum length of the primer is set at 40 bp. Finally, the mutagenesis primer data are stored in a MySQL database and displayed in the form of an HTML table (Figure 1). In addition, the program calculates the melting temperature of every single primer based on the equation of Breslauer et al. (20) and the nearest neighbor thermodynamic parameter set as described by Allawi and SantaLucia (21). Furthermore, the predicted primer sequences can be viewed as FASTA-formatted text output in the browser window or can be downloaded as a Microsoft® Excel® spreadsheet.
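The orientation choice and the GC clamp extension described in this section can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the tool's Perl implementation. The function names, the fixed 15-nt 5′ flank, the 15-nt minimum 3′ flank, and the first_codon_number convenience parameter are hypothetical. The sequence fragment is taken from Figure 2A, and its codon numbering (83 to 99) is inferred from the primer names in Figure 2B.

```python
COMPLEMENT = str.maketrans("acgtACGTNS", "tgcaTGCANS")


def reverse_complement(seq: str) -> str:
    """Reverse complement; N and S (G or C) are their own complements."""
    return seq.translate(COMPLEMENT)[::-1]


def design_primer(template, gene_start, codon, switch_codon,
                  first_codon_number=1, mut_codon="NNS",
                  flank5=15, min_flank3=15, max_len=40):
    """Return (primer, orientation) for one codon exchange.

    gene_start is the 0-based index of the codon numbered
    first_codon_number. Codons at or beyond switch_codon get a forward
    primer, earlier codons a reverse primer (assumed convention).
    """
    pos = gene_start + 3 * (codon - first_codon_number)
    forward = codon >= switch_codon
    if forward:
        strand, site, mutation = template, pos, mut_codon
    else:
        # Work on the opposite strand so the primer still reads 5' to 3'.
        strand = reverse_complement(template)
        site = len(template) - (pos + 3)
        mutation = reverse_complement(mut_codon)
    downstream = strand[site + 3:]
    if site < flank5 or len(downstream) < min_flank3:
        raise ValueError("not enough flanking sequence around the codon")
    # Extend the 3' flank until it ends in G or C, up to max_len in total.
    n = min_flank3
    while (downstream[n - 1] not in "gcGC" and n < len(downstream)
           and flank5 + 3 + n < max_len):
        n += 1
    primer = strand[site - flank5:site] + mutation + downstream[:n]
    return primer, "forward" if forward else "reverse"


# Fragment from Figure 2A, assumed to span codons 83-99 of the lipase gene.
FRAGMENT = "acactttactacataaaaaatctggacggcggaaataaagttgcaaacgtc"
for codon in (89, 90, 93):
    print(codon, *design_primer(FRAGMENT, 0, codon, 92, first_codon_number=83))
```

With these assumed flank settings, the sketch yields the same sequences as the primers listed in Figure 2B for codons 89 and 90 (reverse) and codon 93 (forward). Some other primers in the figure have shorter 3′ flanks, so the tool evidently applies criteria that this sketch does not capture.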
ADDITIONAL ANALYSES
The program enables the user to check each predicted mutagenesis primer for additional restriction endonuclease recognition sites, which might interfere with the intended cloning strategy for the amplified full-length PCR product. Therefore, the program asks the user to supply the names of two restriction enzymes that will be used in the subsequent cloning steps. By using the BioPerl (22) module Bio::Restriction::Analysis, insilico.mutagenesis indicates the number of recognition sites of those enzymes within each oligonucleotide. In case the program has predicted a mutagenesis primer whose sequence interferes with the desired cloning strategy, the oligonucleotide can be redesigned easily.
[Figure 2, panel A: target nucleotide sequence (including flanking vector sequences) with the binding sites of the vectorA-specific forward and reverse primers:
...cgctcctagctagct GCT gaa cac…aca ctt tac tac ata aaa aat ctg gac ggc gga aat aaa gtt gca aac gtc…aat TAA ctgatatgatc…
Annotated positions ([bp], [aa]): start codon (16, 1); start to mutate (19, 2); oligo-switch (292, 92); stop to mutate (559, 181); stop codon (562, stop).]
[Figure 2, panel B: region to mutate in BSLA (Bacillus subtilis Lipase A), 181 aa, with reverse primers BSLA-1 … 91-rev before the oligo-switch and forward primers BSLA-92 … 181-fw after it.]
Oligonucleotide name   Sequence                                        Orientation   Length (bp)   Tm (°C)
BLSA-89-rev            5′- atttccgccgtccagSNNttttatgtagtaaag -3′       reverse       33            deg.
BLSA-90-rev            5′- tttatttccgccgtcSNNattttttatgtagtaaag -3′    reverse       36            deg.
BLSA-91-rev            5′- aactttatttccgccSNNcagattttttatgtag -3′      reverse       34            deg.
BLSA-92-fw             5′- ataaaaaatctggacNNSggaaataaagttg -3′         forward       31            deg.
BLSA-93-fw             5′- aaaaatctggacggcNNSaataaagttgcaaac -3′       forward       33            deg.
BLSA-94-fw             5′- aatctggacggcggaNNSaaagttgcaaacg -3′         forward       31            deg.
Figure 2. Application of insilico.mutagenesis to design primers for site-directed mutagenesis. (A) Visualization of the Bacillus subtilis Lipase A (BSLA) nucleotide sequence, explaining the input requirements of the insilico.mutagenesis tool. Codons 89–94 of the BSLA gene sequence are printed in bold face. (B) Orientation of mutagenesis primers around the oligo-switch position in the BSLA gene and sequences of the oligonucleotides computed by the insilico.mutagenesis tool. deg., degenerate; Tm, melting temperature.
CONCLUSIONS
In summary, we have presented a novel primer design tool (insilico.mutagenesis) specifically developed for high-throughput mutagenesis primer prediction, useful in complete saturation and whole gene scanning mutagenesis experiments. Thus, insilico.mutagenesis is designed to speed up the process of directed evolution or structure-function dependency projects. Furthermore, in anticipation of the cloning strategy, the primer design tool includes a restriction site control feature that alerts the user if unwanted restriction sites are introduced within a mutagenesis primer.
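To illustrate the restriction site control described under ADDITIONAL ANALYSES, the following Python sketch counts the positions at which a primer could contain a recognition site, including sites that only some variants of a degenerate primer would create. It is a simplified stand-in written for this illustration and does not use or reproduce the BioPerl module Bio::Restriction::Analysis employed by the tool. The function name and the small enzyme table are assumptions. Because the listed recognition sequences are palindromic, scanning one strand is sufficient.

```python
# IUPAC nucleotide codes and the bases each one stands for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}

# Illustrative selection of common enzymes with palindromic sites.
SITES = {
    "EcoRI": "GAATTC",
    "BamHI": "GGATCC",
    "HindIII": "AAGCTT",
    "NdeI": "CATATG",
}


def count_possible_sites(primer: str, enzyme: str) -> int:
    """Count windows of the primer that can match the enzyme's site.

    A degenerate position matches if at least one of the bases it
    represents equals the corresponding base of the recognition site.
    """
    site = SITES[enzyme]
    primer = primer.upper()
    hits = 0
    for i in range(len(primer) - len(site) + 1):
        window = primer[i:i + len(site)]
        if all(base in IUPAC[code] for code, base in zip(window, site)):
            hits += 1
    return hits


# Forward primer for codon 92 as listed in Figure 2B.
primer_92 = "ataaaaaatctggacNNSggaaataaagttg"
for enzyme in ("EcoRI", "BamHI"):
    print(enzyme, count_possible_sites(primer_92, enzyme))
```

A count above zero would suggest redesigning the primer or choosing different enzymes for cloning, which mirrors the alert described in the text.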
ACKNOWLEDGMENTS
We thank Bernd Cappel (HHU Düsseldorf) for help installing the program in a server-based environment to make it accessible via the World Wide Web.
COMPETING INTERESTS STATEMENT
The authors declare no competing interests.
REFERENCES
1. Rozen, S. and H. Skaletsky. 2000. Primer3 on the WWW for general users and for biologist programmers. Methods Mol. Biol. 132:365-386.
2. Lu, G., M. Hallet, S. Pollock, and D. Thomas. 2003. DePie: designing primers for protein interaction experiments. Nucleic Acids Res. 31:3755-3757.
3. Turchin, A. and J.F. Lawler, Jr. 1999. The Primer Generator: a program that facilitates the selection of oligonucleotides for site-directed mutagenesis. BioTechniques 26:672-676.
4. Canaves, J.M., A. Morse, and B. West. 2004. PCR primer selection tool optimized for high-throughput proteomics and structural genomics. BioTechniques 36:1040-1042.
5. Barettino, D., M. Feigenbutz, R. Valcarcel, and H.G. Stunnenberg. 1994. Improved method for PCR-mediated site-directed mutagenesis. Nucleic Acids Res. 22:541-542.
6. Moorhouse, M. and P. Barry. 2004. Bioinformatics, Biocomputing and Perl. John Wiley & Sons, New York.
7. Petrounia, I.P. and F. Arnold. 2000. Designed evolution of enzymatic properties. Curr. Opin. Biotechnol. 11:325-330.
8. Brakmann, S. 2001. Discovery of superior enzymes by directed molecular evolution. ChemBioChem. 2:865-871.
9. Reetz, M.T. 2004. Controlling the enantioselectivity of enzymes by directed evolution: practical and theoretical ramifications. Proc. Natl. Acad. Sci. USA 101:5716-5722.
10. Jaeger, K.-E. and T. Eggert. 2004. Enantioselective biocatalysis optimized by directed evolution. Curr. Opin. Biotechnol. 15:305-313.
11. Eggert, T., M.T. Reetz, and K.-E. Jaeger. 2004. Directed evolution by random mutagenesis: a critical evaluation, p. 375-390. In A. Svendsen (Ed.), Enzyme Functionality: Design, Engineering, and Screening. Marcel Dekker, New York.
12. Short, J.M., inventor. Diversa Corporation, assignee. 2001. Saturation mutagenesis in directed evolution. U.S. Patent No. 6,171,820.
13. DeSantis, G., K. Wong, B. Farwell, K. Chatman, Z. Zhu, G. Tomlinson, H. Huang, X. Tan, et al. 2003. Creation of a productive, highly enantioselective nitrilase through gene site saturation mutagenesis (GSSM). J. Am. Chem. Soc. 125:11476-11477.
14. Funke, S.A., A. Eipper, M.T. Reetz, N. Otte, W. Thiel, G. van Pouderoyen, B.W. Dijkstra, K.-E. Jaeger, and T. Eggert. 2003. Directed evolution of an enantioselective Bacillus subtilis lipase. Biocatal. Biotransformation 21:67-73.
15. Funke, S.A., N. Otte, T. Eggert, M. Bocola, K.-E. Jaeger, and W. Thiel. Combination of computational prescreening and experimental library construction can accelerate enzyme optimization by directed evolution. Protein Eng. Des. Sel. (In press).
16. Mahan, S.D., G.C. Ireton, B.L. Stoddard, and M.E. Black. 2004. Alanine-scanning mutagenesis reveals a cytosine deaminase mutant with altered substrate preference. Biochemistry 43:8957-8964.
17. Santiago, J., G.R. Guzman, K. Torruellas, L.V. Rojas, and J.A. Lasalde-Dominicci. 2004. Tryptophan scanning mutagenesis in the TM3 domain of the Torpedo californica acetylcholine receptor beta subunit reveals an alpha-helical structure. Biochemistry 43:10064-10070.
18. Ling, M.M. and B.H. Robinson. 1997. Approaches to DNA mutagenesis: an overview. Anal. Biochem. 254:157-178.
19. Lowe, T.M., J. Sharefkin, S.Q. Yang, and C.W. Dieffenbach. 1990. A computer program for selection of oligonucleotide primers for the polymerase chain reaction. Nucleic Acids Res. 18:1757-1761.
20. Breslauer, K.J., R. Frank, H. Blocker, and L.A. Marky. 1986. Predicting DNA duplex stability from the base sequence. Proc. Natl. Acad. Sci. USA 83:3746-3750.
21. Allawi, H.T. and J. SantaLucia, Jr. 1997. Thermodynamics and NMR of internal G.T mismatches in DNA. Biochemistry 36:10581-10594.
22. Stajich, J.E., D. Block, K. Boulez, S.E. Brenner, S.A. Chervitz, C. Dagdigian, G. Fuellen, J.G. Gilbert, et al. 2002. The Bioperl toolkit: Perl modules for the life sciences. Genome Res. 12:1611-1618.
Received 19 April 2005; accepted 3 June 2005. Address correspondence to Thorsten Eggert, Heinrich-Heine-Universität Düsseldorf, Institut für Molekulare Enzymtechnologie, Forschungszentrum Jülich, D-52426 Jülich, Germany. e-mail:
[email protected]