5998_02_p661-674
1/12/06
3:03 PM
Page 661
ASSAY and Drug Development Technologies Volume 3, Number 6, 2005 © Mary Ann Liebert, Inc.
Protein Expression Plasmids Produced Rapidly: Streamlining Cloning Protocols and Robotic Handling
Maria Kornienko, Allison Montalvo, Brian E. Carpenter, Michael Lenard, Pravien Abeywickrema, Dawn L. Hall, Paul L. Darke, and Lawrence C. Kuo
Abstract: As many processes in the preclinical drug discovery process become highly parallel, the need to also produce a large number of different proteins in parallel has become acute, such as for protein crystallization and activity screening. In turn, the requisite DNA constructions to produce these proteins must now be done at a rate that requires automated cloning procedures, each with an intrinsic low failure probability per sample. The high-throughput cloning solutions presented here achieve production of 192 different expression plasmids at a success rate of greater than 95% of the targeted open reading frames. Time for completion of the set by one person is reduced to approximately 11 working days, starting with polymerase chain reactions for a number of source clones and ending with purified expression plasmids. Achievement of this throughput utilizes the following: (1) the Beckman Coulter (Fullerton, CA) Biomek® FX liquid handler for most manipulations, (2) Gateway™ cloning technology (Invitrogen Corp., Carlsbad, CA), and (3) computer programs designed for parallel processing of all sample information, including primer design and the resulting DNA and protein sequence assembly. Exemplary data are presented for discovery of a form of the Rho-kinase that crystallizes (ROCK2).
Introduction
Production of soluble active proteins is crucial for crystallization, assay development, and HT screening. In the case of proteins destined for crystallization, minor variation of the length or sequence of the expressed protein can have a dramatic and unpredictable impact upon the quality of crystals produced.1 Magnifying the task of expression plasmid production is that the discovery of an effective protein expression condition for the ORF of interest is unpredictable, often involving the need to test a combination of fusion tags, expression vectors, and host cells for the best results.2–5 Prior to the advent of HT methods of cloning and expression testing, this discovery process was time consuming, sequential, and iterative. To accelerate the process, laboratories are now adopting automated and parallel molecular biology procedures to make many expression clones for the more rapid, single-pass (non-iterative) discovery of soluble and easily purified target proteins.6 The recent definition of the human genome sequence now allows cloning of entire protein families for functional and structural studies. As in the case when a large number of variants of a single protein are needed for discovery of well-ordered crystals, the need for genomic HT expression clone production is acute and being addressed in a variety of ways worldwide.7–10 We present here facile methods for HT production of expression clones that combine readily available instrumentation adapted to perform high-fidelity cloning operations. Thus, with the use of the Invitrogen Corp. (Carlsbad, CA) Gateway™ cloning method on the Beckman Coulter (Fullerton, CA) Biomek® FX robotic platform, a single operator can begin with one to 192 source clone ORFs and finish two 96-well plates of purified expression plasmids that are based upon a variety of vectors in 11 working days. Additionally, methods that use this automated platform to create site-directed mutant protein forms via the QuikChange® (Stratagene, La Jolla, CA) protocol to improve crystal quality are described.11 These processes of clone creation for structural biology are outlined in Fig. 1. Finally, the large amount of DNA and protein sequence data processing that must accompany the creation of large sets of expression plasmids is no longer practical with commonly available software, which is typically designed for manipulating sequences one at a time. Our custom-designed software routines for handling the primer design and sequence assemblies for up to 96 clones in parallel are outlined herein.
Department of Structural Biology, Merck Research Laboratories, West Point, PA. ABBREVIATIONS: His6, hexahistidine; HT, high-throughput; ORF, open reading frame; PCR, polymerase chain reaction; ROCK2, a form of Rho-kinase (ROKalpha or ROCKII); GST, glutathione S-transferase.
Fig. 1 flowchart boxes, in the order extracted:
HT PCR primer design with "Oligos Creator" programs
PCR copy of ORFs
Agarose gel analysis of PCR
BP reactions
Bacterial transformation of BP reactions into cloning cell line (with plating and colony picking)
Entry plasmid isolation
Agarose gel analysis of isolated plasmids
LR reactions
Bacterial transformation of LR reactions into cloning cell line (with plating and colony picking)
Expression plasmid isolation
Expression plasmid quantification and normalization
Send plasmids out for baculovirus or mammalian expression
Agarose gel analysis of isolated plasmids
Transformation (w/o plating) of expression plasmids into bacterial expression cell line(s) for expression
Protein purification and crystallization (performed by other groups of Structural Biology)
Identify clone(s) for further modifications
END POINT: CRYSTALS
Design mutagenic primers with "Mutation Maker" program
QuikChange mutagenesis of expression plasmid(s)
Bacterial transformation into cloning cell line (with plating and colony picking)
Plasmid isolation
Agarose gel analysis of isolated plasmids
FIG. 1. Automated procedures of cloning for structural biology. White, cloning steps; shaded, steps other than cloning. HT, high-throughput; ORF, open reading frame; PCR, polymerase chain reaction.
FIG. 2. Biomek FX Deck Layout. The Cytomat Hotel (not shown) is located on the upper left side of the deck representation and connected with the deck with a conveyor belt. P1–P12, Passive Automated Labware Positions (ALPs) used for placement of plates, tip boxes, and other labware. P1, P2, P4, P5, and P6 ALPs are connected to water baths for heating or cooling. C1, ALP at the end of the Cytomat conveyor belt for labware delivered to or from the Hotel. TL1, tiploader ALP for loading tips on the 96-channel head. SPE1, solid-phase extraction ALP. Vacuum is applied at this position to remove liquid from a filter membrane. Holder1, ALP for placing labware from SPE ALP for assembling the vacuum manifold. W, Span-8 wash ALP. TR1, Span-8 tip trash ALP. A SPECTRAmax PLUS 384 plate reader (not shown) is located on the lower left side of the deck representation. Store1, Labware store ALP for the plate reader. Regrip1, regripping station for the plate reader labware. S1, plate reader drawer ALP.
Materials and Methods
Instrumentation
The Biomek FX dual arm platform equipped with the Beckman Span-8 and 96-channel pipetting arms, a Cytomat® Hotel (Beckman Coulter), Plate Shuttle System (Kendro Laboratory Products GmbH, Hanau, Germany), and a SPECTRAmax PLUS 384 plate reader (Molecular Devices Corp., Sunnyvale, CA) was used for all robotic operations. Operations with the pipetting arms always used disposable tips. The deck was configured with four cooling Automated Labware Positions for incubation of biomaterials and reagents at 4°C, one heating Automated Labware Position to perform a 42°C heat shock step of the bacterial transformation procedure, and a Solid-Phase Extraction device, used in plasmid purification. Custom adapters to accommodate the plates for sufficient thermal transfer during 4°C incubation and 42°C heat shock steps were ordered from Acme Automation (Spring City, TN). The schematic representation of the deck layout used for all procedures is shown in Fig. 2. Thermocycling was performed outside the robotic platform with the GeneAmp® PCR System 2700 (Applied Biosystems, Foster City, CA).
PCR
Oligonucleotide primers used in the HT cloning process include an ORF-specific hybridization region, an optional protease cleavage site, and Gateway recombination sites, as shown in Fig. 3. These primers are designed with the custom software West Point Structural Biology Sequence Clone Manager Suite (Merck & Co., Inc., West Point, PA) in an automated fashion, and ordered from Integrated DNA Technologies (Coralville, IA) in a 96-well plate, premixed in pairs. Since Gateway primers are typically longer than 50 nucleotides, polyacrylamide gel electrophoresis-purified primers were found to greatly reduce the error rate in the primer region of the PCR product. AccuPrime™ Taq DNA Polymerase Hi Fi (Invitrogen Corp.) was selected from the many enzymes tested to provide the highest accuracy in amplification of various sequences with the same PCR condition. When a single DNA template is used, a reaction mix for 120 reactions was manually prepared in the quarter reservoir (Beckman Coulter), consisting of 480 µl of template DNA (10–100 ng/µl), 60 µl of AccuPrime Taq Hi Fi (5 U/µl), 600 µl of 10× AccuPrime PCR Buffer, and 4,620 µl of water. On the deck of the robot, reaction mix was dispensed into a 96-well PCR plate (the “reaction plate”), 48 µl per well, with the Biomek FX Span-8 pipette. Primer pairs (2 µl each at 12 pmol/µl) were transferred into the reaction plate and mixed with reaction mix with the Biomek FX 96-channel pipette head. The reaction plate was moved off of the deck of the robot and centrifuged briefly, and PCR was performed.
FIG. 3. Example of typical polymerase chain reaction primer pairs used in the Gateway cloning method. Destination vectors encoding peptide or protein fusion partners may be used, and removal of residual amino acids from the Gateway recombination sequence from the open reading frame of interest is achieved by encoding a protease cleavage site (PCS) (TEV, thrombin, or other) into polymerase chain reaction primers. Primer pairs illustrated are thus used for N-terminal fusion proteins (A), C-terminal fusion proteins (B), or proteins with N-terminal and C-terminal fusion (C). attB1 and attB2, Gateway recombination sequence; XXXXX, gene-specific region of the primer for hybridization.
DNA gel analysis
E-Gels® 96 (Invitrogen Corp.) were used for HT analysis of PCR mixtures or plasmid preparations, and the E-Gel 96 holder was used to fix the gel position on the robot deck. Sample preparation and gel loading were performed with the Biomek FX 96-channel head. Loading buffer was prepared per the manufacturer’s recommendation and poured into a tip box lid placed upside down on the deck of the Biomek FX. Loading buffer and DNA samples (7 µl each) were combined in a 96-well PCR plate to prepare gel samples. Water (10 µl) was first loaded into the gel wells, followed by gel samples (10 µl). Loading of markers and subsequent gel electrophoresis were performed manually.
BP reactions
The PCR mixtures were used without any purification in the BP reactions. The amount of BP Clonase™ Enzyme Mix recommended in the Gateway manual can be reduced by half with almost no change in reaction efficiency if the incubation time of the BP reaction is increased from 1 h to 3 h or overnight.
A BP reaction mix for 120 reactions was prepared manually in a quarter reservoir (Beckman Coulter) placed on ice and consisted of 240 µl of Donor Vector (150 ng/µl), 480 µl of 5× BP reaction buffer (Invitrogen Corp.), 960 µl of TE Buffer, and 480 µl of BP Clonase Enzyme Mix (Invitrogen Corp.). The quarter reservoir with BP reaction mix was placed on the deck of the robot, and BP reaction mix was dispensed into a 96-well PCR plate (the “BP reaction plate”), 18 µl in each well, with the Biomek FX Span-8. PCR mixtures (2 µl) from the process described above were transferred into the BP reaction plate by the Biomek FX 96-channel pipette head. The components of the reaction were mixed by the Biomek FX 96-channel pipette head programmed for pipetting up and down several times. The BP reaction plate was then centrifuged briefly and incubated for 1–18 h at ambient temperature. After the BP reaction was completed, 2 µl of proteinase K solution was added with the Biomek FX Span-8 pipette. The reactions were mixed with the Biomek FX 96-channel pipette head. The plate was centrifuged briefly and incubated at 37°C for 15 min. BP reactions were transformed into the Escherichia coli cloning cell line with the bacterial transformation method described below. Entry clones were isolated with the plasmid purification method described below.
LR reactions
The automated operations to transfer ORFs of interest from Entry plasmids to Destination vectors to create the desired, final expression plasmids via the LR reaction were identical to the analogous operations detailed for the BP reaction above. The differences are that in the LR reactions, the purified Entry plasmids were used in place of the PCR products, and Destination vectors were used in place of the Donor vectors. As with the BP reactions, E. coli bacteria were transformed with the product mixture and single colony clones were used in the plasmid purification procedure.
QuikChange site-directed mutagenesis
The QuikChange II site-directed mutagenesis kit (Stratagene) was used in this protocol. Mutagenic primers were designed in parallel with a custom program, Mutation Maker version 3.0, that uses Stratagene primer design guidance (P. Abeywickrema, Merck & Co., Inc.). The reactions were assembled on the deck of the robot, and thermal cycling was performed. The mutagenesis protocol described here for 12 reactions can be scaled up for greater numbers. A reaction mix for 22 reactions (extra volume for manual control reactions) consists of 88 µl of a template vector (25 ng/µl), 33 µl of deoxynucleotide triphosphate mix from the kit, 55 µl of 10× reaction buffer, 22 µl of PfuUltra™ Hi Fi (2.5 U/µl, Stratagene), and 176 µl of water, manually prepared in a microcentrifuge tube on ice.
A portion of the reaction mix (187 µl) was manually transferred into two wells of an empty PCR plate on the chilled position on the deck of the Biomek FX, as opposed to placement in the deck reservoir, because of low volumes. The plate containing mutagenic primers (2.5 pmol/µl) was placed on the deck of the robot. Reaction mix (17 µl) and forward and reverse primers (4 µl each) were combined in the 96-well PCR plate with the Span-8. Components of the reaction plate were mixed with the 96-channel head using a custom pipetting procedure (Biomek FX template) in which the head moves in a spiral fashion. The reaction plate was centrifuged briefly, thermal cycling was performed according to recommendations of the QuikChange manual, and the plate was then transferred to the Biomek FX deck.
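As an illustrative check only, and not part of the published protocol, the batch volumes quoted above for the PCR, BP, and QuikChange reaction mixes can be verified against the per-well dispense volumes with a short Python sketch. The class name MasterMix and its methods are hypothetical; the volumes are taken from the text, and the interpretation that the 120-reaction batches leave spare volume beyond the 96 wells of one plate is an assumption.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class MasterMix:
    """A batch reaction mix; all volumes are in microliters (hypothetical helper)."""
    name: str
    n_reactions: int                 # number of reactions the batch was prepared for
    components_ul: Dict[str, float]  # component name -> total volume in the batch
    dispensed_ul: float              # volume of mix dispensed into each well

    def total_ul(self) -> float:
        return sum(self.components_ul.values())

    def per_reaction_ul(self) -> Dict[str, float]:
        return {k: v / self.n_reactions for k, v in self.components_ul.items()}

    def wells_covered(self) -> int:
        # How many wells the batch can fill at the stated dispense volume.
        return int(self.total_ul() // self.dispensed_ul)

    def scaled(self, n_reactions: int) -> "MasterMix":
        # Linear rescaling of every component to a different batch size.
        factor = n_reactions / self.n_reactions
        scaled_components = {k: v * factor for k, v in self.components_ul.items()}
        return MasterMix(self.name, n_reactions, scaled_components, self.dispensed_ul)


# Volumes as quoted in Materials and Methods.
PCR_MIX = MasterMix("PCR", 120, {
    "template DNA": 480, "AccuPrime Taq Hi Fi": 60,
    "10x AccuPrime PCR buffer": 600, "water": 4620}, dispensed_ul=48)
BP_MIX = MasterMix("BP", 120, {
    "Donor vector": 240, "5x BP reaction buffer": 480,
    "TE buffer": 960, "BP Clonase": 480}, dispensed_ul=18)
QUIKCHANGE_MIX = MasterMix("QuikChange", 22, {
    "template vector": 88, "dNTP mix": 33, "10x reaction buffer": 55,
    "PfuUltra Hi Fi": 22, "water": 176}, dispensed_ul=17)

if __name__ == "__main__":
    for mix in (PCR_MIX, BP_MIX, QUIKCHANGE_MIX):
        print(mix.name, mix.total_ul(), "ul total, covers",
              mix.wells_covered(), "wells")
```

Under these assumptions each batch total equals the dispense volume multiplied by the stated number of reactions, and the two 187 µl portions of QuikChange mix together account for the full 374 µl batch.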
DpnI (10 U/µl), diluted in half with water, was manually transferred into one of the empty wells of the reaction plate; 2 µl was then transferred robotically into each mutagenic reaction well with the Span-8, and reactions were mixed with the 96-channel head. The reaction plate was centrifuged briefly and incubated for 2 h at 37°C. Transformations of competent E. coli cells used 4 µl of each reaction as in the bacterial transformation method, and the mutated plasmids were isolated with the plasmid purification method, both described below.
Bacterial transformation
Chemically competent E. coli cells were purchased in PCR plates (GC5™ [Gene Choice, Inc., Frederick, MD] or TurboCells™ [Gene Therapy Systems, Inc., San Diego, CA]). Purified plasmids or BP, LR, or QuikChange reaction mixtures (1.5–4 µl) were mixed with 20 µl of competent cells at 4°C on the Biomek FX deck, gently but thoroughly, with the 96-channel head using a custom pipetting template directing the head movement in a spiral fashion. After a 20-min incubation at 4°C, the plate was moved with the Biomek FX gripper arm to the 42°C heating position for a 30-s heat shock and then to the 4°C cooling position for 2 min, after which 80 µl of SOC medium was added to the transformed cells. The plate was removed from the deck and incubated for 1 h at 37°C with shaking at 210 rpm for cell recovery. Agar trays with 48 sections (QTrays, Genetix USA, Inc., Boston, MA) were used instead of traditional Petri dishes to speed up the manual plating step. Transformed cells (10–20 µl) were plated on each section.
Liquid transformation of expression plasmids for seed culture preparation
E. coli expression cell lines were transformed with the bacterial transformation method described above except that the transformed cultures were not plated, but transferred into fresh media with antibiotic in 96- or 24-well deep-well plates, and grown overnight. These cultures are used as inocula for bacterial expression testing.
Plasmid purification
One colony per reaction was inoculated into 1.2 ml of rich medium with antibiotics in a 96-well deep-well plate. Cultures were grown overnight with shaking, followed by centrifugation for 8 min at 3,000 × g. The plate with culture pellets was placed on the deck of the Biomek FX, medium was aspirated with the 96-channel pipette head and discarded into an empty 96-well deep-well plate, and pellets were used as starting material for the plasmid purification method described here. The QIAprep 96 Turbo Miniprep Kit (Qiagen, Inc., Valencia, CA) was used for plasmid isolation, and the robotic method was created with the Biomek FX Qiagen DNA Miniprep Wizard (Beckman Coulter).
ROCK2 cloning for protein expression
Various lengths of the ROCK2 ORF were cloned into custom Destination vectors designed for baculovirus recombination. The vectors are derived from the baculovirus transfer vector pVL1392 (Pharmingen, San Diego, CA), which was customized by adding His6 or GST tags for N- or C-terminal fusion and then Gateway-adapted. Baculovirus generation and insect cell infections (SF9 cells) were performed by Kemp Biotechnologies (Frederick, MD). Test expression cultures were in a volume of 1 L, and cells were harvested 72 h post-infection. The expression samples were analyzed by immunoblots of sodium dodecyl sulfate-polyacrylamide gel electrophoresis gels, detected with fluorescently labeled antibody specific for either His5 or GST (Invitrogen) or with antibody combinations of primary against His5 or GST and secondary anti-mouse horseradish peroxidase that used luminescence detection.
Clone data management
Custom programs (Merck & Co., Inc.) described below were created for automated bulk PCR and mutagenic oligonucleotide design and for sequence information management.
West Point Structural Biology Sequence Clone Manager Suite. There are three applications within the West Point Structural Biology Sequence Clone Manager Suite of programs, listed below. Programs were written in Visual Basic and Excel (Microsoft, Redmond, WA) macros, and utilized an Excel workbook for output of results. The utility of this software is not limited to robotic operations, as it is stand-alone, and can be applied to manual cloning manipulations. A short description of the three programs follows:
• Automated Oligos Creator. Design of up to 96 PCR primer pairs in parallel for up to 96 templates is performed, with output of the PCR product sequence for each sample.
The output of the designed PCR primers is an Excel table that can be used in DNA oligonucleotide ordering in a 96-well plate format. The rules for each oligonucleotide design are:
1. The length of the template-complementary region of the primer is always 22 bases.
2. Choice of extensions is customized by the user, to include such things as Gateway recombination site, protease cleavage site, Shine-Dalgarno sequence, etc.
• Truncation Oligos Creator. Similar to Automated Oligos Creator, but most useful for PCR primer design when a series of ORF length variants are made from a small number of templates. Notable here is the ability to choose truncation sites by clicking directly on a display of the protein sequence. The program maps the chosen amino acid positions to the corresponding nucleotide positions, eliminating tedious back-translation. The screen shot illustrating this step is shown in Fig. 4. The Automated and Truncation Oligos Creator applications are not specific to the Gateway process.
• Expressions. Uses the PCR product sequences generated as output from Automated Oligos Creator or Truncation Oligos Creator along with the desired Gateway Donor and Destination vectors to produce the DNA sequence of the Entry clones and final expression plasmids. In addition, translations of the proteins that are produced by these expression plasmids are provided, along with the predicted molecular weights of those proteins, which is useful for analysis of the expressed proteins. Sequences of the Destination vectors are held within the program data sheets for repeated use, and can be edited when needed by the user. Note that Destination vector-encoded tags (GST, His6, etc.) are automatically included in the translation of the plasmid ORFs.
Mutation Maker Program. The program (the program details and code will be described separately [manuscript in preparation]) allows the parallel design of multiple mutagenic primers that use a single template for the QuikChange site-directed mutagenesis method. The nucleotide sequence of the plasmid with ORF of interest is used as input, and the program translates the ORF. There are two ways to choose the amino acid(s) to mutate: (1) by clicking on a position in the amino acid sequence displayed in a window, or (2) by designating amino acid number. The codons for mutated amino acids are optimized via selectable codon usage tables for E. coli, Spodoptera frugiperda, Drosophila melanogaster, or Homo sapiens. After the primer is designed, the melting temperature and GC% of the primer are displayed. If these parameters are not satisfactory, the primer can be re-designed by increasing or decreasing the number of nucleotides flanking the mutated region. Multiple mutations can be introduced by one primer, and all primer sets for a given ORF can be created without leaving the main window. The output of the program is an Excel spreadsheet listing forward and reverse primers along with hyperlinks to text files of mutated plasmid sequences, mutated ORF sequences, and translations of mutated ORF sequences.
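The primer rules stated for the Oligos Creator programs (a fixed 22-base template-complementary region plus user-chosen extensions, and truncation sites chosen as amino acid positions) can be sketched in Python as follows. This is a minimal illustration under stated assumptions and is not the authors' Visual Basic code: the function names are hypothetical, the attB1 and attB2 strings are the commonly published Gateway adapter sequences and should be checked against the vendor manual before use, the toy ORF is invented, and reading-frame spacer bases, protease cleavage sites, and stop codons are not handled.

```python
HYB_LEN = 22  # template-complementary length stated in the text

# Assumed example extensions (commonly published attB adapters; verify before use).
ATTB1_EXT = "GGGGACAAGTTTGTACAAAAAAGCAGGCT"
ATTB2_EXT = "GGGGACCACTTTGTACAAGAAAGCTGGGT"

_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def truncation_to_nt(orf: str, first_aa: int, last_aa: int) -> str:
    """Map a residue range (1-based, inclusive) to the matching ORF sub-sequence."""
    n_codons = len(orf) // 3
    if not 1 <= first_aa <= last_aa <= n_codons:
        raise ValueError("residue range outside the ORF")
    return orf[3 * (first_aa - 1):3 * last_aa]


def design_primer_pair(target: str, fwd_ext: str = "", rev_ext: str = "",
                       hyb_len: int = HYB_LEN):
    """Forward and reverse primers: extension + fixed-length hybridizing region."""
    if len(target) < hyb_len:
        raise ValueError("target shorter than the hybridization length")
    forward = fwd_ext + target[:hyb_len]
    reverse = rev_ext + reverse_complement(target[-hyb_len:])
    return forward, reverse


if __name__ == "__main__":
    # Invented 12-codon toy ORF, for illustration only.
    toy_orf = "ATGGCTAAACTGACCGGTTCTGAAGGTCATCCGTAC"
    fragment = truncation_to_nt(toy_orf, first_aa=2, last_aa=12)
    fwd, rev = design_primer_pair(fragment, ATTB1_EXT, ATTB2_EXT)
    print(fwd)
    print(rev)
```

Because the hybridizing length is fixed and the extensions are plain parameters, the same function can be applied to each of up to 96 templates or truncation variants in a loop, which is the parallel behavior the text describes.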
FIG. 4. Truncation Oligos Creator screen shot. Protein truncations are designated either by point-and-click, or by entering amino acid position number. ROCK2, Rho-kinase variant.
Results and Discussion
Automated procedures
Gene to expression clone. Reliable HT cloning methods were developed on the Biomek FX dual-arm platform using Gateway cloning technology. Over 3,000 expression clones have been generated on this platform within 18 months, with one experimenter processing one or two 96-well plates of clones at a time. Gateway technology was chosen as our HT method since transfer of protein ORFs to expression vectors is fast, robust, and automation friendly.10,12 Shortened and simplified Gateway protocols were successfully adapted to automation and are described here. Beginning with DNA samples containing the ORFs of interest, the Gateway recombination sites are first added to the ORF via PCR. The resulting PCR product is then transferred via site-specific recombination to an Entry vector, and finally to one or more Destination vectors to give the desired expression plasmids, as shown in the schematic Fig. 1. Experimental aspects of these cloning steps that are pertinent to HT production include: (1) PCR primer design can be fast and simple; (2) LR and BP recombination reactions are so efficient and robust that several effective shortcuts are described here; (3) transformations can be performed robotically, including those for expression tests of the final plasmids; and (4) DNA sequence management programs must process sets of samples in parallel for efficiency. When the ORF of interest is cloned into a Gateway destination vector that encodes a fusion tag using the standard Gateway protocol, the recombination site encodes 12–14 amino acids that remain present in the expressed protein, even if the fusion tag encoded by the Destination vector includes a protease site for tag removal.6,10 As shown in Fig. 3, an optional protease cleavage site
next to the protein of interest can be included in the primer design, so that no recombination site amino acids remain after protease cleavage. In this method, PCR primers encoding a protease cleavage site are incorporated at the 5′- or 3′-end of the ORF. We have routinely introduced either TEV or PreScission cleavage sites in one PCR primer. An example of the exact protein products produced by use of this strategy is shown in Fig. 5. For the PCR reaction, AccuPrime Taq DNA Polymerase Hi Fi was found to utilize long primers and to efficiently amplify various sequences in one run without optimization of conditions of the individual reactions. In primer design, 22 nucleotides were used for hybridization to the template for every primer, regardless of the template or of any additional, non-hybridizing part of the priming oligonucleotide that contains the Gateway recombination sequence and optional protease cleavage sites. We were able to amplify fragments from 700 to 3,500 base pairs routinely using primers designed without optimization, and using the same PCR condition. In sample sets of varying template length, the PCR extension step time calculation was based on the longest fragment in the set. On average, 98% of PCR reactions yielded the fragment of correct molecular weight in sufficient yield for the subsequent BP reaction. Optimizations of melting temperature and %GC content were never done, and appear to be unnecessary for the great majority of cases. A typical result for a set of 96 PCR reactions is shown in Fig. 6, showing successful PCR amplification for all 96 reactions. The amount of PCR product estimated by visual comparison with molecular weight markers is higher than 500 ng per lane when 5 µl of the PCR mixture is loaded, which is more than sufficient for downstream Gateway cloning.
FIG. 5. Proteins produced with or without a primer-encoded protease cleavage site. (A) Schematic representation of expressed fusion protein produced from the Expression vector, created with the traditional Gateway cloning method. The TEV protease (ENLYFQG) cleavage site is used as an example. Amino acids translated from the attB1 recombination site remain attached to the protein of interest after the protease cleavage. (B) Schematic representation of expressed fusion protein produced from the Expression vector, created with the modified Gateway cloning method. The TEV protease (ENLYFQG) cleavage site is used as an example. Amino acids translated from the attB1 recombination site are removed from the protein of interest after the protease cleavage. ORF, open reading frame.
FIG. 6. Typical polymerase chain reaction results for 96 separate high-throughput reactions to start a project. Reaction samples (5 µl) were loaded and run on a 1% E-Gel 96 agarose gel. Samples were prepared and loaded on the gel with the Biomek FX. Lane M, E-Gel 96 High Range markers (Invitrogen), 100 ng per lane.
For placement of the PCR product into a Donor vector we modified commercial Gateway cloning protocols to make the technology automation friendly. According to the Gateway manual, PCR products must be poly(ethylene glycol)-precipitated or gel-purified for use in a BP reaction. Gel or poly(ethylene glycol) purification procedures would have made automation difficult, or would have been very laborious if performed manually. In order to evaluate the efficiency of BP reactions with crude PCR products, BP reactions were set up with gel-purified, column-purified, and non-purified PCR products in parallel. Bacterial transformations with the BP reaction products using the non-purified PCR products gave nearly the same number of colonies as with gel-purified or column-purified PCR products, namely, more than 1,000 colonies in each case when 2 µl of BP reaction was transformed into 50 µl of competent cells, 450 µl of SOC medium was added, and 100 µl of the culture mixture was plated. This finding made it possible to eliminate the laborious manual gel purification step and to automate all Gateway steps with available equipment. Non-purified PCR products were used in all our HT cloning runs, with efficiency of the BP reaction step at 99–100% based on DNA sequencing results. Our choice of the Donor vector is always pDONR-Zeo (Invitrogen). This vector encodes resistance to the rarely used antibiotic Zeocin™ (Cayla, Toulouse, France), which allows the broad usage of Destination
vectors with all frequently used antibiotic resistances. Also, use of pDONR-Zeo in a BP reaction eliminates unwanted E. coli transformations from any non-Zeocin-resistant PCR template left in the unpurified PCR reaction mixture. BP and LR reactions are tolerant to DNA concentration variability. For example, DNA quantities taken into BP or LR reactions (PCR product or purified plasmids, respectively) were always estimated by visual observation of gel analyses. Since gel analysis was sufficient for estimation of DNA concentration, no sample was sacrificed for spectral analysis. Our ultimate goal is to create expression clones suitable for efficient functional protein production. Often a protein is expressed better in one expression system compared to another, so that vectors for different expression systems need to be tested. Vector features such as promoters, upstream regulatory elements, fusion tags, and protease cleavage sites are important for expression and purification.2–5 Taking these factors into consideration, a Gateway Destination vector library was constructed to contain the few commercially available vectors with our custom combinations of His6 tag, GST, other fusion proteins, and cleavage sites, for either bacterial or insect cell expression. Experience with all of the approximately three dozen vectors used so far is that the Gateway recombination reaction from Entry vector into Destination vector is invariably highly efficient. Bacterial transformation procedures on the robot are of two kinds. One procedure is a traditional transformation with plating, and it is performed when clonal selection is required. The other procedure is a liquid transformation (see Materials and Methods) in which freshly transformed cultures are transferred into a medium with antibiotics and then grown overnight. 
This process eliminates culture plating and colony picking steps and is useful for transformation of purified expression plasmids into expression cell lines for preparation of seed cultures for bacterial expressions. The aliquot of the seed culture was used to inoculate the expression medium for bacterial expression. In a test of eight different expression plasmids transformed into seven expression cell lines grown directly in liquid culture (see Materials and Methods, Liquid transformation) versus the same transformation having been first plated, the resulting expression of protein of interest was the same in all cases. We would contend that clonal isolation of a transformed cell may occasionally identify the best expression plasmid–host combination, but that, in general, good results can be obtained by skipping that isolation with a large savings in time and effort. In particular, we recognize that some toxic proteins may require special treatment wherein liquid transformation is not appropriate, but we nonetheless use it as standard practice.
Mutagenesis. In order to create ORF modifications in parallel, an automated mutagenesis protocol was developed. The method as described in Materials and Methods is for 12 mutants in parallel, and it scales up easily to 96 reactions in parallel. The QuikChange mutagenesis process is not as robust as the Gateway cloning method; therefore, closer attention was paid to automated pipetting techniques. Aspiration and dispensing speeds were set to 50% and 25%, respectively, for all pipetting steps. To mix reaction components gently but thoroughly, we created a custom mixing procedure (a Biomek FX template) in which the 96-well pipetting head moves in a spiral fashion. Reaction efficiency was higher once this pipetting template was introduced for the mixing step. As a result, QuikChange mutagenesis reactions set up with the robot had the same efficiencies as manual controls and matched the efficiency predicted in the QuikChange manual: greater than 90% of colonies contained the codon change of interest for single amino acid changes. The automated protocol has been successfully used to change up to seven amino acid segments in a single reaction.
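Footnote b of Table 1 notes that mutated sequences are generated automatically during mutagenic primer design. The authors' software (Visual Basic and MS Excel) is not shown in the paper, so the Python below is only a minimal sketch of that idea for a single codon change. The function names, the 15-nucleotide flank length, and the toy ORF are assumptions made for illustration, and no melting-temperature or GC-content checks are attempted.

```python
# Illustrative sketch only: hypothetical helper names, not the authors' software.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def mutagenic_primers(orf: str, codon_number: int, new_codon: str, flank: int = 15):
    """Build a complementary primer pair that replaces one codon.

    codon_number is 1-based. flank is the number of unchanged nucleotides
    kept on each side of the new codon (an assumed value, truncated at the
    ends of the ORF). Returns (mutated_orf, forward_primer, reverse_primer).
    """
    start = (codon_number - 1) * 3
    if len(new_codon) != 3 or start < 0 or start + 3 > len(orf):
        raise ValueError("codon position or replacement codon is invalid")
    mutated = orf[:start] + new_codon + orf[start + 3:]
    left = max(0, start - flank)
    right = min(len(mutated), start + 3 + flank)
    forward = mutated[left:right]
    return mutated, forward, reverse_complement(forward)


def design_batch(orf: str, changes):
    """Process a whole set of (codon_number, new_codon) changes in one call."""
    return [mutagenic_primers(orf, number, codon) for number, codon in changes]


if __name__ == "__main__":
    toy_orf = "ATG" + "GCT" * 12 + "TAA"  # made-up ORF, 14 codons
    for mutated, fwd, rev in design_batch(toy_orf, [(5, "AAA"), (9, "GAA")]):
        print(fwd, rev)
```

Because the mutated ORF is returned together with the primers, the sequence record for each mutant exists as soon as the primers are designed, which is the behavior the footnote describes.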
Time savings. The time saved with automated versus manual methods is outlined in Table 1 for individual steps and for the entire process of producing 192 expression clones. The automated procedures outlined here consume only about 9 h of “hands-on” time, while the overall process takes only 11 working days. Most of the overall process time is spent waiting for colony or culture growth and for reaction incubations. Time for physical manipulation of samples consists mainly of setting up the robot, so we find automation less efficient than simple manual cloning for fewer than 12 samples. As the number of samples increases, however, manual methods become burdensome and error-prone, while the effort for robotic operation remains essentially unchanged. It is noteworthy that in Table 1 nearly half of the time saved is from streamlined data manipulation procedures. When software designed for one-by-one sequence manipulation is used, the time spent manipulating and recording DNA sequences increases linearly with sample number. The custom molecular biology software described in Materials and Methods processes all plasmids to be made as a set (one to hundreds), beginning with
TABLE 1. ESTIMATION OF RESEARCHER’S TIME CONSUMPTION FOR CREATION OF 192 EXPRESSION CLONES AND SUBSEQUENT MUTAGENESIS OF 192 OPEN READING FRAMES (ORFS)

Researcher’s time (h) required for each step:

| Cloning procedure | Manual process with non-modified protocol | Automated process with our modified protocol |
| --- | --- | --- |
| Gateway ORF cloning | | |
| Gateway primer design and ordering | 12 (a) | 0.5 |
| PCR set up | 1 | 0.5 |
| Gel purification of PCR mixtures | 4 | Not used |
| BP reaction set up | 1 | 0.5 |
| Transformation of BP | 2 | 0.5 |
| Plating on Petri dishes | 2 | Not used |
| Plating on Qtrays with dividers | Not used | 0.75 |
| Entry clone plasmid isolation | 3.5 | 0.5 |
| LR reaction set up | 1 | 0.5 |
| Transformation of LR | 2 | 0.5 |
| Plating on Petri-agar plates | 4 | Not used |
| Plating on Qtrays with 48 sections | Not used | 0.75 |
| Expression clone plasmid isolation | 3.5 | 0.5 |
| Clone data management | 12 (a) | 0.5 |
| Site-directed mutagenesis | | |
| Mutagenic primer design and ordering | 8 (a) | 0.5 |
| Set of QuikChange mutagenic reactions | 1 | 0.5 |
| Transformation of mutagenic reactions | 2 | 0.5 |
| Plating on Petri-agar plates | 2 | Not used |
| Plating on Qtrays with 48 sections | Not used | 0.75 |
| Isolation of mutated plasmid DNA | 2 | 0.5 |
| Clone data management | 8 (a) | 0 (b) |
| Total researcher’s time used for ORF cloning and mutagenesis | 71 | 9 |

PCR, polymerase chain reaction. (a) With the help of commercial or web-based software. (b) Mutated sequences are generated automatically during the mutagenic primer design process.
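As a quick arithmetic check on Table 1, the per-step hours can be summed directly. The sketch below simply transcribes the table values; the variable and function names are illustrative, and "Not used" steps are represented as None and skipped. The manual column sums to 71 h and the automated column to 8.75 h, which the table reports rounded to 9 h.

```python
# Hours transcribed from Table 1; None stands for "Not used".
NOT_USED = None

# (step, manual hours, automated hours)
GATEWAY_CLONING = [
    ("Gateway primer design and ordering", 12, 0.5),
    ("PCR set up", 1, 0.5),
    ("Gel purification of PCR mixtures", 4, NOT_USED),
    ("BP reaction set up", 1, 0.5),
    ("Transformation of BP", 2, 0.5),
    ("Plating on Petri dishes", 2, NOT_USED),
    ("Plating on Qtrays with dividers", NOT_USED, 0.75),
    ("Entry clone plasmid isolation", 3.5, 0.5),
    ("LR reaction set up", 1, 0.5),
    ("Transformation of LR", 2, 0.5),
    ("Plating on Petri-agar plates", 4, NOT_USED),
    ("Plating on Qtrays with 48 sections", NOT_USED, 0.75),
    ("Expression clone plasmid isolation", 3.5, 0.5),
    ("Clone data management", 12, 0.5),
]

MUTAGENESIS = [
    ("Mutagenic primer design and ordering", 8, 0.5),
    ("Set of QuikChange mutagenic reactions", 1, 0.5),
    ("Transformation of mutagenic reactions", 2, 0.5),
    ("Plating on Petri-agar plates", 2, NOT_USED),
    ("Plating on Qtrays with 48 sections", NOT_USED, 0.75),
    ("Isolation of mutated plasmid DNA", 2, 0.5),
    ("Clone data management", 8, 0),
]


def total_hours(rows, column):
    """Sum one time column (1 = manual, 2 = automated), skipping unused steps."""
    return sum(row[column] for row in rows if row[column] is not None)


ALL_STEPS = GATEWAY_CLONING + MUTAGENESIS
manual_total = total_hours(ALL_STEPS, 1)
automated_total = total_hours(ALL_STEPS, 2)

if __name__ == "__main__":
    print(f"manual: {manual_total} h, automated: {automated_total} h")
    print(f"hands-on time saved: {manual_total - automated_total} h")
```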
TABLE 2. RHO-KINASE VARIANT (ROCK2) VARIANT TABLE AND EXPRESSION SCORING

| Variant number | N-terminal amino acid | C-terminal amino acid | Tag | Cleavage site added | Promoter | Total expression | Soluble expression |
| --- | --- | --- | --- | --- | --- | --- | --- |
| 1 | 1 | 1388 | N 6HIS | PreScission | Polyhedrin | N | N |
| 2 | 11 | 552 | N 6HIS | PreScission | Polyhedrin | Y | Y |
| 3 | 1 | 438 | N 6HIS | PreScission | Polyhedrin | N | N |
| 4 | 90 | 1057 | N 6HIS | PreScission | Polyhedrin | N | N |
| 5 | 1 | 1057 | N 6HIS | PreScission | Polyhedrin | N | N |
| 6 | 1 | 534 | N 6HIS | PreScission | Polyhedrin | Y | N |
| 7 | 90 | 359 | N 6HIS | PreScission | Polyhedrin | Y | Y |
| 8 | 80 | 369 | N 6HIS | PreScission | Polyhedrin | Y | N |
| 9 | 70 | 379 | N 6HIS | PreScission | Polyhedrin | N | N |
| 10 | 60 | 389 | N 6HIS | PreScission | Polyhedrin | Y | Y |
| 11 | 50 | 399 | N 6HIS | PreScission | Polyhedrin | Y | Y |
| 12 | 40 | 409 | N 6HIS | PreScission | Polyhedrin | Y | Y |
| 13 | 1 | 1388 | C 6HIS | PreScission | Polyhedrin | Y | N |
| 14 | 11 | 552 | C 6HIS | PreScission | Polyhedrin | Y | N |
| 15 | 1 | 438 | C 6HIS | PreScission | Polyhedrin | N | N |
| 16 | 90 | 1057 | C 6HIS | PreScission | Polyhedrin | Y | Y |
| 17 | 1 | 1057 | C 6HIS | PreScission | Polyhedrin | N | N |
| 18 | 1 | 534 | C 6HIS | PreScission | Polyhedrin | Y | N |
| 19 (a) | 90 | 359 | C 6HIS | PreScission | Polyhedrin | Was not tested | Was not tested |
| 20 | 80 | 369 | C 6HIS | PreScission | Polyhedrin | N | N |
| 21 | 70 | 379 | C 6HIS | PreScission | Polyhedrin | Y | N |
| 22 | 60 | 389 | C 6HIS | PreScission | Polyhedrin | Y | Y |
| 23 | 50 | 399 | C 6HIS | PreScission | Polyhedrin | N | N |
| 24 | 40 | 409 | C 6HIS | PreScission | Polyhedrin | Y | N |
| 25 | 1 | 1388 | N GST | PreScission | Polyhedrin | N | N |
| 26 | 11 | 552 | N GST | PreScission | Polyhedrin | Y | Y |
| 27 | 1 | 438 | N GST | PreScission | Polyhedrin | Y | Y |
| 28 | 90 | 1057 | N GST | PreScission | Polyhedrin | Y | Y |
| 29 | 1 | 1057 | N GST | PreScission | Polyhedrin | N | N |
| 30 | 1 | 534 | N GST | PreScission | Polyhedrin | N | N |
| 31 | 90 | 359 | N GST | PreScission | Polyhedrin | Y | Y |
| 32 | 80 | 369 | N GST | PreScission | Polyhedrin | Y | Y |
| 33 | 70 | 379 | N GST | PreScission | Polyhedrin | Y | Y |
| 34 | 60 | 389 | N GST | PreScission | Polyhedrin | Y | Y |
| 35 | 50 | 399 | N GST | PreScission | Polyhedrin | Y | Y |
| 36 | 40 | 409 | N GST | PreScission | Polyhedrin | Y | Y |
| 37 | 1 | 1388 | C GST | PreScission | Polyhedrin | N | N |
| 38 | 11 | 552 | C GST | PreScission | Polyhedrin | Y | Y |
| 39 | 1 | 438 | C GST | PreScission | Polyhedrin | Y | Y |
| 40 | 90 | 1057 | C GST | PreScission | Polyhedrin | N | N |
| 41 | 1 | 1057 | C GST | PreScission | Polyhedrin | N | N |
| 42 | 1 | 534 | C GST | PreScission | Polyhedrin | N | N |
| 43 | 90 | 359 | C GST | PreScission | Polyhedrin | N | N |
| 44 | 80 | 369 | C GST | PreScission | Polyhedrin | Y | N |
| 45 | 70 | 379 | C GST | PreScission | Polyhedrin | Y | Y |
| 46 | 60 | 389 | C GST | PreScission | Polyhedrin | Y | Y |
| 47 | 50 | 399 | C GST | PreScission | Polyhedrin | Y | N |
| 48 | 40 | 409 | C GST | PreScission | Polyhedrin | Y | N |
| 49 | 1 | 1388 | N 6HIS | PreScission | Basic | N | N |
| 50 | 11 | 552 | N 6HIS | PreScission | Basic | Y | Y |
| 51 | 1 | 438 | N 6HIS | PreScission | Basic | Y | N |
| 52 | 90 | 1057 | N 6HIS | PreScission | Basic | N | N |
| 53 | 1 | 1057 | N 6HIS | PreScission | Basic | N | N |
| 54 | 1 | 534 | N 6HIS | PreScission | Basic | Y | N |
| 55 | 90 | 359 | N 6HIS | PreScission | Basic | Y | N |
| 56 | 80 | 369 | N 6HIS | PreScission | Basic | N | N |
| 57 | 70 | 379 | N 6HIS | PreScission | Basic | Y | N |
| 58 | 60 | 389 | N 6HIS | PreScission | Basic | N | N |
| 59 | 50 | 399 | N 6HIS | PreScission | Basic | N | N |
| 60 | 40 | 409 | N 6HIS | PreScission | Basic | N | N |
| 61 | 1 | 1388 | C 6HIS | PreScission | Basic | N | N |
| 62 | 11 | 552 | C 6HIS | PreScission | Basic | Y | N |
| 63 | 1 | 438 | C 6HIS | PreScission | Basic | Y | N |
| 64 | 90 | 1057 | C 6HIS | PreScission | Basic | N | N |
| 65 | 1 | 1057 | C 6HIS | PreScission | Basic | Y | Y |
| 66 | 1 | 534 | C 6HIS | PreScission | Basic | Y | N |
| 67 | 90 | 359 | C 6HIS | PreScission | Basic | Y | N |
| 68 | 80 | 369 | C 6HIS | PreScission | Basic | Y | N |
| 69 | 70 | 379 | C 6HIS | PreScission | Basic | Y | N |
| 70 | 60 | 389 | C 6HIS | PreScission | Basic | Y | N |
| 71 | 50 | 399 | C 6HIS | PreScission | Basic | N | N |
| 72 | 40 | 409 | C 6HIS | PreScission | Basic | Y | N |
| 73 | 1 | 1388 | N 6HIS | None | Polyhedrin | N | N |
| 74 | 11 | 552 | N 6HIS | None | Polyhedrin | Y | Y |
| 75 | 1 | 438 | N 6HIS | None | Polyhedrin | Y | Y |
| 76 | 90 | 1057 | N 6HIS | None | Polyhedrin | N | N |
| 77 | 1 | 1057 | N 6HIS | None | Polyhedrin | N | N |
| 78 | 1 | 534 | N 6HIS | None | Polyhedrin | Y | N |
| 79 | 90 | 359 | N 6HIS | None | Polyhedrin | Y | N |
| 80 | 80 | 369 | N 6HIS | None | Polyhedrin | N | N |
| 81 | 70 | 379 | N 6HIS | None | Polyhedrin | N | N |
| 82 | 60 | 389 | N 6HIS | None | Polyhedrin | Y | Y |
| 83 | 50 | 399 | N 6HIS | None | Polyhedrin | N | N |
| 84 | 40 | 409 | N 6HIS | None | Polyhedrin | Y | Y |
| 85 | 1 | 1388 | C 6HIS | None | Polyhedrin | Y | N |
| 86 | 11 | 552 | C 6HIS | None | Polyhedrin | Y | N |
| 87 | 1 | 438 | C 6HIS | None | Polyhedrin | Y | Y |
| 88 | 90 | 1057 | C 6HIS | None | Polyhedrin | Y | N |
| 89 | 1 | 1057 | C 6HIS | None | Polyhedrin | N | N |
| 90 | 1 | 534 | C 6HIS | None | Polyhedrin | Y | Y |
| 91 | 90 | 359 | C 6HIS | None | Polyhedrin | Y | Y |
| 92 | 80 | 369 | C 6HIS | None | Polyhedrin | Y | N |
| 93 (a) | 70 | 379 | C 6HIS | None | Polyhedrin | Was not tested | Was not tested |
| 94 | 60 | 389 | C 6HIS | None | Polyhedrin | Y | Y |
| 95 | 50 | 399 | C 6HIS | None | Polyhedrin | Y | N |
| 96 | 40 | 409 | C 6HIS | None | Polyhedrin | Y | Y |

Most clones focused around the kinase catalytic domain at the N-terminus of the protein. Results of expression were labeled as follows: Y, the band of correct molecular weight was detected on western blot; N, the band of correct molecular weight was not detected by western blot. 6HIS, hexahistidine; GST, glutathione S-transferase. (a) Variants 19 and 93 were not used in expression testing because of cloning failure.
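The per-variant scores in Table 2 can be tallied to reproduce the summary counts reported in Table 3. The sketch below is illustrative rather than part of the authors' software: each variant is encoded as a two-letter string (total expression, then soluble expression), with "--" for the two variants that were not tested, and the names SCORES and tally are assumptions. The tally gives 94 variants tested, 60 with expression detected, 31 soluble, and 34 with no expression detected.

```python
# Table 2 scores, one two-letter code per variant in numerical order:
# first letter = total expression, second letter = soluble expression,
# "--" = variant was not tested (cloning failure).
SCORES = (
    "NN YY NN NN NN YN YY YN NN YY YY YY "  # variants 1-12
    "YN YN NN YY NN YN -- NN YN YY NN YN "  # variants 13-24
    "NN YY YY YY NN NN YY YY YY YY YY YY "  # variants 25-36
    "NN YY YY NN NN NN NN YN YY YY YN YN "  # variants 37-48
    "NN YY YN NN NN YN YN NN YN NN NN NN "  # variants 49-60
    "NN YN YN NN YY YN YN YN YN YN NN YN "  # variants 61-72
    "NN YY YY NN NN YN YN NN NN YY NN YY "  # variants 73-84
    "YN YN YY YN NN YY YY YN -- YY YN YY"   # variants 85-96
).split()


def tally(scores):
    """Count tested, expressed, soluble, and non-expressing variants."""
    tested = [score for score in scores if score != "--"]
    expressed = sum(score[0] == "Y" for score in tested)
    soluble = sum(score[1] == "Y" for score in tested)
    return {
        "tested": len(tested),
        "expressed": expressed,
        "soluble": soluble,
        "not_detected": len(tested) - expressed,
    }


summary = tally(SCORES)

if __name__ == "__main__":
    for name, count in summary.items():
        print(f"{name}: {count}")
```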
PCR primer design (see Fig. 4) and carrying out subsequent sequence manipulations for all desired products. Details of the software are beyond the scope of this paper, but in Materials and Methods we present the principles used in creating the software, since we believe that replication of this software is easy given knowledge of Visual Basic, MS Excel workbooks, and the cloning methods.

Material costs are comparable for the automated, modified protocols given here and traditional manual methods, on a per-clone basis. In the automated protocols, consumption of disposable labware is identical to that in manual operations. The modified protocols given here reduce costs through a 50% volume reduction of BP and LR Clonase, omission of PCR product purification, and the use of QTrays, which hold 48 samples, in place of traditional Petri dishes. These savings more than offset a modest increase in solution volumes used in the protocols presented.

ROCK2 case study

An example of successful implementation of the HT methods outlined above is the cloning and expression testing of 96 expression plasmids for the kinase ROCK2. ROCK2 is a multi-domain protein, one domain of which is a serine/threonine kinase.13 The kinase activity has been suggested to be a suitable target for inhibition in the treatment of stroke and possibly other conditions. A majority of mammalian kinase crystal structures have been obtained with truncated forms (catalytic domains) of the natural proteins,14 so the cloning plan for the ROCK2 project emphasized expression of the kinase catalytic domain, ultimately for structural study with bound inhibitors. The catalytic domain has been expressed in insect cells, but no crystal structure has been reported yet.15 The cloning strategy for 96 variants here also included the potential for obtaining longer forms of the protein. As listed in Table 2, the design of variants included the following features:

1. 12 length variants of the ORF; nine of them consisted mainly of the kinase domain, plus three variants including additional portions of the protein up to the full natural length of 1,388 amino acids.
2. Either Polyhedrin or Basic promoters were encoded in the Destination vectors.
3. Either a GST tag for solubility or a His6 tag for facile purification was used as either the N- or C-terminal tag. The tags were encoded in the Destination vector sequence.
4. Two types of primer extensions were used in PCR, one encoding only the Gateway recombination sequence (24 clones) and another adding the PreScission cleavage site (72 clones) to allow cleavage of residual amino acids of the Gateway recombination sequence.

The entire HT cloning process was completed for 94 of the 96 clones shown in Table 2 on the first attempt. The expression result for each variant is also shown in Table 2, and summarized in Table 3. Variants 2, 7, 10, and 74 were scaled up to 20-L cultures. Variants 2 and 74 were purified, and a crystallization screen was set up for these variants. The yield of purified protein was about 4 mg/L of culture. Crystallization screening yielded crystals of variants 2 and 74 that diffracted to 10 Å. Efforts to obtain crystals with better diffraction are underway.

TABLE 3. RHO-KINASE VARIANT (ROCK2) EXPRESSION RESULTS SUMMARY FOR 94 CLONES

| Expression results | Number of clones |
| --- | --- |
| Expression tested | 94 |
| Soluble protein was expressed | 31 |
| Expression was detected by western blot (His tag or GST antibodies) | 60 |
| No expression was detected by western blot (His tag or GST antibodies) | 34 |

GST, glutathione S-transferase; His, histidine.

Analysis of the expression screen results of ROCK2 variants, as shown in Tables 2–5, allows several conclusions. For a given length variant there was no significant difference in the overall expression level of protein between the Polyhedrin- and the Basic promoter-driven constructs (61% and 57% of the clones, respectively, as detected by western blotting; Table 4). However, about three times as many variants with the Polyhedrin promoter produced soluble protein as variants with the Basic promoter (30% vs. 9%; Table 4). This result indicates that the nature of the promoter can have an impact on the efficiency of soluble protein expression; therefore, inclusion of different promoters in an HT cloning run can help to achieve successful soluble protein production.

TABLE 4. IMPACT OF THE PROMOTER ON RHO-KINASE VARIANT (ROCK2) EXPRESSION AND SOLUBILITY

| | Polyhedrin promoter | Basic promoter |
| --- | --- | --- |
| Expressed (out of total number of variants analyzed) | 61% | 57% |
| Soluble (out of total number of variants analyzed) | 30% | 9% |

Expression from constructs of the same length variants and tags is compared (variants 1–24, Polyhedrin promoter, vs. variants 49–72, Basic promoter; Table 2).

To evaluate the influence of the fusion tags on expression and solubility, the expression results of clones with the Polyhedrin promoter were analyzed, as shown in Table 5. The N-terminal GST tag was the most efficient for soluble protein expression among the tags tested in the ROCK2 expression screen. As it is usually difficult to predict which tag will be the most efficient for the expression of a specific target, it is essential to incorporate various tags into HT cloning runs of protein variants. Introducing the PreScission cleavage site by PCR, so that it becomes part of the expressed ORF, did not have a noticeable impact on expression or solubility, but it made it possible to remove not only the fusion tag but also the residual Gateway amino acids from the expressed protein variant.

TABLE 5. IMPACT OF THE FUSION TAG ON RHO-KINASE VARIANT (ROCK2) EXPRESSION AND SOLUBILITY

| | N-6His | C-6His | N-GST | C-GST |
| --- | --- | --- | --- | --- |
| Expressed (out of total number of variants analyzed) | 55% | 64% | 73% | 55% |
| Soluble (out of total number of variants analyzed) | 36% | 18% | 73% | 36% |

Expression and solubility of clones with the Polyhedrin promoter and PreScission cleavage site were compared (variants 1–12 with N-terminal hexahistidine [N-6His] tag, variants 13–24 with C-terminal 6His tag, variants 25–36 with N-terminal glutathione S-transferase [N-GST] tag, variants 37–48 with C-terminal GST tag; Table 2).

An example of a western blot is shown in Fig. 7 to demonstrate the difference in expression level and amount of soluble protein generated from different ORF length variants with the same promoter and expression tag. Some length variants did not produce soluble protein (Table 2), demonstrating the importance of expression testing of multiple clones of various lengths, given the unpredictable outcomes.

FIG. 7. Analysis of Rho-kinase variant (ROCK2) expression. Fluorescently labeled antibodies specific for glutathione S-transferase were used for the immunoblot shown. Variant labels here are described in more detail in Table 2. Std, standard.

The methods outlined here can be applied not only to cloning of multiple variants of the same protein, but also to creation of expression clone collections of gene families, when ORFs of many genes are cloned in parallel. The catalytic domains of 500 genes of the human kinase family (average length 900 nucleotides) have been cloned with these methods into a single baculovirus expression vector. In addition, the full-length versions of the kinase expression clones were made with a 95% success rate from start to finish, that is, from PCR amplification of the fragment of interest from the purchased clone to the purified expression plasmid. Results of expression, purification, and assay development of proteins generated from these clones will be published elsewhere.

Conclusions

With the results achieved with our HT molecular biology procedures, we have demonstrated that cloning of multiple protein variants as well as gene families can be performed in a parallel, automated fashion with a high rate of success. Introduction of automated methods and data management software drastically accelerated the cloning process and greatly reduced operator error.

Acknowledgments
The authors wish to thank Chris Kemp (Kemp Biotechnologies) for providing protein expression services and Yuan Liu (Merck & Co., Inc.) for the valuable help in bioinformatics.

References

1. Munshi S, Kornienko M, Hall DL, Reid JC, Waxman L, Stirdivant SM, Darke PL, Kuo LC: Crystal structure of the Apo, unactivated insulin-like growth factor-1 receptor kinase. Implication for inhibitor specificity. J Biol Chem 2002;277:38797–38802.
2. Baneyx F: Recombinant protein expression in Escherichia coli. Curr Opin Biotechnol 1999;10:411–421.
3. Leiting B, Pryor KD: High-level of soluble protein in Escherichia coli using a His6-tag and maltose-binding-protein double-affinity fusion system. Protein Expr Purif 1997;10:309–319.
4. Dyson MR, Shadbolt SP, Vincent KJ, Perera RL, McCafferty J: Production of soluble mammalian proteins in Escherichia coli: identification of protein features that correlate with successful expression. BMC Biotechnol 2004;4:32.
5. Sorensen HP, Mortensen KK: Soluble expression of recombinant proteins in the cytoplasm of Escherichia coli. BMC Microbial Cell Factories 2005;4:1.
6. Marsischky G, LaBaer J: Many paths to many clones: a comparative look at high-throughput cloning methods. Genome Res 2004;14:2020–2028.
7. Service R: Structural genomics, round 2. Science 2005;307:1554–1558.
8. Rual JF, Hill DE, Vidal M: ORFeome projects: gateway between genomics and omics. Curr Opin Chem Biol 2004;8:20–25.
9. Braun P, Hu Y, Shen B, Halleck A, Koundinya M, Harlow E, LaBaer J: Proteome-scale purification of human proteins from bacteria. Proc Natl Acad Sci U S A 2002;99:2654–2659.
10. Brasch MA, Hartley JL, Vidal M: ORFeome cloning and systems biology: standardized mass production of the parts from the parts-list. Genome Res 2004;14:2001–2009.
11. Longnecker KL, Garrad SM, Sheffield PJ, Derewenda ZS: Protein crystallization by rational mutagenesis of surface residues: Lys to Ala mutations promote crystallization of RhoGDI. Acta Crystallogr 2001;D57:679–688.
12. Hartley JL, Temple GF, Brasch MA: DNA cloning using in vitro site specific recombination. Genome Res 2000;10:1788–1795.
13. Amano M, Fukata Y, Kaibuchi K: Regulation and functions of Rho-associated kinase. Exp Cell Res 2000;261:44–51.
14. Protein Data Bank. Available at: http://www.rcsb.org/pdb/.
15. Amano M, Fukata Y, Shimokawa H, Kaibuchi K: Purification and in vitro activity of Rho-associated kinase. Methods Enzymol 2000;325:149–155.
Address reprint requests to: Maria Kornienko Department of Structural Biology Merck Research Laboratories (WP26-344) West Point, PA 19486 E-mail:
[email protected]