Recombinant Proteins in Escherichia coli: Optimizing ...

17 downloads 0 Views 497KB Size Report
Department of Chemistry, Faculty of Science, University of Zabol, Zabol 98615-538, ... Zabol University of Medical Sciences, P.O. Box: 61615-585 Zabol, Iran.
American-Eurasian J. Agric. & Environ. Sci., 15 (11): 2149-2159, 2015 ISSN 1818-6769 © IDOSI Publications, 2015 DOI: 10.5829/idosi.aejaes.2015.15.11.96100

Recombinant Proteins in Escherichia coli: Optimizing Production Strategies 1

Leila Soufi, 2Seyedeh Mahsan Hoseini-Alfatemi, 3Mehdi Sharifi-Rad, 4Marcello Iriti, 5 Marzieh Sharifi-Rad, 6Masoumeh Hoseini and 7,8Javad Sharifi-Rad

Department of of Biology, Guilan University of Medical Sciences, Rasht, Iran 2 Pediatric Infections Research Center, Mofid Children Hospital, Shahid Beheshti University of Medical Sciences, Tehran, Iran 3 Zabol University of Medical Sciences, Zabol, Iran 4 Department of Agricultural and Environmental Sciences, Milan State University, Via G. Celoria 2 - 20133 Milan, Italy 5 Department of Chemistry, Faculty of Science, University of Zabol, Zabol 98615-538, Iran 6 Department of Food Science and Technology, Faculty of Agriculture, Shahr-e-Qods Branch, Islamic Azad University, Tehran, Iran 7 Zabol Medicinal Plants Research Center, Zabol University of Medical Sciences, P.O. Box: 61615-585 Zabol, Iran 8 Department of Pharmacognosy, Faculty of Pharmacy, Zabol University of Medical Sciences, P.O. Box: 61615-585 Zabol, Iran 1

Abstract: The recombinant proteins are under special attention due to their medical, pharmaceutical and industrial uses. Bacteria are simple and cost effective hosts for producing recombinant proteins. The Gram-negative bacterium Escherichia coli is a high yielding recombinant protein production system, due to its specific characteristics such as the rapid and high-density growth, need of poor medium and known genome. In this review, we examine the factors involved in the production of recombinant proteins by E. coli, focusing on the optimization of strategies responsible for the improvement of yield. Key words: Escherichia coli

Recombinant proteins

INTRODUCTION Today, demands for bio-drugs are increasing and their production is possible in different systems such as bacteria and mammalian cells [1]. The expression system consists of three basic elements: a gene coding for the target protein, an expression vector to harbor the gene and a suitable host organism. Utilization of mammalian cells is expensive, thus using of microorganisms such as bacteria is reliable for recombinant protein production in large scale [2, 3]. Escherichia coli is a Gram negative bacterium with rapid and high density growth, cheap media, fast gene expression, known genome, easy genetic manipulation and bearing a variety of cloning vectors. It is a suitable tool for fast and high yielding recombinant protein production [4]. It has been reported that a number

Gram negative bacteria

Genome

of factors influence the distribution of proteins in E.coli clones: the gene copy number, regulation or inducibility of a gene, the stability of enzyme and/or a combination of any of these factors will influence the distribution of proteins in a host [5]. A major bottle neck in structural biology is that many mammalian proteins are expressed very poorly or are not expressed in soluble forms. These difficulties may arise due to the improper folding of mammalian proteins in E. coli cells, where they are quickly digested by proteases or accumulated as inclusion bodies. A number of expression systems have been developed to overcome these problem, for example, by coexpressing mammalian proteins with molecular chaperones, expressing them as fusions with maltosebinding protein, glutathione S-transferases, or expressing them at low temperatures using cold shock vectors [6].

Corresponding Author: Javad Sharifi-Rad, Zabol Medicinal Plants Research Center, Zabol University of Medical Sciences, P.O. Box: 61615-585 Zabol, Iran.

2149

Am-Euras. J. Agric. & Environ. Sci., 15 (11): 2149-2159, 2015

E. coli is disabled in the post-translational changes (such as phosphorylation and glycosylation) as well as other bacteria. These modifications, which only occur in eukaryotic cells, are accomplished with the eukaryotic glycosylation enzymes that entry into the bacteria by genetic engineering [7]. Recently, a novel strain of the E. coli with the name "SHaffle" is made based on the trxB gor (the inhibitor of SMG96 strain) that allows the bacteria to form disulphide bonds which are typical a post-translational modifications of eukaryotic proteins [8]. Due to time-saving and reducing of economic costs are imperative, we have reviewed the strategies of optimizing the recombinant protein production. Production and Purification of Recombinant Proteins: Usually, the proteins with stable folding and globular domains are the main goals of biological studies. More than 50% of the eubacterial proteins and 10% of eukaryotic proteins are produced in E. coli. Generally, proteins with molecular weight below 60 kD have more successful expressed in soluble form in E. coli. Proteins that are not expressed in soluble form may not be modified or folded right, or may precipitate in E. coli through inclusion body formation. In brief, the production of a recombinant protein is based on a bacterial colony with recombinant plasmid cultivated in proper media such as enriched LB (lactose broth) or TB (terrific broth), containing the appropriate concentrations of antibiotics (such as 100 g/ml ampicillin). Since the optimisation of the recombinant protein expression in the bacterial system is performed by monitoring the cell density [9], when OD600 is 0.6 we can induct the colonies by proper amounts of inducer such as IPTG (isopropyl- -D-1thiogalactopyranoside), thermal shock, or pH changes in the proper time and temperature. To find the optimum concentration of inducer, optimum time of induction and optimum temperature of induction, after the cultivation of bacteria and reaching the appropriate optical absorption, different concentrations of inducer (0.1, 0.25, 0.5, 0.75, 1.0 and 1.5 mM) will be assessed and the optimum concentration of inducer will be obtained first. Then, in the optimum concentration of inducer, expression time (0, 1, 2, 3, 4 and 5 hours) will be analyzed and the best induction time will be obtained. Finally, in the optimal inducer concentration and optimal time, different temperatures (15, 20, 23, 28, 30 and 37 °C) will be checked and the optimal temperature will be determined. Then, bacterial cells will be collected by centrifugation at 5000 RPM for 10 minutes [4, 10] and the protein will be

extracted by using some techniques, such as purifying with nickel-nitrilotriacetic acid column (Ni-NTA), from the cell deposits. The concentration of this protein can be determined with the Bradford method. Purified protein will be compared with its marker using the SDS-PAGE gel electrophoresis. Recombinant expression systems can be used to modify the target protein, i.e. to introduce mutations, deletions or to genetically fuse the target protein with engineered “tags”. Such tags often promote protein expression and solubility. Typically, they also mediate high-affinity binding to standardized affinity matrices and, therefore, allow for highly efficient and streamlined purification schemes. Such tags can be removed from the target protein during the purification process. This step is often accomplished by site-specific proteases recognizing a unique, short recognition motif that has been artificially introduced between the tag and the target protein. Proteases bdSENP1, bdNEDP1, ssNEDP1, scAtg4 and xlUsp2 can efficiently cleave appropriately tagged target proteins within a wide range of temperatures [11]. Optimization of Factors Influencing the Recombinant Protein Expression: Here we report optimal conditions to produce a protein in solution form, because the protein production into inclusion body (insoluble form) entails costs for turning it into the soluble form. Selection of Proper Host Strain: Selection of the host strain is important in recombinant protein production. The appropriate host bacteria have low levels of proteases, can maintain plasmid retention and provide the genetic elements related to expression system. The example of the strain that is very suitable for hosting is E. coli BL21, because it is able to grow in basic media, it is not pathogen (of course can survive in the host tissues and cause diseases) and also lacks in OMPT and Lon proteases that may decompose the target proteins. In addition, the derivatives of the BL21 strain like strains without recA (BLR strains) which contain repeated sequences to sustain the goal plasmid, the lacY mutants which are able to generate proper value of the protein expression and strains which prevent from inclusion body formation such as C41 and C43 can be used. Choosing the appropriate strain can also increase the mRNA stability, for example a strain that contains a mutation in the RNAase E gene (rne131 mutant) [12]. For eukaryotic proteins, it is often important to use BL21(DE3) derivatives carrying additional tRNAs to overcome the

2150

Am-Euras. J. Agric. & Environ. Sci., 15 (11): 2149-2159, 2015 Table 1: The main bacterial groups that produce recombinant proteins. Most suitable hosts for production of each type of protein is given [14]. *GRAS: generally recognized as safe Host Proteobacteria Caulobacteria

Main bacterial species

Phototrophic bacteria High production of membrane proteins Cold adapted bacteria Improved protein folding

Rodhobacter Pseudoalteromonas haloplanktis Shewanella sp. Strain Ac10

Pseudomonads

Pseudomonas fluorescens

Halophilic bacteria Actinobacteria Streptomycetes

Firmicutes

Main features

Easy purification of secreted RSaA fusions Caulobacter crescentus

Efficient secretion

Pseudomonas putida Pseudomonas aeruginosa Halomonas elongate Chromohalobacter salexigens

Solubility favored

Streptomyces lividans Streptomyces griseus Nocardia lactamdurans Mycobacterium smegmatis

Efficient secretion

Nocardia Mycobacteria

Efficient secretion Posttranslational modifications

Coryneform bacteria

High-level production and secretion; GRAS*

Corynebacterium glutamicum Corynebacterium ammoniagenes Brevibacterium lactofermentum

Bacilli

High-level production and secretion

Lactic acid bacteria

Secretion; GRAS

Bacillus subtilis Bacillus brevis Bacillus megaterium Bacillus licheniformis Bacillus amyloliquefaciens Lactococcus lactis Lactobacillus plantarum Lactobacillus casei Lactobacillus reuteri Lactobacillus gasseri

effects of codon bias. Historically, ampicillin has been the most commonly used antibiotic-selection marker, but it has been replaced by carbenicillin, which is more stable. Vectors encoding resistance to kanamycin or chloramphenicol are now widely used. E. coli k-12 is one of the strains which is also suitable for hosting [13]. The main bacterial groups producing recombinant protein are reported in Table 1. Choosing Appropriate Vector and Promoter: An efficient prokaryotic expression vector should contain a strong and tightly regulated promoter, a SD site that is positioned approximately 9 bp to 5' end of translation initiation codon and is complementary to the 3' end of 16SrRNA and an efficient transcription terminator positioned at 3' end of gene (Figure 1). In addition, the vectors require an origin of replication, a selection marker and a gene that facilitates the promoter activity regulation [7]. Most powerful vectors for recombinant protein

Case proteins Hematopoietic necrosis virus capsid proteins -1,4-glycanase Sphaeroides membrane proteins 3H6 Fab Human nerve growth factor -Lactamase, peptidases, glucosidase Human granulocyte colony-stimulating factor Single chain Fv fragments Penicillin G acylase -Lactamase Nucleoside diphosphate kinase M. tuberculosis antigens Trypsin Lysine-6-aminotransferase Hsp65-hIL-2 fusion protein Mycobacterial proteins Protein-glutaminase Pro-transglutaminase Cellulases -Galactosidase Disulfide isomerase Antibodies Subtilisin Amylases Fibronectin-binding protein A, internalin A, GroEL -Galactosidase VP2-VP3 fusion protein of infectious pancreatic necrosis virus Pediocin PA-1 CC chemokines

production and gene expression in bacteria are PET28a and PET22b. In these systems, the target gene can put under the control of T7 (a strong promoter) which, in host genome, the transcription of the target gene under this promoter will be done by the RNApolymerase of bacteriophage T7. Because of the independence of gene transcription to the host cell, production of mRNA and protein in this system does not take the cellular protein synthesis factors and the protein production content is higher than systems which used host cell transcription polymerase. Expression of recombinant plasmids need to strong promoters to provide high levels of gene expression [12]. Promoter’s sequence is an important factor in gene expression. For the promoter activity control in the recombinant protein production, chemical inducers like IPTG can be used. However, due to the toxicity and lack of economic saving, some of these chemicals are not recommended in recombinant protein production for the pharmaceutical consumption.

2151

Am-Euras. J. Agric. & Environ. Sci., 15 (11): 2149-2159, 2015

Fig. 1: Schematic picture of the sequence elements of a prokaryotic expression vector. Shown as an example is the hybrid tacpromoter (P) consisting of the -35 and -10 sequences, which are separated by a 17-base spacer. The arrow indicates the direction of transcription. The RBS consists of the SD sequence followed by an A+T-rich translational spacer that has an optimal length of approximately 8 bases. The SD sequence interacts with the 3' end of the 16S rRNA during translation initiation, as shown. The three start codons are shown, along with the frequency of their usage in E. coli. Among the three stop codons, UAA followed by U is the most efficient translational termination sequence in E. coli. The repressor is encoded by a regulatory gene (R), which may be present on the vector itself or may be integrated in the host chromosome and it modulates the activity of the promoter. The transcription terminator (TT) serves to stabilize the mRNA and the vector. In addition, an antibiotic resistance gene, for e.g. tetracycline, facilitates phenotypic selection of the vector and the origin of replication (Ori) determines the vector copy number [7] Therefore, promoters that are manageable by changing the temperature or pH which derive from bacteriophage lambda such as PL and PR can be used for many expression vector designing due to their high power and easy control. These promoter’s activity can be set by a mutation in CI repressor (CI857) gene and changing the temperature. This repressor is active in low temperatures (30°C) and can prevent gene expression under the control of the promoter PL. With the temperature increment to 42°C, the repressor’s shape changes and can not be connect to the operator. In this way, by changing the temperature of 30 to 42°C, shape and behavior of repressor can be modified and it loses its deterrence power, therefore gene expression under control of promoters PL and PR will be increased. A representing vector with PL promoter is able to recombinant protein expression under thermal induction; in addition, with an indicator sequence, such as B-Pel, it is able to recombinant protein expression in preplasmic space of E.coli. This promoter is known by bacterial RNApolymerase and can be used for protein expression and thus does not need to use specific strains of E. coli as the host. Of course, there are also some promoters that can be aroused by reducing the temperature [15]. With the composition of effective factors on recombinant protein production, stimulation in several variants brings one new variant with all of these traits. For example: in bla protein production, if the promoter can

be converted from wildtype to ML2-5 with a mutation or if xyLS gene (which codes promoter activation factor) can be converted from wild type to the H39, levels of recombinant protein production will be increased. If we build a cassette which contains both of these mutated genes, levelof recombinant protein expression will be increased [16]. Optimizing the Genetic Code: Redesigning the desired gene using the prokaryotic codon preference system may increase protein expression in the host bacteria. Using gene optimizer softwares can modify the GC content, the genetic code quality, DNA motifs like the ribosome entry site and the possibility of the formation of mRNA secondary structures (which may prevent the expression of protein) and etc. Finally, system will give us optimal sequences automatically. Comparison between wild and optimized sequences is shown up to 70 % accretion in the protein production in optimized sequence [15, 17]. The genetic code should be designed with high content of adenine and thymine in the RBS (Ribosomal binding site) for its efficiency increment. First codon after the start codon at the translation beginning site is very important and it is better to have high amounts of adenine. Transcription terminator, which put at the end of the target gene, can be optimized for plasmid stability increment with prolongation of end codon and put together a few end codons (which is UAA in E. coli) [12].

2152

Am-Euras. J. Agric. & Environ. Sci., 15 (11): 2149-2159, 2015

As the AT content is higher, the target protein will be secreted more easily into the media, so, it will be easier purificated [18]. The most common methods of genetic code optimization are the “one amino acid-one codon” and “codon optimization”. The first method can be in 2 forms: the desired gene is made using abundant codon for each amino acid in the E. coli genome, or the desired gene is made using the codon of each amino acid which has been used in the highest expression genes of E. coli. The Optimizer is a software which can be used for this purpose. In the second method, a codon is considered randomly among a set of codons for each amino acid, using some softwares such as “GeMS” based on the weight of each codon. Research results show that the second method leads to more recombinant protein production and thus will be cost effective [19]. mRNA Stability: Recombinant protein production depends on mRNA stability and its translation. mRNA half-life at 37°C in E. coli I s from a few seconds up to 20 minutes. mRNA expression time is directly related to its stability. Because some RNAases such as RNAase II, PNPase and RNAase E break down mRNA, protection from RNAases depends on RNA folding, protection by ribosomes and mRNA stability adjustment by polyadenilation. Polyadenilation by PAP I and PAP II polymerases facilitates the degradation with RNAase II and PNPase. The strains which contain a mutation in the RNAase E gene have higher mRNA stability [12]. Expression of Multiple Proteins in a Vector (CoExpression): Co-expression is bacteria transformation with one or more plasmid as bacteria containing more than one protein encoding genes. Co-expression in members of a stable multi-protein complex or a protein with its chaperones can increase the amount of target protein with correct folding in many cases. In addition, in this method, any protein present protects other proteins from proteolytic degradation [20]. A plasmid containing the desired gene or a polycistronic plasmid which contains several genes can be used, but it is not recommended to use more than two plasmids. Plasmids can be marked with an antibiotic resistance gene such as spectinomycin, kanamycin, chloramphenicol or ampicillin to isolate them from each other [12]. Ampicillin was used routinely, but nowadays carbencillin is used commonly because it is more stable. Vectors containing kanamycin or chloramphenicol resistance genes are also widely used. Plasmids should not have an identical origin of replication because when they are transformed into the bacteria

concomitantly, one of the plasmids stops through the RNA antisense mechanism. So plasmids whose origin of replication or antibiotic resistance marker are identical should not be used in one bacteria. Duet vectors (Novagen) with 2 cloning sites, 5 origins of replication and 4 antibiotic resistance markers makes possible the simultaneous expression of more than 8 proteins. Combination of Duet vectors with any other industrial plasmids facilitate the usage of substances such as glutathione-S-transferase (GST) or maltose binding protein (MBP), which increase the protein solubility and prevent of the protein dimer formation [20]. One of the molecular chaperone in the E. coli periplasm is the 17 kDa Skp protein which forms a trimer with a central cavity. This cavity allows Skp to engulf native polypeptide substrates and prevents their subsequent aggregation. Co-expression of Skp with a poorly soluble single chain antibody resulted in its secretion into the E. coli periplasm as well as improved solubility and phage display of the antibody fragment and diminished the toxicity of the antibody for the host cells [21]. The Suitable Starter Media: In most laboratory, starter culture is grown one day in a rich media such as LB and incubated at 37°C. This culture will saturate tomorrow morning, thus causing plasmid instability and silencing and, finally, reducing the target protein production [20]. To avoid this problem, we use selected enriched media and incubate the culture at 37°C for a few hours or one day at 20-25°C until the optical density will be 3-5 in 600 nm. To get more cell density in a culture, more nutrients, especially carbon resources are required. Enough cell density can produce more target protein. Glucose as the carbon resource and NH 4Cl as the nitrogen source for bacteria are essential [4]. Glucose can be replaced with glycerol to prevent the media acidification due to glucose acidic products such as acetic acid [22]. However, the basic media with glucose show a better performance in the recombinant protein production than the glycerol-based media [13]. Cultures should be allowed to grow up to exponential growth phase, until the OD 600 = 0/6; at this stage, stimulation by the inducers can be done. Appropriate pH and flask shaking is effective to obtain a good culture [23]. In most cases, LB is used, even if the TB (Terrific Broth) is better because this medium is richer than LB, thus causing the bacteria to reach the desired optical density faster and to increase the recombinant protein production. In addition, stronger buffering feature of TB avoids the pH changes of medium due to bacterial products. Tryptone and kanamycin concentrations in the

2153

Am-Euras. J. Agric. & Environ. Sci., 15 (11): 2149-2159, 2015 Table 2: The inducer types according to types of promoter used in the recombinant protein production in E. coli [7] Promoter (source)

Regulation

Induction

lac (E. coli)

lacI, lacIq lacI(Ts)a, lacIq(Ts)a lacI(Ts)b

IPTG Thermal Thermal Trp starvation, indole acrylic acid IPTG, lactosec Phosphate starvation Nalidixic acid L-Arabinose Osmolarity Glucose starvation Tetracycline pH Anaerobic conditions, nitrate ion IPTG Thermal IPTG Thermal IPTG IPTG

trp (E. coli) lpp (E. coli) phoA (E. coli) recA (E. coli) araBAD (E. coli) proU (E. coli) cst-1 (E. coli) tetA (E. coli) cadA (E. coli) nar (E. coli) tac, hybrid (E. coli) trc, hybrid (E. coli) lacI(Ts)a, lacIq(Ts)a lpp-lac, hybrid (E. coli) Psyn, synthetic (E. coli) Starvation promoters (E. coli) pL (ë) pL-9G-50, mutant (ë) cspA (E. coli)) pR, pL, tandem (ë) T7 (T7) T7-lac operator (T7) ëpL, pT7, tandem (ë, T7) T3-lac operator (T3) T5-lac operator (T5) T4 gene 32 (T4) nprM-lac operator (Bacillus spp.) VHb (Vitreoscilla spp.) Protein A (Staphylococcus aureus)

phoB (positive), phoR (negative) lexA araC

cadR fnr (FNR, NARL) lacI, lacIq lacId lacI, lacIq lacI lacI, lacIq ëcIts857

Thermal Reduced temperature (

Suggest Documents