THE JOURNAL OF BIOLOGICAL CHEMISTRY
Vol. 259, No. 22, Issue of November 25, pp. 13851-13856, 1984
© 1984 by The American Society of Biological Chemists, Inc. Printed in U.S.A.

Structure of Murine Complement Component C3
I. NUCLEOTIDE SEQUENCE OF CLONED COMPLEMENTARY AND GENOMIC DNA CODING FOR THE β CHAIN*

(Received for publication, April 9, 1984)

Åke Lundwall‡, Rick A. Wetsel§, Horst Domdey¶‖, Brian F. Tack, and Georg H. Fey**
From the Department of Immunology, Research Institute of Scripps Clinic, La Jolla, California 92037 and the ¶Swiss Institute for Experimental Cancer Research, CH-1066 Lausanne, Switzerland

The nucleotide sequence coding for the β chain of murine C3 was determined from cloned cDNA and genomic DNA fragments. Sonicated subfragments were randomly inserted into the bacteriophage M13 and sequenced using the dideoxynucleotide technique. Each nucleotide was sequenced on average six times in these studies. The derived amino acid sequence includes a signal peptide and a tetra-arginine sequence between the β and α subunits in the precursor polypeptide prepro-C3. Together with the accompanying report (Wetsel, R. A., Lundwall, A., Davidson, F., Gibson, T., Tack, B. F., and Fey, G. H. (1984) J. Biol. Chem. 259, 13857-13862), this paper completes the analysis of the coding sequences for the prepro-C3 polypeptide. The derived molecular weight of the unglycosylated β chain (642 amino acids) is 70,641. The sequences of the first two introns in the murine C3 gene and of the 5'-flanking 106 nucleotides are also reported. The 5'-flanking region contains a TATA consensus sequence in agreement with an earlier report (Wiebauer, K., Domdey, H., Diggelmann, H., and Fey, G. H. (1982) Proc. Natl. Acad. Sci. U. S. A. 79, 7077-7081), presumed to be involved in regulating the expression of the C3 gene. A striking feature of the derived sequence was that only 3 cysteine residues were found, all located in the C-terminal part of the polypeptide chain. No carbohydrate attachment sites were predicted in the β chain.

* This work was supported by United States Public Health Service Grants AI19651 and AI19222 from the National Institutes of Health. This is publication 33561” from the Research Institute of Scripps Clinic. The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.
‡ Supported by Grant-in-Aid 80-841 awarded to B. F. Tack from the American Heart Association.
§ Supported by National Institutes of Health Training Grant HL 07195.
‖ Recipient of a postdoctoral fellowship from the Deutsche Forschungsgemeinschaft. Present address, Division of Biology, California Institute of Technology, Pasadena, CA 91125.
** To whom correspondence should be addressed at Department of Immunology, IMM14, Scripps Clinic and Research Foundation, 10666 North Torrey Pines Road, La Jolla, CA 92037.

The complement system is comprised of a group of proteins which play a significant role in host resistance to infection. Two primary pathways of activation are known, the classical and alternative. Both pathways proceed initially by the sequential self-assembly of multimacromolecular enzyme complexes. The third component of complement, C3, is a participant in each pathway of activation. This protein consists of two polypeptide chains, α and β, with respective molecular weights of 115,000 and 75,000. Cleavage of C3 by the classical pathway C3 convertase (C4b2a), characteristic of the first step in C3 activation, results in the formation of two fragments, a vasoactive peptide (C3a) derived from the α chain and a macromolecular fragment (C3b) having an α'β chain structure. A bimolecular complex of activated forms of C3 and Factor B (C3bBb) further functions in the alternative pathway as a C3 convertase, thereby providing a positive feedback-amplification mechanism for further C3 consumption (for reviews, see Refs. 1 and 2). It has been recently proposed that a hemolytically inactive form of C3, C3(i), in complex with activated Factor B (Bb), functions as the initial C3 convertase of the alternative pathway (3). "Nascent" C3b (4) generated by either pathway can bind covalently via a transient or "labile" binding site (5) to cell surface, complex polysaccharides, or immune aggregates. This results in the formation of an ester bond (6), the acyl group being contributed by a residue contained in the C3d region of the α chain (7, 8). The C3d fragment is produced from bound C3b by the concerted actions of Factors H and I (9) and additional blood proteases (9-12). Subsequent interactions of surface-bound C3b (and forms thereof) with cellular elements operative in host defense constitute an important surveillance mechanism (13).

An activated form of C4, C4b, also contains a labile binding site which has the transient ability to form a covalent association with acceptors on plasma membranes (14) and immune aggregates (15, 16). Surface-bound C4b, similar to C3b, functions to recruit and participate in the activation of the next complement component in the reaction sequence. Interestingly, the major blood protease inhibitor, α2M, can enter into a covalent complex with a large number of proteases and provides for a similar opsonic function. The covalent association of each of these proteins with the appropriate biological target is mediated by an internal β-cysteinyl-γ-glutamyl thiol ester bond (for review, see Ref. 17).

For murine C3, cloned cDNA and genomic DNA sequences have been previously isolated and analyzed (18, 19). The following observations were made. The order of the α and β subunits in the precursor pro-C3 molecule was shown to be NH2-β-α-COOH, in agreement with previous protein-sequence analysis of guinea pig and human pro-C3 and mature processed forms of C3 (20), and the existence of a signal peptide for precursor pro-C3 was inferred. The mouse precursor pro-C3 molecule was shown to contain 4 arginine residues in the transition region between the polypeptide chains. These are most likely removed by proteolytic processing during the conversion of the precursor to the mature α and β subunits. The total amino acid sequence for mouse C3a anaphylatoxin and part of the coding sequences for the C3d domain were deduced, as well as the amino-terminal portion of the C3 α chain. The signal peptide of C3 was shown to be encoded by a separate first exon of the gene.

In order to elucidate the structural correlates of C3 which mediate specific biological responses through interactions with cellular receptors and protein ligands, it is essential to determine the complete primary structure of this molecule. Limited protein-sequence data have been published for human C3 (21-23); however, murine C3 has not been previously studied at this level. Due to the large size of this molecule, a monumental effort would be required to fully sequence it at the protein level. Therefore, we have undertaken the task of deriving the complete amino acid sequence of C3 from nucleotide sequences of cloned cDNA and genomic DNA fragments. In this paper, we report the complete sequence of the murine C3 β chain.

EXPERIMENTAL PROCEDURES

Materials - Low-gelling-temperature agarose was from FMC Corp. DNA-grade agarose and electrophoresis-grade acrylamide were from Bio-Rad. Urea and Tris were GenAR-grade from Mallinckrodt Chemical Works. Restriction enzymes were from New England Biolabs. T4 DNA polymerase, deoxynucleotide triphosphates, and dideoxynucleotide triphosphates were from Pharmacia P-L Biochemicals. The 17-nucleotide universal primer for M13 sequencing was from Collaborative Research. Isopropyl-β-D-thiogalactoside and 5-bromo-4-chloro-3-indolyl-β-D-galactopyranoside were from Boehringer Mannheim. 5'-(α-[35S]thio)dATP (2400 Ci/mmol) was from Amersham Corp. DNA ligase was a gift from Drs. David Bentley and Stan Fields, Medical Research Council Laboratory of Molecular Biology, Cambridge, United Kingdom. Large-fragment DNA polymerase (Klenow) was from Bethesda Research Laboratories. Escherichia coli strain E.C. JM101-TG1 was derived from the standard JM101 strain (24) by Toby Gibson (Medical Research Council Laboratory of Molecular Biology, Cambridge, United Kingdom). The sonicator was a model W375 cell disruptor from Heat Systems-Ultrasonics, Inc. equipped with a cup horn for 150-ml beakers. The graphics terminal used for computer-assisted analysis of DNA sequences was a VT640 RetroGraphics terminal combined with a VT59PI printer interface from Digital Engineering. The Digital LA120 printer was upgraded for graphics by a Decplot board from Texprint.

Containment - Recombinant DNA work was performed in accordance with the National Institutes of Health guidelines for recombinant DNA research under P1 physical and EK1 biological containment conditions.

Preparation of DNA Restriction Fragments - The cDNA clone pMLC3/7 and the genomic DNA clone λMC3/KW4 have previously been described (18, 19). Phage particles and plasmid DNA were prepared and purified by double-equilibrium centrifugation in cesium chloride density gradients as described by Maniatis et al. (25). 100 μg of plasmid pMLC3/7 DNA were digested with 200 units of the restriction enzyme PstI for 3 h at 37 °C and subsequently with 300 units of EcoRI for 1.5 h at 37 °C in standard buffers as specified by the manufacturer. Another aliquot was digested with StuI. To generate HindIII restriction fragments of λMC3/KW4, 529 μg of phage DNA were digested with 350 units of HindIII for 2 h at 37 °C. The fragments were separated by electrophoresis in 0.9% agarose gels containing ethidium bromide and visualized by ultraviolet light. Troughs were cut in front of the relevant bands and filled with 1.0% low-gelling-temperature agarose containing ethidium bromide. The fragments were electrophoresed into the low-gelling-temperature agarose and recovered by excision, melting at 65 °C, and repeated extractions with phenol saturated with 0.5 M NaCl.

Subcloning into M13 Phage Sequencing Vectors - Random subfragments of cDNA and genomic DNA fragments were cloned into M13 by standard procedures (26). Briefly, 5-10 μg of a DNA fragment 12 kilobase pairs in length were self-ligated for 2 h at 16 °C with DNA ligase in order to multimerize the fragment. The ligase was heat-inactivated for 10 min at 80 °C. The DNA was sonicated under conditions calibrated to generate average subfragments of 200-600 bp¹ in length. For fragments with two different cohesive ends, 10% of monomeric fragment was then added to the oligomers in order to achieve even statistical representation of the terminal sequences of the fragment upon sonication. The ends of the sonicated fragments were repaired with 10 units of T4 DNA polymerase for 3 h at 16 °C. Fragments in the size range of 300-600 bp were isolated by trough elution after electrophoresis in a 1.5% agarose gel containing ethidium bromide. Following phenol extraction and ethanol precipitation, the mixture of random subfragments was ligated into the double-stranded replicative DNA of phage M13 mp8, linearized with the enzyme SmaI, and dephosphorylated with calf intestinal phosphatase (Boehringer Mannheim). Blunt-end ligation was carried out for 14-18 h at 16 °C. The ligated material was used to transform E. coli K12 JM101-TG1, treated with calcium chloride. The transformed cells were plated on 2 × YT agar plates (10 g of Bacto-tryptone, 10 g of yeast extract, 5 g of sodium chloride, and 15 g of Bacto-agar/liter) in H-top agar (10 g of Bacto-tryptone, 8 g of sodium chloride, and 8 g of Bacto-agar/liter; Ref. 26), containing isopropyl-β-D-thiogalactoside and 5-bromo-4-chloro-3-indolyl-β-D-galactopyranoside. Colorless plaques were used to inoculate 1.5-ml cultures of E. coli cells. Cultures were grown for 5.5 h at 37 °C, and phage particles were isolated from the culture medium by precipitation with polyethylene glycol. Single-stranded DNA (template DNA) was prepared from virions by phenol extraction and ethanol precipitation and resuspended in a final volume of 50 μl of 10 mM Tris-HCl, pH 8.0, 0.1 mM EDTA.

¹ The abbreviation used is: bp, base pair.

DNA Sequencing - DNA sequencing was performed using the dideoxy sequencing technique (27) with modifications for the use of 5'-(α-[35S]thio)dATP (28). 5 μl of template DNA were annealed with 0.15 pmol of universal primer in a total volume of 11 μl for 1 h at 60 °C. Four aliquots of 2 μl were then taken from this mixture. The sequencing reactions were carried out by adding to each aliquot 2 μl of deoxynucleotide triphosphate/dideoxynucleotide triphosphate mixtures (26) and 2 μl of polymerase mixture. This mixture contained 10 pmol of 5'-(α-[35S]thio)dATP (400 Ci/mmol) and 2 units of DNA polymerase (Klenow fragment) in 9 μl of 10 mM dithiothreitol and 10 mM Tris-HCl, pH 8.0. The reaction was allowed to proceed for 20 min at room temperature. Then, 2 μl of a chase solution, containing 0.5 mM deoxynucleotide triphosphates, were added to each reaction, and the incubation was continued for another 20 min at room temperature. The reaction was stopped by addition of formamide/dye solution (26). Electrophoresis was performed in 52-cm long, 0.35-mm thick 6% polyacrylamide gels containing a buffer gradient (28). After electrophoresis, the gels were fixed with ethanol and acetic acid, dried, and exposed for autoradiography following standard procedures.

Computer-assisted Evaluation of Sequence Data - Sequences were aligned and an overall consensus sequence was derived using the computer programs DBAUTO and DBUTIL (29, 30). DIAGON was used for searches for internal homologies and comparison with other known sequences (30).

RESULTS

Sequence of the C3 β Chain - A set of murine C3 cDNA plasmids has been described which carries inserts encompassing the coding sequences for the α chain and the C-terminal 580 amino acids of the β chain including the β/α transition region (18). The coding sequences for the N-terminal 62 amino acids of the β chain were not represented by the cloned cDNA. However, a murine C3 genomic DNA clone, λMC3/KW4, has previously been found to contain the missing coding sequences. Its restriction map and the sequence of a 232-bp HindIII-BamHI fragment containing the 5' end of the gene have been published (19). Here, the complete sequence of the β chain has been analyzed by sequencing first a 1956-bp HindIII subfragment containing the 5' end of the gene derived from the genomic clone λMC3/KW4 and then two overlapping cDNA fragments covering the main portion of the β chain coding sequences. The two cDNA fragments were a 1665-bp EcoRI-PstI fragment encoding the central portion of the β chain and a 242-bp StuI fragment for the C terminus of the β chain and the β/α junction region. Both cDNA fragments were derived from the plasmid pMLC3/7. A synoptic map of these fragments and of the sequencing strategy is shown in Figs. 1 and 2.

The assignment of the relative locations of these fragments with respect to each other was made using previously established restriction maps and Southern-blot hybridization data which had been obtained using radiolabeled cDNA subfragments as hybridization probes (18, 19). The choice of the StuI fragment to cover the β/α transition region was only made toward the end of this sequencing project. At this point, the coding sequences for major portions of the β and α chains (45) were already known, and their analysis suggested that a StuI-StuI fragment of approximately 250 bp should cover the transition region, provided that this region did not carry additional StuI restriction sites. An analytical digestion of DNA from plasmid pMLC3/7 with StuI was performed, followed by electrophoresis of the fragments in agarose gels. The result (data not shown) indicated that a StuI band of approximately 250 bp was present. Upon double digestion with BglII, PstI, or BstEII and StuI, this fragment was reduced in size and showed the expected behavior for a fragment originating from the β/α transition area (data not shown). It was therefore chosen for further sequence analysis. The genomic DNA fragment and the two cDNA fragments were separately prepared, oligomerized, and sheared by sonication. Random subfragments were then cloned into phage M13 and sequenced as described under "Experimental Procedures." The combined coding sequences derived from the genomic DNA fragment and the two cDNA fragments are given in Fig. 3, and the intron sequences and the 5'-flanking sequences are given in Fig. 4.

The sequence of the 1956-bp genomic DNA fragment was obtained from 54 random subclones, covering both DNA strands completely (Fig. 2). A total of 12,321 characters of primary sequence data were generated for this fragment, covering each character of the consensus sequence on average 6.29 times. A TATA consensus sequence, probably representing part of the promoter of the C3 gene, had previously been reported at position 76 downstream from the 5' boundary of this fragment (the HindIII site) (19). The data obtained here (Fig. 4) confirm the sequence and location of this element as well as the previously reported coding sequences for the signal peptide and exon 1 of the gene (19). In addition, we have derived the sequences of intron 1 (1061 bp), exon 2 (190 bp), intron 2 (361 bp), and part of exon 3 (102 bp), all contained within the 1956-bp genomic DNA fragment (Fig. 4). The 3' end of this large fragment overlaps by 87 bp with the 5' end of the EcoRI-PstI cDNA fragment of plasmid pMLC3/7 (Fig. 2). In the first intron, a repetitive sequence consisting of a 10-fold tandem repetition of the tetranucleotide ACAT was observed (Fig. 4).

The nucleotide sequence of the 1665-bp EcoRI-PstI cDNA fragment coding for the majority of the β chain was determined by sequencing 68 random subclones (Fig. 2) covering the fragment completely on both strands. A total of 14,901 characters of primary sequence data were generated, thereby reflecting each nucleotide of the consensus sequence 8.94 times on average. The 3' end of this fragment specified all but the C-terminal 16 amino acids of the β chain. The C-terminal amino acids of the β chain and the β/α junction region were determined by sequencing a 242-bp StuI fragment derived from cDNA plasmid pMLC3/7. This fragment overlaps the EcoRI-PstI fragment by 26 nucleotides and extends into the N-terminal coding sequences for the α chain. Two random subclones of this fragment were sequenced, one on each strand, which cover the C-terminal portion of the β chain completely. This β/α junction region had previously been sequenced using the chemical sequencing method (18), and the results given here confirm those earlier data.

From the nucleotide sequence given in Fig. 3, the length of the mature C3 β chain was calculated to be 642 amino acid residues, corresponding to a molecular weight of 70,641 for the unglycosylated peptide chain. This value is in close agreement with the values of 70,000-75,000 reported for the human (1, 2) and of 75,000 reported for the murine β chain (31), obtained by using different techniques. From the comparison of the amino acid composition of murine C3 β with human C3 β (22, 32, 33), we detect a preference in murine C3 β for aspartic rather than glutamic acid as the major carrier of negative charges. Among the hydrophobic residues, alanine and isoleucine appear more abundant in murine C3 and leucine more abundant in human C3. No Asn-X-Ser and Asn-X-Thr triplets, which are the most commonly used carbohydrate attachment sites in other mammalian plasma glycoproteins (34), were found in the sequence reported here. Only 3 cysteine residues were found, and all of these are located in the C-terminal portion of the β chain.

FIG. 1. Alignment of polypeptide, cDNA, and genomic DNA sequences. Top, numbers above and below the boxes refer to amino acid residues. The signal peptide (SP, -24 to -1), β chain (1-642), and only the first part of the α chain are shown. The 4 arginine residues in the β/α transition region are represented by vertical bars. The numbers under the boxes correspond to exon-intron boundaries in genomic DNA (3 and 66) or to restriction sites (101, 618, and 628 in the β chain and 52 in the α chain) for the enzymes HindIII (H), StuI (S), and PstI (P) in the corresponding DNA sequences shown below. Middle, part of the cDNA insert of plasmid pMLC3/7 (18) is shown. Numbers above the box are coordinates in base pairs starting from the EcoRI site (E) that defines one end of this fragment. The EcoRI-PstI fragment (1665 bp) and the StuI-StuI fragment (242 bp) were used for subcloning and sequencing (see Fig. 2). Bottom, the 1956-bp HindIII fragment of the genomic DNA clone λMC3/KW4 (19) is shown. The scale is in base pairs. Exon 1 (Ex1) has coordinates 107-242. Translation initiation triplet AUG is at 163. Introns 1 and 2 (Int1 and Int2) have coordinates 243-1303 and 1494-1854, respectively. The EcoRI site (E) in exon 3 (Ex3) at position 1872 corresponds to the left boundary of the cDNA fragment shown above.

FIG. 2. Sequencing strategy for the β chain of murine C3. Top, the fragments that were sonicated, randomly subcloned in M13, and sequenced are aligned here according to their overlaps to form one contiguous sequence. From top to bottom, the 1956-bp HindIII fragment from λMC3/KW4 (coordinates 1-1956), the 1665-bp EcoRI-PstI fragment from pMLC3/7 (coordinates 1970-3535), and the 242-bp StuI fragment from pMLC3/7 (coordinates 3709-3751). H, E, P, S, restriction sites of enzymes HindIII, EcoRI, PstI, and StuI, respectively. Bottom, the double line in the middle represents the two complementary strands of contiguous sequence. Each sequence determined from random individual subclones is indicated by a small horizontal bar to show with which of the strands they match above or below the double line.

FIG. 3. Nucleotide sequence and derived amino acid sequence for the mouse C3 β chain. Amino acid sequences are given in the single-letter code. The N terminus and C terminus of the mature β chain (Nβ, Cβ) as well as the N terminus of the α chain (Nα) are indicated. The arginine quadruplet is located between Cβ and Nα. Cleavage sites for the restriction enzymes EcoRI (E), HindIII (H), StuI (S), and PstI (P) are indicated above the corresponding nucleotides. The N terminus of the mature β chain is indicated by Nβ (above nucleotide) as well as the EcoRI, HindIII, StuI, and PstI sites at positions 214 (E), 300 (H), 1853 and 2095 (S) and 1883 (P). The C terminus (Cβ) is marked above residue 1926, followed by the arginine multiplet and the N terminus of the α chain (Nα) above nucleotide 1939.

Internal Repetitions and Codon Usage - The sequence has been analyzed for internal repetitions by using the diagonal-matrix program DIAGON (30). With a scan length (sliding window) of 25 amino acids and a proportional score parameter of 280, no significant internal repetitions were discovered. At this parameter setting, the three internal homologies in the complement factor B sequence were clearly detected (35-37). Therefore, the C3 sequence does not appear to contain extended internal repetitions. It does contain a number of short (510 bp) repeated sequences. However, the statistical significance of this finding is low. Codon-usage analysis was performed using Staden's program (29), and the result is given in Table I.

The sequence data reported here allow us to correct four mistakes in previously published sequences: (a) in the 5'-flanking sequence of the gene (Fig. 4), nucleotides 52-58 should read GGGAGGG instead of GGGACGG, and nucleotides 90-101 should read GGCTACAGCCCC instead of GGCACAGCCCC as previously reported (19); and (b) amino acids 20-25 of the mature β chain are VLEAH instead of VLGAH (translation error), and nucleotides 408 and 428 should read (as in Fig. 3) CAAGACCATCTACACCCCT instead of CCAGACCATCTACACCCCC. The corresponding amino acids are KTIYTP instead of QTIYTP as previously reported (38).
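The search for carbohydrate attachment sites and cysteine residues reported in this section is straightforward to repeat on any derived amino acid sequence. The Python sketch below is illustrative only: the function names and the short test peptide are hypothetical and are not taken from the C3 β chain, and the pattern implements the plain Asn-X-Ser/Asn-X-Thr rule as stated in the text, with no exclusion for particular residues at position X.

```python
import re

# Asn-X-Ser / Asn-X-Thr in single-letter code. The lookahead lets
# overlapping triplets (for example the two in "NNST") each be reported.
SEQUON = re.compile(r"(?=(N[A-Z][ST]))")


def find_sequons(protein):
    """Return (1-based position, triplet) for every Asn-X-Ser/Thr triplet."""
    return [(m.start() + 1, m.group(1)) for m in SEQUON.finditer(protein.upper())]


def cysteine_positions(protein):
    """Return the 1-based positions of all cysteine residues."""
    return [i + 1 for i, residue in enumerate(protein.upper()) if residue == "C"]


if __name__ == "__main__":
    # Hypothetical peptide, used only to exercise the two functions.
    toy_peptide = "MKNGSAANNTCLLNPQC"
    print("Asn-X-Ser/Thr triplets:", find_sequons(toy_peptide))
    print("Cysteine positions:", cysteine_positions(toy_peptide))
```

Applied to a full-length sequence, an empty list from `find_sequons` corresponds to the absence of predicted attachment sites, and the positions returned by `cysteine_positions` show whether the cysteines cluster toward one end of the chain.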


FIG. 4. Introns 1 and 2 and part of the promoter region of the C3 gene. The TATA box at position 76, the cap site (107), the exon-intron junctions (242, 1303, 1493, and 1854), and the 10-fold tandem repeat of the tetranucleotide ACAT (608-647) are underlined. The signal peptide plus the first three amino acids of the mature β chain (I, P, and M) as well as the amino acids coded by exon 2 and the 5'-terminal portion of exon 3 are given in the single-letter code.
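The coordinates quoted in the legends to Figs. 1 and 4 can be cross-checked with a few lines of arithmetic. The Python sketch below is a minimal consistency check, not part of the original analysis: the segment table is assembled from the cap site (107), the exon-intron junctions (242, 1303, 1493, and 1854), and the fragment length (1956 bp), and the helper names are hypothetical. The mean-coverage helper uses the character totals reported under "Results" (12,321 for the 1956-bp fragment and 14,901 for the 1665-bp fragment).

```python
# 1-based, inclusive coordinates within the 1956-bp HindIII fragment.
# Exon 2 and the exon 3 portion are inferred from the junction positions.
FRAGMENT_LENGTH = 1956
SEGMENTS = [
    ("5'-flanking region", 1, 106),
    ("exon 1", 107, 242),
    ("intron 1", 243, 1303),
    ("exon 2", 1304, 1493),
    ("intron 2", 1494, 1854),
    ("exon 3 (5' portion)", 1855, 1956),
]
ACAT_REPEAT = (608, 647)


def span_length(start, end):
    """Length of an inclusive coordinate span."""
    return end - start + 1


def segment_lengths(segments):
    """Map each segment name to its length in base pairs."""
    return {name: span_length(start, end) for name, start, end in segments}


def is_contiguous(segments, total):
    """True if the segments tile positions 1..total with no gap or overlap."""
    expected_start = 1
    for _, start, end in segments:
        if start != expected_start or end < start:
            return False
        expected_start = end + 1
    return expected_start == total + 1


def mean_coverage(characters_read, consensus_length):
    """Average number of times each consensus position was read."""
    return characters_read / consensus_length


if __name__ == "__main__":
    for name, length in segment_lengths(SEGMENTS).items():
        print(f"{name}: {length} bp")
    print("segments tile the fragment:", is_contiguous(SEGMENTS, FRAGMENT_LENGTH))
    print("ACAT repeat length:", span_length(*ACAT_REPEAT), "bp")
    print(f"genomic fragment coverage: {mean_coverage(12321, 1956):.3f}")
    print(f"EcoRI-PstI fragment coverage: {mean_coverage(14901, 1665):.3f}")
```

The 40-bp span of the ACAT element equals ten copies of a tetranucleotide, and the six segments account for all 1956 positions of the fragment.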


DISCUSSION

We have presented the complete primary structure of the murine C3 β chain as deduced from sequences of both cDNA and genomic DNA clones. The combined approach of random subcloning in M13 phage vectors and dideoxynucleotide sequencing has allowed us to produce very reliable sequence data in approximately the same time that it would have taken to analyze these DNA fragments with a targeted sequencing strategy. It was necessary to use both genomic DNA and cDNA sequences to derive the complete amino acid sequence of the C3 molecule, because no cloned cDNA sequences for the 5' end of the mRNA were available. Usually it is not possible to deduce a protein sequence by inspection of a genomic DNA sequence, because alternative pathways of RNA transcription and/or maturation may be utilized. In this particular case, cDNA sequences were found which encoded 580 out of the 642 residues of the mature β chain. The cap site of the murine C3 mRNA had previously been mapped (18), and amino acid sequence data for the N termini of human (22) and guinea pig C3 β have been published (20). These published sequences agree well with the murine sequences presented here. For these reasons, the amino acid sequence presented here should represent the major transcription and translation product of the C3 gene. We cannot formally exclude that alternative RNA-production mechanisms and amino acid sequences might exist that differ in the 60 N-terminal residues of the β chain from those given here. However, we have not found any experimental evidence that suggests their existence. Until now, no protein-sequence data are available to confirm the existence and the sequence of the predicted signal peptide.

The sequence of the C3 genomic DNA fragment confirms and extends the previous reports with a few corrections (see below). Therefore, this report establishes the 5'-flanking sequence of the C3 gene more definitively. It is anticipated that this area will contain not only the TATA consensus sequence, but also other as yet unidentified control sequences necessary for the expression of the C3 gene. A repetitive sequence element was found in the first intron. This sequence did not belong to mouse satellite DNA (39, 40), nor did it resemble mouse B1 family or the related human Alu family sequences, the best characterized murine and human repetitive sequence classes (41, 42).

The deduced amino acid sequence of the murine C3 β chain was compared with the partial amino acid-sequence data of cyanogen bromide peptides from the human C3 β chain (46). Sequence homology was observed despite the species differences, and nine characterized human cyanogen bromide fragments could be aligned with the mouse sequence in an unambiguous manner. Out of 187 amino acid residues in human C3 β, 132 were identical in mouse C3 β. Thus, a 73% direct homology was observed.

No typical carbohydrate attachment sites were found in the murine β chain coding sequence. We conclude that most likely the murine C3 β chain is not glycosylated, in agreement with our previously reported findings (43). Glycosylation of the human C3 β chain has been reported (44), however. It is possible that species-specific glycosylation patterns have evolved.

A striking feature of the primary structure is the low abundance of cysteine residues in the β chain compared to the α chain (45). The human α and β chains are disulfide-linked, and no free sulfhydryl group has been observed in the human β chain (23). This could mean that all cysteine residues are involved in disulfide bonds with the α chain or that there may be one internal disulfide bridge in the β chain and only one bridge linking the two subunits. If this were the case, then the two α' chain subfragments present in C3c need to be linked to each other rather than both to the β chain (45).

Acknowledgments - We thank Toby Gibson and Maarten de Bruijn for advice on the sequencing technology. The Department of Immunology of the Research Institute of Scripps Clinic has generously contributed equipment to the laboratory of G. H. Fey. We thank Lois Tack for critical reading of the manuscript and Keith Dunn for the preparation.

REFERENCES

1. Muller-Eberhard, H. J., and Schreiber, R. D. (1980) Adv. Immunol. 29, 1-53
2. Reid, K. B. M., and Porter, R. R. (1981) Annu. Rev. Biochem. 50, 433-464
3. Pangburn, M. K., and Muller-Eberhard, H. J. (1980) J. Exp. Med. 152, 1102-1114
4. Bokisch, V. A., Dierich, M. P., and Muller-Eberhard, H. J. (1975) Proc. Natl. Acad. Sci. U. S. A. 72, 1989-1993
5. Muller-Eberhard, H. J., Dalmasso, A. P., and Calcott, M. A. (1966) J. Exp. Med. 123, 33-54
6. Law, S. K., and Levine, R. P. (1977) Proc. Natl. Acad. Sci. U. S. A.
7. Law, S. K., Fearon, D. T., and Levine, R. P. (1979) J. Immunol. 122, 759-765
8. Law, S. K., Lichtenberg, N. A., and Levine, R. P. (1979) J. Immunol. 123, 1388-1394
9. Whaley, K., and Ruddy, S. (1976) Science (Wash. D.C.) 193, 1011-1013
10. Pangburn, M. K., Schreiber, R. D., and Muller-Eberhard, H. J. (1977) J. Exp. Med. 146, 257-270
11. Nagasawa, S., and Stroud, R. M. (1977) Immunochemistry 14, 749-756
12. Harrison, R. A., and Lachmann, P. J. (1980) Mol. Immunol. 17, 219-228
13. Ehlenberger, A. G., and Nussenzweig, V. (1977) J. Exp. Med. 145, 357-371
14. Law, S. K., Lichtenberg, N. A., Holcombe, F. H., and Levine, R. P. (1980) J. Immunol. 125, 634-639
15. Goers, J. W. F., and Porter, R. R. (1978) Biochem. J. 175, 675-684
16. Campbell, R. D., Dodds, A. W., and Porter, R. R. (1980) Biochem. J. 189, 67-80
17. Tack, B. F. (1983) in Springer Seminars in Immunopathology (Miescher, P. A., and Muller-Eberhard, H. J., eds) Vol. 6, pp. 259-282, Springer-Verlag, Heidelberg
18. Domdey, H., Wiebauer, K., Kazmaier, M., Muller, V., Odink, K., and Fey, G. H. (1982) Proc. Natl. Acad. Sci. U. S. A. 79, 7619-7621
19. Wiebauer, K., Domdey, H., Diggelmann, H., and Fey, G. H. (1982) Proc. Natl. Acad. Sci. U. S. A. 79, 7077-7081
20. Goldberger, G., Thomas, M. L., Tack, B. F., Williams, J., Colten, H. R., and Abraham, G. N. (1981) J. Biol. Chem. 256, 12617-12619
21. Hugli, T. E. (1975) J. Biol. Chem. 250, 8293-8301
22. Tack, B. F., Morris, S. C., and Prahl, J. W. (1979) Biochemistry 18, 1497-1503
23. Thomas, M. L., Janatova, J., Gray, W. R., and Tack, B. F. (1982) Proc. Natl. Acad. Sci. U. S. A. 79, 1054-1058
24. Messing, J., Gronenborn, B., Muller-Hill, B., and Hofschneider, P. H. (1977) Proc. Natl. Acad. Sci. U. S. A. 74, 3642-3646
25. Maniatis, T., Fritsch, E. F., and Sambrook, J. (1982) in Molecular Cloning: A Laboratory Manual, pp. 1-545, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY
26. Bankier, A. T., and Barrell, B. G. (1983) in Techniques in Nucleic Acid Biochemistry (Flavell, R. A., ed) Vol. B5-08, pp. 1-34, Elsevier/North-Holland Scientific Publishers, Ltd., Limerick, Ireland
27. Sanger, F., Nicklen, S., and Coulson, A. R. (1977) Proc. Natl. Acad. Sci. U. S. A. 74, 5463-5467
28. Biggin, M. D., Gibson, T. J., and Hong, G. F. (1983) Proc. Natl. Acad. Sci. U. S. A. 80, 3963-3965
29. Staden, R. (1982a) Nucleic Acids Res. 10, 4731-4751
30. Staden, R. (1982b) Nucleic Acids Res. 10, 2951-2961
31. Gyongyossy, M. I. C., and Assimeh, S. N. (1977) J. Immunol. 118, 1032-1035
32. Taylor, J. C., Crawford, I. P., and Hugli, T. E. (1977) Biochemistry 16, 3390-3396
33. Nilsson, V. R., Beisswenger, J. B., and Wyman-Caufman, S. (1980) Mol. Immunol. 17, 1319-1333
34. Marshall, R. D. (1972) Annu. Rev. Biochem. 41, 673-702
35. Mole, J. E., Anderson, J. K., Davison, E. A., and Woods, D. E. (1984) J. Biol. Chem. 259, 3407-3412
36. Christie, D. L., and Gagnon, J. (1983) Biochem. J. 209, 61-70
37. Gagnon, J., and Christie, D. L. (1983) Biochem. J. 209, 51-60
38. Fey, G. H., Domdey, H., Wiebauer, K., Whitehead, A. S., and Odink, K. (1983) Springer Seminars in Immunopathology (Miescher, P. A., and Muller-Eberhard, H. J., eds) 6, pp. 119-147, Springer-Verlag, Heidelberg
39. Lewin, B. (1980) in Gene Expression: Eucaryotic Chromosomes, Vol. 2, pp. 531-569, Wiley, New York
40. Lewin, B. (1983) in Genes, pp. 378-391, Wiley, New York
41. Deininger, P. L., Jolly, D. Y., Rubin, C. M., Friedmann, T., and Schmid, C. W. (1981) J. Mol. Biol. 151, 17-33
42. Schmid, C. W., and Jelinek, W. R. (1982) Science (Wash. D.C.) 216, 1065-1070
43. Fey, G. H., Odink, K., and Chapuis, R. M. (1980) Eur. J. Immunol. 10, 75-82
44. Chiu, F. J., and Atkin, C. L. (1983) J. Biol. Chem. 258, 7200-7207
45. Wetsel, R. A., Lundwall, A., Davidson, F., Gibson, T., Tack, B. F., and Fey, G. H. (1984) J. Biol. Chem. 259, 13857-13862
46. Lundwall, A. (1982) Ph.D. dissertation, University of Uppsala, Sweden

TABLE I
Codon usage for mouse C3 β chain
The C3 β chain coding sequence contained 54.3% C + G residues and 45.7% A + T residues. T was used 390 times, C 519 times, G 526 times, and A 491 times.

Amino acid  Codon  Times used  % codon(a)  |  Amino acid  Codon  Times used  % codon(a)
Phe         TTT         9         36.0     |  Ala         GCT         7         18.4
            TTC        16         64.0     |              GCC        18         47.4
Leu         TTA         1          1.9     |              GCA        12         31.6
            TTG         5          9.3     |              GCG         1          2.6
            CTT         1          1.9     |  Tyr         TAT         5         25.0
            CTC        12         22.2     |              TAC        15         75.0
            CTA         6         11.1     |  His         CAT         5         35.7
            CTG        29         53.7     |              CAC         9         64.3
Ile         ATT        15         39.5     |  Gln         CAA         8         33.3
            ATC        22         57.9     |              CAG        16         66.7
            ATA         1          2.6     |  Asn         AAT         4         14.3
Met         ATG        12        100.0     |              AAC        24         85.7
Val         GTT         3          4.2     |  Lys         AAA         5         11.4
            GTC        20         28.2     |              AAG        39         88.6
            GTA         6          8.5     |  Asp         GAT        17         48.6
            GTG        42         59.2     |              GAC        18         51.4
Ser         TCT         6         13.6     |  Glu         GAA        10         29.4
            TCC        14         31.8     |              GAG        24         70.6
            TCA         6         13.6     |  Cys         TGT         1         33.3
            TCG         0          0.0     |              TGC         2         66.7
            AGT         9         20.5     |  Trp         TGG         4        100.0
            AGC         9         20.5     |  Arg         CGT         0          0.0
Pro         CCT         8         22.2     |              CGC         7         28.0
            CCC        13         36.1     |              CGA         4         16.0
            CCA        10         27.8     |              CGG         7         28.0
            CCG         5         13.9     |              AGA         4         16.0
Thr         ACT        11         22.9     |              AGG         3         12.0
            ACC        17         35.4     |  Gly         GGT         5         11.1
            ACA        16         33.3     |              GGC        16         35.6
            ACG         4          8.3     |              GGA         7         15.6
                                           |              GGG        17         37.8

(a) Expresses use of a certain codon relative to other possibilities for a given amino acid residue.
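The two kinds of percentages given with Table I, overall base composition and the relative use of synonymous codons, follow directly from the counts. The Python sketch below shows the arithmetic, assuming the base totals quoted in the table legend and, as one worked case, the six leucine codon counts listed in the table; the function names are hypothetical and the sketch is not a reconstruction of Staden's program.

```python
# Base totals quoted in the legend to Table I.
BASE_COUNTS = {"T": 390, "C": 519, "G": 526, "A": 491}

# Leucine codon counts as listed in Table I.
LEU_CODONS = {"TTA": 1, "TTG": 5, "CTT": 1, "CTC": 12, "CTA": 6, "CTG": 29}


def percent(part, whole):
    """Percentage of whole represented by part, to one decimal place."""
    return round(100.0 * part / whole, 1)


def gc_and_at_percent(base_counts):
    """Return (% C + G, % A + T) for a dictionary of base counts."""
    total = sum(base_counts.values())
    gc = base_counts["G"] + base_counts["C"]
    at = base_counts["A"] + base_counts["T"]
    return percent(gc, total), percent(at, total)


def relative_codon_usage(codon_counts):
    """Percentage use of each codon among the synonymous codons supplied."""
    total = sum(codon_counts.values())
    return {codon: percent(n, total) for codon, n in codon_counts.items()}


if __name__ == "__main__":
    total_bases = sum(BASE_COUNTS.values())
    print("bases counted:", total_bases, "=", total_bases // 3, "codons")
    print("% C + G, % A + T:", gc_and_at_percent(BASE_COUNTS))
    print("Leu codon usage (%):", relative_codon_usage(LEU_CODONS))
```

The base totals sum to 1926, which is three times the 642 residues of the mature β chain, so the composition refers to the coding sequence of the mature chain alone.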
