Cellular and Molecular Neurobiology, Vol. 14, No. 1, 1994

Isolation and in Situ Localization of a cDNA Encoding a Kex2-like Prohormone Convertase in the Nematode Caenorhabditis elegans

Eduardo Gómez-Saladín,1 David L. Wilson,1,2 and Ian M. Dickerson1,3

Received January 20, 1994; accepted February 24, 1994

KEY WORDS: Kex2; prohormone convertase (PC); endoprotease; PC2; neuropeptide; Caenorhabditis elegans.

SUMMARY

1. A cDNA that encodes a Kex2-like prohormone convertase (PC) containing an active site similar to that of mammalian PC2 has been isolated from C. elegans. Total RNA was isolated from a mixed population of strain BA713 worms. After poly-(A)-selection and reverse transcription, degenerate/nested polymerase chain reactions (PCR) were performed using primers based on conserved regions within the active sites of the known vertebrate and invertebrate endoproteases.

2. Two distinct 300-bp PCR products that shared homologies with the active sites of known Kex2-like endoproteases were isolated. These two PCR products were used to screen a C. elegans cDNA library.

3. The complete cDNA for a Kex2-like endoprotease, designated CELPC2, was isolated and determined to be 2527 bp in length. This size was confirmed by northern analysis. The deduced amino acid sequence for the CELPC2 cDNA is very similar to the known Kex2-like endoproteases, especially at conserved regions within the active sites, but not identical to any one of them. The strongest structural homology was to vertebrate and invertebrate PC2 sequences.

4. In situ hybridization suggests that CELPC2 is synthesized primarily in cells associated with the circumpharyngeal nerve ring and the dorsorectal ganglion.

1 Department of Physiology and Biophysics, University of Miami School of Medicine, P.O. Box 016430 (R-430), Miami, Florida 33101.
2 Department of Biology, University of Miami, Coral Gables, Florida 33124.
3 To whom correspondence should be addressed.

0272-4340/94/0200-0009$07.00/0 © 1994 Plenum Publishing Corporation

INTRODUCTION

Most neuropeptides are synthesized as large, biologically inactive precursors that must undergo endoproteolysis, usually at pairs of basic amino acids, before they attain bioactivity (Mains et al., 1990). Recently, a family of eukaryotic serine endoproteases that cleave propeptides at pairs of basic amino acids was discovered (Steiner et al., 1992). Cleavage begins either in the trans Golgi (Seidah et al., 1991) or in secretory granules (Orci et al., 1987). Processing by endoproteases often exhibits tissue and developmental specificity (Dickerson and Mains, 1990; Dickerson and Noël, 1991).

Molecular cloning has resulted in the isolation and characterization of the subtilisin Kex2-like serine endoprotease family (Seidah et al., 1990, 1991; Smeekens et al., 1991; Smeekens and Steiner, 1990). Kex2, the first member of the eukaryotic subtilisin endoproteases, was isolated from yeast (Julius et al., 1984). A search of the GenBank database with the Kex2 sequence revealed similarities between Kex2 and a partial human gene sequence known as fur (Roebroek et al., 1986; van den Ouweland et al., 1989, 1990). The human fur gene was later shown to encode a Kex2-like endoprotease named furin (Bresnahan et al., 1990). Prohormone convertase 2 (PC2) was isolated from human insulinoma cells (Smeekens and Steiner, 1990), PC1/PC3 was isolated from AtT20 cells (Smeekens et al., 1991; Seidah et al., 1992a), and PACE4 was isolated from human osteosarcoma tissue (Keifer et al., 1991). Some mouse homologues have been identified as mFurin (Hatsuzawa et al., 1990), mPC2 and mPC3 (Seidah et al., 1990), and mPC4 (Nakayama et al., 1992). Homologues have also been found in rat (Misumi et al., 1990), Xenopus (Korner et al., 1991), Drosophila (Roebroek et al., 1992), and Hydra (Chan et al., 1992). A PC5/6 has been isolated from a variety of tissues in rat and mouse (Nakagawa et al., 1993; Lusson et al., 1993).

Kex2 and furin amino acid sequences contain transmembrane domains, but the rest of the known proteases appear to be soluble enzymes. Several novel PC2 homologues have been isolated recently from Lymnaea stagnalis (Smit et al., 1992), Xenopus laevis (Braks et al., 1992), Sus scrofa (Seidah et al., 1992b), and Aplysia californica (Ouimet et al., 1993). Furins are hypothesized to be part of the constitutive secretory pathway, and the PCs part of the regulated secretory pathways (Seidah and Chrétien, 1992). Involvement of Kex2-like endoproteases in prohormone processing in vivo is supported by experiments expressing antisense PC1 mRNA in AtT20 cells; PC1 mRNA levels decreased and processing of POMC was blocked (Bloomquist et al., 1991).

Studying endoproteases in a simple but well-characterized organism with a limited number of neuroendocrine-like endoproteases is advantageous for the understanding of peptide processing at the organismal level. Such an organism is the free-living nematode Caenorhabditis elegans, probably the best-characterized metazoan, with many genetic mutants described, many strains available in culture, the complete cell lineage mapped from the egg to the adult, and the complete anatomy described at the ultrastructural level. However, very little is known about serine endoproteases in C. elegans. Recently, a cathepsin B-like cysteine protease gene fragment has been isolated (Sakanari et al., 1989), a gut-specific cysteine protease has been characterized (Ray and McKerrow, 1992), and the bli-4 gene has been reported to encode a protein structurally similar to Kex2, containing a transmembrane domain (Peters and Rose, 1991), that is involved in cuticle development (Peters et al., 1991).

Based on the functions of endoproteases observed in vertebrates, one would predict that at least two types of endoproteases are involved in peptide processing in C. elegans: a Kex2-like endoprotease with structural homology to furin, distributed throughout the body in a variety of tissues, and a second type with structural homology to the PCs, localized to neural tissue. To address this issue, a degenerate/nested PCR and cloning approach was undertaken. The isolation of a PC2-like putative prohormone convertase from C. elegans has already been reported in abstract form (Gómez-Saladín et al., 1993); herein the complete cDNA nucleotide and deduced peptide sequences are reported, as well as localization by in situ hybridization.

METHODS

Maintenance of Nematodes

C. elegans strain BA713 (fer-15, back-crossed to N2; kindly provided by Tom Johnson), a temperature-sensitive sterile mutant, was maintained on NGM medium at the permissive temperature of 15°C with a lawn of Escherichia coli strain OP50. Before organisms were used for RNA isolation they were cleared of bacteria. E. coli were first grown on high-nutrient medium (TB broth) overnight at 37°C and were transferred to S medium (Wood, 1988), which allows for survival, but not growth, of the bacteria. Nematodes were transferred to the bacterial culture in S medium and allowed to clear for 5 days. Use of this strain allows for isolation of specific stages of the life cycle for future developmental studies.

Isolation of RNA

Organisms were harvested by sedimentation at 4°C overnight and centrifugation in 30% sucrose for 5 min at 500g. The pellets were resuspended in 0.1 M NaCl and centrifuged for 2 min at 500g. Nematodes were then gently agitated in a shallow layer of 0.1 M NaCl at room temperature for 30 min to encourage digestion of bacteria remaining in their guts (Wood, 1988). This minimized bacterial contamination in RNA preparations. The organisms were centrifuged as before, resuspended in 5:1 guanidinium isothiocyanate solution, and homogenized with a Polytron five times for 30 sec. The lysate was centrifuged through a 5.7 M CsCl pad at 42,000 rpm for 16 hr using an SW55Ti rotor. The pellet was resuspended in diethylpyrocarbonate-treated dH2O and ethanol-precipitated twice. The Promega PolyATract mRNA Isolation System IV, which involves binding of poly(A) RNA to biotinylated oligodeoxythymidine and purification with streptavidin-magnetic beads, was used to select for poly(A) RNA.

Polymerase Chain Reaction (PCR)

Eight micrograms of poly(A)+ RNA was isolated from a mixed population of C. elegans strain BA713. After poly(A) selection, 200 ng of mRNA was used for reverse transcription with GIBCO BRL SuperScript RNase H- reverse transcriptase and a downstream primer (Endo-10, described below). The samples were incubated for 10 min at 23°C, 50 min at 42°C, and 5 min at 95°C.

Degenerate primers were designed based on the consensus amino acid sequence of the active sites found within the subtilisin-like catalytic domains of all the known vertebrate and invertebrate Kex2-like endoproteases. One upstream primer, Endo-8 (5'AAYMRNCAYGGNACNMGNTGYGCNGGNGA3', corresponding to the HGTRCAGE motif), and two downstream primers, Endo-9 (5'NCCNSWNGCCCANAYRWANATNSWNCC3', corresponding to the SIFVWAS motif) and Endo-10 (5'RTGYTGNANRTCNCKCCANGTNARRT3', corresponding to the LTWRD motif), were used in nested PCR for amplification using Taq DNA polymerase in a Perkin Elmer thermal cycler.

In the first PCR reaction (2.0 mM dNTP, 1 mM DTT, 3 mM MgCl2, and 4 ng/µl of each primer), primers Endo-8 and Endo-10 were used, starting with three cycles of 1 min at 94°C, 2 min at 40°C, a 3-min gradual increase to 72°C, and 1 min at 72°C. The samples were then taken through 25 cycles of 1 min at 94°C, 1 min at 45°C, and 1 min at 72°C. This reaction produced a 600-bp DNA fragment.

In the second reaction, primers Endo-8 and Endo-9 were used and the samples, which contained 3 µl of the first reaction as template, were taken through 25 cycles of 1 min at 94°C, 1 min at 45°C, and 1 min at 72°C. This second reaction produced a 300-bp DNA fragment.
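The relationship between the degenerate primers and the peptide motifs they target can be checked with a short script. The following is a minimal sketch, assuming standard IUPAC nucleotide codes and the standard genetic code; the function names are illustrative and are not part of the original protocol. It counts how many distinct oligonucleotides a degenerate primer represents and tests whether each degenerate codon can encode the intended residue. The downstream primers are antisense, so they are reverse-complemented first, and the reading-frame offsets are chosen by inspection of the primer sequences.

```python
from itertools import product

# IUPAC nucleotide codes and the bases each one stands for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}
# Complement of each IUPAC code (e.g. R = A/G pairs with Y = T/C).
COMPLEMENT = str.maketrans("ACGTRYSWKMBDHVN", "TGCAYRSWMKVHDBN")

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def degeneracy(primer):
    """Number of distinct sequences represented by a degenerate primer."""
    n = 1
    for base in primer:
        n *= len(IUPAC[base])
    return n


def reverse_complement(seq):
    """Reverse complement of a (possibly degenerate) DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def possible_residues(codon):
    """Set of amino acids a degenerate codon can encode."""
    return {CODON_TABLE["".join(c)] for c in product(*(IUPAC[b] for b in codon))}


def can_encode(dna, peptide, offset=0):
    """True if every degenerate codon, read from `offset`, can encode the peptide."""
    codons = [dna[i:i + 3] for i in range(offset, offset + 3 * len(peptide), 3)]
    return all(len(c) == 3 and aa in possible_residues(c)
               for c, aa in zip(codons, peptide))


ENDO_8 = "AAYMRNCAYGGNACNMGNTGYGCNGGNGA"   # sense primer
ENDO_9 = "NCCNSWNGCCCANAYRWANATNSWNCC"     # antisense primer
ENDO_10 = "RTGYTGNANRTCNCKCCANGTNARRT"     # antisense primer

print(degeneracy(ENDO_8))
# The last two bases of Endo-8 (GA) cover only part of the final Glu codon.
print(can_encode(ENDO_8, "HGTRCAG", offset=6))
print(can_encode(reverse_complement(ENDO_9), "SIFVWAS", offset=3))
print(can_encode(reverse_complement(ENDO_10), "LTWRD", offset=2))
```

Checking compatibility codon by codon, rather than expanding every primer variant, keeps the check fast even for highly degenerate primers.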

Cloning

The GIBCO BRL CloneAmp system was used for cloning PCR products. Primers Endo-8 and Endo-9 were resynthesized with a 5' addition of four CUA repeats (Endo-8) and four CAU repeats (Endo-9). The new primers were designated Endo-11 and Endo-12 and were used in a third PCR reaction, using the 600-bp fragment from the first reaction as template, so that the 300-bp product now incorporated uracil-containing tails at each end. Uracil residues were removed with the enzyme uracil DNA glycosylase by incubating the 300-bp PCR product with the enzyme for 30 min at 37°C, and the PCR products were annealed to the BRL plasmid pAMP1. Two microliters from the annealing reaction were used to transform competent DH5α E. coli, which were grown at 37°C for 40 min, then plated on nutrient agar with ampicillin overnight.

Restriction Mapping and DNA Sequencing

DNA was isolated from 48 colonies and analyzed by restriction enzyme digestion with EcoRI and BamHI. All samples screened contained inserts of the proper size (300 bp). One of the DNA preparations was precipitated with polyethylene glycol (6.5% PEG, 0.4 M NaCl) and sequenced using the U.S. Biochemical ΔTaq Cycle-Sequencing kit. The cloning site in pAMP1 is flanked by the SP6 and T7 promoters. Primers to these promoters were end-labeled with 32P-γ-ATP and used for dideoxy sequencing with 100 ng of DNA template. Samples were separated by electrophoresis on 8% denaturing acrylamide gels and the nucleotide sequence of the PCR product was assembled using the IBI MacVector program. The C. elegans PCR product contained a single AvaII restriction site. All DNA preparations were then digested with AvaII and BamHI at 37°C for 2 hr. Fragments were separated by native acrylamide gel electrophoresis and stained with ethidium bromide.

Screening of the cDNA Library

A mixed-population C. elegans oligo(dT)-primed cDNA library (generously provided by Jørgen Johansen) was screened as follows. Approximately 10^6 λZAPII phage (Stratagene) were plated on bacterial lawns of E. coli BB4 cells on 10 NZY plates. Plaques were transferred to nitrocellulose filters, which were screened in duplicate with the two C. elegans endoprotease cDNAs labeled by random priming with 32P-α-dCTP. Positive plaques were picked and rescreened. E. coli were coinfected with λZAPII phage from the positive plaques and defective helper phage. Phagemid DNA was excised in vivo from the λZAPII phage as described in the Stratagene cDNA kit. Plasmid DNA was isolated from several positive clones and PCR-sequenced using the GIBCO-BRL DNA sequencing kit. To sequence the entire length of both strands of the cDNA clone, deletions were produced with exonuclease III and mung bean nuclease after restriction digest. Deletions were also obtained by restriction digests at unique sites within the cDNA insert. Deletion products were religated, subcloned, and sequenced. Fragments not available by deletion were sequenced with custom-synthesized oligonucleotides.

RACE

The 5' end of CELPC2 was obtained by rapid amplification of cDNA ends (RACE). Total C. elegans RNA (~1 µg) was used as template in reverse transcription with the primer RACE-1 (5'CCGTCATCCATAATCGC3'), which hybridizes to the mRNA in a region near the 5' end of the incomplete cDNA clone. The first-strand cDNA synthesis reaction mix [20 mM Tris-HCl, pH 8.4, 50 mM KCl, 2.5 mM MgCl2, 100 µg/ml bovine serum albumin (BSA), 10 mM dithiothreitol (DTT), 100 nM RACE-1 primer, 500 µM dNTP, and 10 U/µl SuperScript reverse transcriptase] was incubated at 42°C for 30 min and at 55°C for 5 min. After RNase H treatment (10 min at 55°C) and purification, terminal transferase (10 U/µl) was used to add a poly(C) tail to the 3' end of the first-strand cDNA. This antisense strand was used as template in a PCR reaction involving the RACE-2 primer (5'CAUCAUCAUCAUGTTGTGATGTTCTTTCC3', just upstream of RACE-1) and the BRL Anchor primer, which hybridized to the poly(C) tail of the first strand. The PCR reaction mix (20 mM Tris-HCl, pH 8.4, 50 mM KCl, 2.5 mM MgCl2, 100 µg/ml BSA, 200 nM RACE-2 primer, 200 µM dNTP, and 0.05 U/µl Taq DNA polymerase) was incubated for 35 cycles of 1 min at 94°C, 1 min at 43°C, and 1 min at 72°C. The PCR products were then annealed to the vector pAMP1 in the presence of uracil DNA glycosylase at 37°C for 30 min. Plasmids were transformed into DH5α E. coli, and colonies were picked for small-scale DNA preparations. Ten clones containing 600-bp inserts were identified by restriction enzyme analysis with EcoRI and BamHI. Three clones were sequenced and contained identical sequences. Primers to SP6 and T7 promoters and four sequence-specific primers were used to obtain the complete sequence of both strands of the RACE clones.
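The placement of the two RACE primers can be illustrated by locating their binding sites on the sense strand. The sketch below is an illustration under stated assumptions rather than part of the published method: the excerpt is the row of the CELPC2 cDNA ending at nucleotide 571 in Fig. 1, the RACE-2 sequence is used without its 5' (CAU)4 cloning tail, and the helper name `binding_site` is hypothetical. Both primers are antisense, so each site is found by searching for the reverse complement of the primer.

```python
def reverse_complement(seq):
    """Reverse complement of an unambiguous DNA sequence."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


# Sense-strand excerpt of the CELPC2 cDNA (row ending at nucleotide 571 in Fig. 1).
CDNA_EXCERPT = "TTTACTGGAAAGAACATCACAACAGCGATTATGGATGACGGAGTGGATTACATGCATCCGGAT"

RACE_1 = "CCGTCATCCATAATCGC"
RACE_2_CORE = "GTTGTGATGTTCTTTCC"  # RACE-2 without the 5' (CAU)4 tail


def binding_site(cdna, primer):
    """Return the 0-based, half-open (start, end) interval on the sense strand
    to which an antisense primer anneals, or None if there is no exact match."""
    site = reverse_complement(primer)
    start = cdna.find(site)
    return None if start < 0 else (start, start + len(site))


race1_site = binding_site(CDNA_EXCERPT, RACE_1)
race2_site = binding_site(CDNA_EXCERPT, RACE_2_CORE)
print(race1_site, race2_site)
# A nested primer should anneal 5' of the reverse-transcription primer.
print(race2_site[1] <= race1_site[0])
```

An exact string search is sufficient here because both primers are nondegenerate; a degenerate primer would need an IUPAC-aware comparison instead.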

Northern Analysis

A 30-µg sample of C. elegans total RNA was separated by electrophoresis on a denaturing 1.5% agarose gel containing 6.5% formaldehyde. After electrophoresis, RNA was transferred to a nitrocellulose filter. The filter was dried and UV cross-linked. Hybridization was conducted for approximately 16 hr [in 50% formamide, 5x standard saline citrate (SSC), 4x Denhardt's reagents, 0.1% sodium dodecyl sulfate (SDS), 20 mM sodium phosphate, and 100 µg/ml denatured salmon sperm DNA] at 42°C with a 600-bp DNA probe, from the active site of CELPC2, which was labeled with 32P-α-dCTP by random priming. The filter was then washed at 60°C (in 0.1x SSC and 0.1% SDS) and exposed to Kodak X-OMAT AR film.

In Situ Hybridization

The in situ protocol was modified from that published by Ray and McKerrow (1992) and the Boehringer Mannheim protocol (Gerhard Dahl kindly supplied the reagents). A digoxigenin-UTP-labeled antisense RNA probe was prepared by in vitro transcription using the CELPC2 cDNA clone as template. Plasmid DNA was linearized by digestion with BglII and RNA was synthesized with T7 phage RNA polymerase. This probe represented 1000 bp of the 3' end of the CELPC2 cDNA, corresponding to the P domain (the region downstream of the catalytic domain) and the 3' nontranslated region but excluding the catalytic domain. As a control, 1 µg of pSPT.18 Neo vector DNA provided by Boehringer was in vitro transcribed as before, producing a nonspecific probe.

C. elegans were harvested by sedimentation at 4°C overnight and centrifugation in 60% sucrose for 5 min at 500g. The pellets were resuspended in 0.1 M NaCl and centrifuged for 2 min at 500g. Nematodes were fixed in 4% formaldehyde in phosphate-buffered saline (PBS) with 50 mM EDTA for 30 min, followed by 10 min in methanol. Fixed nematodes were rehydrated in a graded series of PBS and treated for 20 min in 100 µg/ml of proteinase K in PBS at 37°C. Permeabilized worms were washed twice in 2 mg/ml glycine in PBS for 1 min and three times in PBS for 1 min. Worms were then postfixed in 4% formaldehyde as before.

Fixed worms were transferred in a graded series to prehybridization solution (5x SSC, 1x Denhardt's reagents, 50% formamide, 0.2 mg/ml tRNA) and incubated at 60°C for 2 hr. After denaturation at 95°C for 10 min, the probe (~50 ng/ml) was hybridized to the nematodes for approximately 16 hr at 60°C. Control worms were hybridized with the nonspecific probe. After washing in PBS the nematodes were incubated in 1% blocking solution in PBS for 1 hr. This solution was then replaced with 150 mU/ml antidigoxigenin sheep polyclonal antibodies conjugated to alkaline phosphatase in 1% blocking solution for 1 hr. After washing in PBS and in alkaline phosphatase (AP; Boehringer Mannheim) buffer, the worms were resuspended in 375 µg/ml nitroblue tetrazolium salt (NBT) and 167 µg/ml 5-bromo-4-chloro-3-indolyl phosphate (X-phos) in AP buffer. Worms were stained for approximately 16 hr in the dark. After washing in PBS, the worms were transferred to glycerol in a graded series and mounted for microscopy. Micrographs were obtained with a Zeiss WL photomicroscope fitted with a Nikon FX-35WA camera.

RESULTS

Clone Isolation

Reverse transcription, degenerate/nested PCR, cloning, and restriction enzyme analysis resulted in the identification of two types of partial cDNA clones, and one clone of each type was selected for DNA sequencing. The DNA sequences and deduced amino acid sequences were different for the two types of clones, but both had similarities to the conserved regions of known Kex2-like endoproteases. A search of GenBank produced no identical amino acid sequences from published prokaryotic or eukaryotic peptides. Both clones were used to screen a C. elegans cDNA library. Several cDNA clones were isolated and a 2-kb cDNA was selected for sequencing. This cDNA clone contained most of the coding sequence for a Kex2-like endoprotease but lacked approximately 500 bp of 5' coding sequence. The missing 5' region was obtained by rapid amplification of cDNA ends (RACE).


The full-length cDNA is 2527 bp long (Fig. 1). The 5' leader sequence is 22 bp long and is identical to the leader sequence reported for three C. elegans actin mRNAs and thought to be trans-spliced from the 5' end of a 100-bp RNA encoded by sequences found within the spacer DNA of the 5S RNA genes (Krause and Hirsh, 1987). The 3' nontranslated sequence is 546 bp long, ending with at least 24 adenosines, and the resulting coding sequence is 1959 bp long. The poly(A) addition signal (AATAAA) is located 16 bases upstream of the poly(A) tail. Codon usage is consistent with the literature (reviewed by Wood, 1988). Proline bias in CELPC2 is as follows: CCA, 77%; CCG, 15%; CCC, 4%; and CCT, 4%. Glycine bias is as follows: GGA, 91%; GGT, 4%; and GGC, 2%. This cDNA sequence has been submitted to GenBank with accession number U04995.

The deduced prepropeptide is 652 amino acids long. The predicted signal peptide is 22 amino acids long with a hydrophobic core, with cleavage probably occurring at alanine-22 (following the rules of Von Heijne, 1983); thus the predicted propeptide is 630 amino acids long. There are two possible tetrabasic sites for propeptide cleavage: at arginine-79 and -107. The site most likely to be cleaved by furin is arginine-107, since the consensus sequence (Hatsuzawa et al., 1992) is conserved at this position in all known PC2 sequences. Thus, the predicted mature protein is 545 amino acids long, including a subtilisin-like catalytic domain and no transmembrane domain. A segment of 22 amino acids just downstream from the active site and a segment of 9 amino acids just upstream of the active site in CELPC2 are not found in the vertebrate endoproteases (Fig. 2).

The deduced amino acid sequence for the CELPC2 cDNA is very similar to the known Kex2-like endoproteases, especially at conserved regions within the active sites, but not identical to any one of them (Fig. 2). For example, the amino acid sequence contains the HGTRCAGE, VGVAY, and GIRML motifs characteristic of subtilisin endoproteases. A search of protein databases revealed significant homologies of CELPC2 to known invertebrate and vertebrate PC2 sequences. Overall, CELPC2 is ~65% similar to Aplysia californica PC2, Lymnaea stagnalis PC2, Xenopus laevis PC2, human PC2, mouse PC2, rat PC2, and Sus scrofa PC2. The homology of CELPC2 to other PC2 sequences within the catalytic domain is 85-90%. The overall similarity of CELPC2 to human furin is only 42%, and that to human PC1 is 40%. No perfect match was found for the CELPC2 amino acid sequence during this search.

Northern Analysis

A single prominent band, approximately 2.4 kb, was observed on the autoradiograph (Fig. 3), between the two C. elegans rRNA bands (1.75 and 3.5 kb). This approximates the size of the full-length cDNA (2527 bp).

In Situ Hybridization

Nematodes hybridized with the 1000-bp CELPC2 antisense digoxigenin-labeled RNA probe showed signal in the pharyngeal region and in the tail (Fig.


TTTAATTACCCAAGTTTGAGGT ATG AAA AAC ACA CAT GTC GAC CTA ATA TGT GTG TTC CTG TCG ATT  67
                       Met Lys Asn Thr His Val Asp Leu Ile Cys Val Phe Leu Ser Ile  15
TTC ATC GGC ATT GGT GAG GCG GTC GAC GTC TAC ACC AAC CAT TTC CAT GTT CAT TTA AAA GAG  130
Phe Ile Gly Ile Gly Glu Ala Val Asp Val Tyr Thr Asn His Phe His Val His Leu Lys Glu  36
GGA GGT GGA CTG GAA GAT GCG CAT CGG ATA GCC AAA CGT CAC GGA TTT ATT AAT AGA GGA CAA  193
Gly Gly Gly Leu Glu Asp Ala His Arg Ile Ala Lys Arg His Gly Phe Ile Asn Arg Gly Gln  57
GTT GCA GCA AGT GAT AAT GAA TAT CAT TTT GTG CAA CCA GCA CTT GTT CAT GCA CGA ACC AGA  256
Val Ala Ala Ser Asp Asn Glu Tyr His Phe Val Gln Pro Ala Leu Val His Ala Arg Thr Arg  78
AGA TCA GCA GGT CAT CAT GCT AAA CTT CAT AAT GAT GAT GAG GTT CTT CAC GTC GAG CAG CTG  319
Arg Ser Ala Gly His His Ala Lys Leu His Asn Asp Asp Glu Val Leu His Val Glu Gln Leu  99
AAA GGA TAC ACC CGT ACA AAG CGA GGA TAT CGT CCG CTT GAA CAG CGA CTG GAA AGC CAG TTT  382
Lys Gly Tyr Thr Arg Thr Lys Arg Gly Tyr Arg Pro Leu Glu Gln Arg Leu Glu Ser Gln Phe  120
GAC TTT TCA GCA GTC ATG TCT CCA TCG GAT CCA CTT TAT GGA TAT CAG TGG TAC TTG AAA AAC  445
Asp Phe Ser Ala Val Met Ser Pro Ser Asp Pro Leu Tyr Gly Tyr Gln Trp Tyr Leu Lys Asn  141
ACT GGA CAA GCT GGC GGA AAA GCT CGT TTG GAT TTG AAT GTG GAA AGG GCT TGG GCA ATG GGA  508
Thr Gly Gln Ala Gly Gly Lys Ala Arg Leu Asp Leu Asn Val Glu Arg Ala Trp Ala Met Gly  162
TTT ACT GGA AAG AAC ATC ACA ACA GCG ATT ATG GAT GAC GGA GTG GAT TAC ATG CAT CCG GAT  571
Phe Thr Gly Lys Asn Ile Thr Thr Ala Ile Met Asp Asp Gly Val Asp Tyr Met His Pro Asp  183
ATT AAG AAC AAC TTT AAT GCC GAA GCT TCA TAC GAT TTC TCA TCA AAT GAT CCA TTC CCA TAT  634
Ile Lys Asn Asn Phe Asn Ala Glu Ala Ser Tyr Asp Phe Ser Ser Asn Asp Pro Phe Pro Tyr  204
CCA AGA TAC ACT GAT GAC TGG TTC AAT TCC CAC GGA ACT CGA TGC GCC GGA GAA ATC GTT GCC  697
Pro Arg Tyr Thr Asp Asp Trp Phe Asn Ser His Gly Thr Arg Cys Ala Gly Glu Ile Val Ala  225
GCC CGT GAC AAC GGT GTC TGT GGA GTC GGA GTT GCC TAT GAC GGA AAG GTT GCT GGA ATC AGA  760
Ala Arg Asp Asn Gly Val Cys Gly Val Gly Val Ala Tyr Asp Gly Lys Val Ala Gly Ile Arg  246
ATG CTT GAT CAA CCC TAC ATG ACC GAT CTC ATT GAA GCC AAT TCA ATG GGA CAC GAG CCG AGC  823
Met Leu Asp Gln Pro Tyr Met Thr Asp Leu Ile Glu Ala Asn Ser Met Gly His Glu Pro Ser  267
AAA ATT CAT ATC TAC TCT GCT TCA TGG GGA CCG ACT GAT GAT GGA AAG ACT GTT GAT GGA CCA  886
Lys Ile His Ile Tyr Ser Ala Ser Trp Gly Pro Thr Asp Asp Gly Lys Thr Val Asp Gly Pro  288
CGT AAT GCC ACT ATG CGA GCA ATC GTC AGG GGT GTC AAC GAG GGA CGT AAC GGA CTT GGA TCC  949
Arg Asn Ala Thr Met Arg Ala Ile Val Arg Gly Val Asn Glu Gly Arg Asn Gly Leu Gly Ser  309
ATT TTT GTT TGG GCC AGT GGA GAC GGA GGA GAG GAT GAT GAT TGT AAT TGT GAT GGA TAT GCG  1012
Ile Phe Val Trp Ala Ser Gly Asp Gly Gly Glu Asp Asp Asp Cys Asn Cys Asp Gly Tyr Ala  330
GCA AGC ATG TGG ACA ATC TCG ATC AAT TCA GCC ATT AAC AAT GGA GAA AAT GCT CAC TAC GAT  1075
Ala Ser Met Trp Thr Ile Ser Ile Asn Ser Ala Ile Asn Asn Gly Glu Asn Ala His Tyr Asp  351
GAA TCT TGT TCA TCA ACC CTT GCT TCC ACT TTC TCC AAT GGA GGA CGA AAC CCA GAA ACC GGA  1138
Glu Ser Cys Ser Ser Thr Leu Ala Ser Thr Phe Ser Asn Gly Gly Arg Asn Pro Glu Thr Gly  372
GTC GCT ACT ACC GAT CTC TAC GGA AGA TGC ACC CGA TCT CAC TCC GGA ACT TCC GCT GCT GCT  1201
Val Ala Thr Thr Asp Leu Tyr Gly Arg Cys Thr Arg Ser His Ser Gly Thr Ser Ala Ala Ala  393
CCA GAA GCG GCA GGA GTC TTT GCC CTT GCT CTT GAA GCC AAT CCA TCT CTT ACC TGG AGA GAT  1264
Pro Glu Ala Ala Gly Val Phe Ala Leu Ala Leu Glu Ala Asn Pro Ser Leu Thr Trp Arg Asp  414
CTC CAA CAC TTG ACA GTT CTC ACT TCA AGC AGA AAC TCC CTG TTT GAC GGA AGA TGC CGT GAC  1327
Leu Gln His Leu Thr Val Leu Thr Ser Ser Arg Asn Ser Leu Phe Asp Gly Arg Cys Arg Asp  435
TTC CCA TCT CTC GGA ATC AAC GAT AAC CAC CGC GAC TCT CAT GGA AAC TGT TCT CAC TTT GAA  1390
Phe Pro Ser Leu Gly Ile Asn Asp Asn His Arg Asp Ser His Gly Asn Cys Ser His Phe Glu  456
TGG CAA ATG AAT GGA GTT GGA CTC GAG TAC AAT CAC CTT TTC GGA TTT GGA GTT CTT GAT GCA  1453
Trp Gln Met Asn Gly Val Gly Leu Glu Tyr Asn His Leu Phe Gly Phe Gly Val Leu Asp Ala  477
GCA GAG ATG GTC ATG TTG GCA ATG GCC TGG AAG ACT TCC CCA CCA CGT TAT CAC TGT ACT GCT  1516
Ala Glu Met Val Met Leu Ala Met Ala Trp Lys Thr Ser Pro Pro Arg Tyr His Cys Thr Ala  498
GGA CTC ATT GAC ACT CCA CAT GAG ATC CCA GCC GAT GGA AAC TTG ATT CTT GAA ATT AAT ACT  1579
Gly Leu Ile Asp Thr Pro His Glu Ile Pro Ala Asp Gly Asn Leu Ile Leu Glu Ile Asn Thr  519
GAT GGA TGT GCC GGA TCT CAA TTC GAA GTC CGC TAC CTG GAA CAT GTT CAA GCA GTT GTC TCG  1642
Asp Gly Cys Ala Gly Ser Gln Phe Glu Val Arg Tyr Leu Glu His Val Gln Ala Val Val Ser  540
TTC AAC TCG ACT CGT CGT GGA GAT ACC ACT CTC TAC TTG ATC TCT CCA ATG GGA ACC CGT ACC  1705
Phe Asn Ser Thr Arg Arg Gly Asp Thr Thr Leu Tyr Leu Ile Ser Pro Met Gly Thr Arg Thr  561
ATG ATT CTT TCC CGC AGA CCA AAG GAT GAT GAT TCA AAG GAT GGA TTC ACC AAC TGG CCA TTC  1768
Met Ile Leu Ser Arg Arg Pro Lys Asp Asp Asp Ser Lys Asp Gly Phe Thr Asn Trp Pro Phe  582
ATG ACA ACA CAC ACA TGG GGA GAG AAT CCA ACA GGA AAA TGG AGA CTT GTC GCC AGA TTC CAA  1831
Met Thr Thr His Thr Trp Gly Glu Asn Pro Thr Gly Lys Trp Arg Leu Val Ala Arg Phe Gln  603
GGA CCT GGA GCT CAT GCC GGA ACC CTT AAG AAG TTC GAG CTG ATG TTG CAC GGA ACA AGA GAA  1894
Gly Pro Gly Ala His Ala Gly Thr Leu Lys Lys Phe Glu Leu Met Leu His Gly Thr Arg Glu  624
GCC CCA TAC AAT CTC ATC GAG CCA ATC GTC GGT CAA ACC AAC AAG AAG CTC GAC ACC GTT CAA  1957
Ala Pro Tyr Asn Leu Ile Glu Pro Ile Val Gly Gln Thr Asn Lys Lys Leu Asp Thr Val Gln  645
AAA GCC CAC AAA CGC AGC CAC TAA ATGTACAAAATTCCCCAATTTCTTCCTTCCTTCAAAAAATTTCTAAGAATT  2032
Lys Ala His Lys Arg Ser His ***  652
TCCTCTTCCTCCTCCTCTTTTTAACTAGTTGAAAATGAAACAGATCATCGAGCTTCCCAGTCGACCCGCCCCCAATATCCGGT  2116
GATCCGTATTGTTGTAGTCGCTTCGGGTGCCGCACACAGATTTACAGATTTTTATATTTTGTAAGTATATTCCATCAAATCGT  2200
CCATCACCACCATTCGGCCCACCCATTTCTTCTTTTTTCTATTTCTCCCTCTTCCTTTTTTTCTCTCCCCTTCATATTTTTAT  2284
TGTCAGTTTTTCGACCCGATCATCACATCACAATTGTTAAATGAGCTGAGCCACACACACACAAACACAAATGACAGACACAA  2368
AACATCATTACCATCTGAAACGAAAACCGAAAAGAGCCCTTAAAATCTCTCTCACTCCATCATCACACCATCTTCATTCGCCA  2452
TGCAAATCCCTGCGCGCACAAGTCGAACCACCCATAATAAATCGTTCGGAATTGTTAAAAAAAAAAAAAAAAAAAAAAAA  2527

Fig. 1. Nucleotide and deduced amino acid sequences of CELPC2 cDNA. Double arrow indicates predicted signal peptide cleavage site. Single arrows indicate predicted tetrabasic cleavage sites. The catalytic active sites are shown in boldface letters. The poly(A) addition signal is underlined. Figures on the right indicate nucleotide (top) and amino acid (bottom) position numbers. This sequence was submitted to GenBank with accession number U04995.
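The lengths reported in the Results for the sequence in Fig. 1 are mutually consistent, which can be confirmed with simple arithmetic. The sketch below uses only the numbers stated in the text (22-bp leader, 546-bp 3' nontranslated region, 652-residue prepropeptide, 22-residue signal peptide, propeptide cleavage after arginine-107); the function and variable names are illustrative.

```python
LEADER_BP = 22          # 5' leader sequence
UTR3_BP = 546           # 3' nontranslated sequence, including the poly(A) run
PREPRO_AA = 652         # deduced prepropeptide
SIGNAL_AA = 22          # predicted signal peptide (cleavage after Ala-22)
PRO_CLEAVAGE_AA = 107   # predicted propeptide cleavage after Arg-107


def derived_lengths(prepro_aa, signal_aa, pro_cleavage_aa):
    """Return (coding_bp, propeptide_aa, mature_aa) implied by the inputs."""
    coding_bp = 3 * prepro_aa + 3            # sense codons plus one stop codon
    propeptide_aa = prepro_aa - signal_aa    # after signal peptide removal
    mature_aa = prepro_aa - pro_cleavage_aa  # after propeptide removal
    return coding_bp, propeptide_aa, mature_aa


coding_bp, propeptide_aa, mature_aa = derived_lengths(
    PREPRO_AA, SIGNAL_AA, PRO_CLEAVAGE_AA)
full_length_bp = LEADER_BP + coding_bp + UTR3_BP

print(coding_bp, propeptide_aa, mature_aa, full_length_bp)
```

The three derived values and the summed cDNA length can then be compared with the 1959 bp, 630 residues, 545 residues, and 2527 bp given in the text.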

Fig. 2. Alignment of the amino acid sequences. C. elegans PC2 (CPC2), human PC2 (HPC2) (Smeekens and Steiner, 1990), snail PC2 (LPC2) (Smit et al., 1992), human PC1 (HPC1) (Seidah et al., 1991), and human furin (HFUR) (van den Ouweland et al., 1990). Gaps introduced to optimize the alignment are denoted by dashes. Putative N-glycosylation sites, filled circles; RGD motif, filled rectangle; PC2-specific motifs, asterisks.
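Pairwise comparisons of the kind summarized for Fig. 2 are made column by column over an alignment in which gaps are written as dashes. The sketch below computes percent identity over the ungapped columns of two aligned sequences. It is a simplified stand-in: the similarity figures quoted in the text may also count conservative substitutions, which identity does not, and the two aligned strings are hypothetical toy data, not taken from the figure.

```python
def percent_identity(seq_a, seq_b, gap="-"):
    """Percent identical residues over alignment columns without a gap."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(seq_a, seq_b):
        if a == gap or b == gap:
            continue  # skip columns where either sequence has a gap
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Hypothetical aligned fragments, for illustration only.
ALIGNED_A = "NSHGTRCAGE-IVAA"
ALIGNED_B = "NRHGTRCAGEVAAAA"
print(round(percent_identity(ALIGNED_A, ALIGNED_B), 1))
```

Excluding gapped columns from the denominator is one common convention; other conventions divide by the full alignment length or by the length of the shorter sequence, and they give different percentages for the same alignment.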

Fig. 3. Size determination of CELPC2 mRNA by northern analysis. Approximately 30 µg of C. elegans total RNA was separated by electrophoresis in a 1.5% agarose gel containing 6.5% formaldehyde and transferred to a nitrocellulose filter overnight. The filter was then UV cross-linked and hybridized with a 32P-α-dCTP-labeled cDNA probe. Relative migration of C. elegans rRNA is indicated on the right. The 28S rRNA is 3.5 kb and the 18S rRNA is 1.75 kb. Relative migration of RNA standards (7.5, 4.4, 2.4, and 1.4 kb) is indicated on the left.
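Transcript size on a northern blot is commonly estimated from a standard curve in which the logarithm of fragment size is approximately linear in migration distance. The sketch below fits such a curve by least squares and interpolates an unknown band. The standard sizes are those shown in Fig. 3, but the migration distances are hypothetical placeholders, since the figure itself is not reproduced here; the text does not state which fitting procedure the authors used.

```python
import math


def fit_log_linear(distances_mm, sizes_kb):
    """Least-squares fit of log10(size) = intercept + slope * distance."""
    logs = [math.log10(s) for s in sizes_kb]
    n = len(distances_mm)
    mean_d = sum(distances_mm) / n
    mean_l = sum(logs) / n
    sxx = sum((d - mean_d) ** 2 for d in distances_mm)
    sxy = sum((d - mean_d) * (l - mean_l) for d, l in zip(distances_mm, logs))
    slope = sxy / sxx
    intercept = mean_l - slope * mean_d
    return intercept, slope


def estimate_size_kb(distance_mm, intercept, slope):
    """Size (kb) predicted by the standard curve at a migration distance."""
    return 10 ** (intercept + slope * distance_mm)


STANDARDS_KB = [7.5, 4.4, 2.4, 1.4]                   # RNA standards in Fig. 3
HYPOTHETICAL_DISTANCES_MM = [12.0, 20.0, 29.0, 37.0]  # assumed, not measured

intercept, slope = fit_log_linear(HYPOTHETICAL_DISTANCES_MM, STANDARDS_KB)
print(round(estimate_size_kb(28.0, intercept, slope), 2))
```

With real measurements, the distances of the standards and of the CELPC2 band would replace the placeholder values.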

4). Adult worms showed intense signal in a group of cells circumscribing the isthmus and terminal bulb regions of the pharynx as well as another group in the tail. Staining was very consistent among adult nematodes but more variable in larvae. Control worms showed no significant staining (Fig. 5).

DISCUSSION

A number of prohormone convertases have been identified in vertebrates and invertebrates. A novel PC2-like convertase from the nematode C. elegans, designated CELPC2, is described herein. The deduced amino acid sequence of CELPC2 is very similar to the known PC2 sequences and contains a subtilisin-like catalytic domain. Significant homology to prohormone convertases was found near the active-site residues aspartic acid-174, histidine-215 (in the conserved region HGTRCAGE), and serine-390. Other significant regions of homology include the IYSASWGP (271-277), SIFVWASG (309-316), and LTWRD (409-414) motifs within the active site. PC2-specific residues are found in CELPC2, such as serine-214, proline-251, methionine-253, threonine-279, arginine-289, and aspartate-317. An alignment of all known endoproteases (a total of 20) shows that these residues are found only in the PC2-type endoproteases. There are four putative N-glycosylation sites in CELPC2 (Fig. 2). Of these sites, two are unique to CELPC2 (asparagine-167 and -451), one is conserved in invertebrate PC2 (asparagine-290), and one is conserved in vertebrate PC2 (asparagine-542). The RGD motif, thought to be a receptor recognition signal for cellular matrix proteins, is present in CELPC2. However, the function of this conserved region, as well as many others, remains to be determined. The structure of CELPC2
