THE JOURNAL OF BIOLOGICAL CHEMISTRY

Vol. 267, No. 6, Issue of February 25, pp. 4000-4007, 1992 Printed in U.S.A.

© 1992 by The American Society for Biochemistry and Molecular Biology, Inc.

Molecular Cloning and Characterization of the Structural Gene Coding for the Developmentally Regulated Lysosomal Enzyme, α-Mannosidase, in Dictyostelium discoideum*

(Received for publication, July 11, 1991)

John Schatzle‡, John Bush‡, and James Cardelli§

From the Department of Microbiology and Immunology, Louisiana State University Medical Center, Shreveport, Louisiana 71130

The gene coding for the Dictyostelium discoideum lysosomal enzyme, α-mannosidase, has been cloned and sequenced. To accomplish this, the mature 60- and 58-kDa subunits of the enzyme were purified and subjected to liquid-phase N-terminal amino acid sequencing. Sequence information was obtained for both of the mature subunits, and a 48-mer oligonucleotide was synthesized based on the determined amino acid sequence of the 58-kDa subunit. Using this oligonucleotide as a probe, an 8-kilobase HindIII fragment of genomic DNA was isolated and subjected to Sanger dideoxy DNA sequencing. The first 4400 nucleotides contained the complete α-mannosidase gene and 1100 nucleotides of 5'-flanking DNA. Primer extension analysis indicated that transcription begins at multiple sites -48 to -64 nucleotides upstream of the first nucleotide of the predicted translation initiation codon. A single open reading frame (ORF) of 3015 nucleotides was found that was interrupted by a single intron and that contained the amino acid sequences of the N termini of the two mature α-mannosidase subunits; a polyadenylation signal was also found just downstream of the termination codon. A potential cleavable signal sequence was identified in the first 22 amino acids of the predicted precursor protein, and two propeptide regions (Pro I and II) were identified that were immediately upstream of the N termini of the 60- and 58-kDa mature subunits, respectively. These propeptide regions are not present in the mature protein and are therefore predicted to be proteolytically removed as the membrane associated 140-kDa precursor is transported to lysosomes and processed to the soluble 60- and 58-kDa mature forms of the enzyme. In fact, potential proteolytic cleavage sites were identified flanking the Pro I and Pro II regions. Pro I, which immediately follows the signal sequence, consists of 18 amino acids, most of which are highly charged and hydrophilic residues, while Pro II, found in the central portion of the precursor, is very hydrophobic. While no obvious transmembrane regions were identified, several short hydrophobic amino acid stretches were found to be localized in and around the Pro II region, and these may be responsible for attachment of precursors to membranes.

In mammalian cells, lysosomal enzymes are transported and targeted to lysosomes by at least two mechanisms (1). The best characterized lysosomal enzyme targeting mechanism involves the generation of mannose 6-phosphate residues on the N-linked oligosaccharide side chains of lysosomal enzymes by the combined actions of pre-Golgi and/or Golgi localized enzymes (1). These phosphorylated lysosomal enzymes are recognized by mannose 6-phosphate receptors (MPR)¹ which shuttle these proteins from the trans-Golgi network to prelysosomal/late endosomal compartments (1). The acidic lumenal conditions within these compartments promote the uncoupling of the receptor/ligand complex which allows the MPR to recycle back to the trans-Golgi network (2).

Mechanisms for lysosomal enzyme targeting have also been described which are independent of MPRs (1). For instance, cell lines isolated from I-cell disease patients are deficient in one of the mannose 6-phosphate-generating enzymes (GlcNAc-P-transferase) and do not phosphorylate mannose residues on N-linked oligosaccharides attached to lysosomal enzymes (3-5). Therefore, lysosomal enzymes in these cells lack the mannose 6-phosphate lysosomal targeting signal, and the majority of these lysosomal enzymes are mistargeted and secreted. However, some enzymes such as acid phosphatase are targeted correctly. In addition, other tissues from these patients apparently correctly sort most of their lysosomal enzymes. Finally, transmembrane proteins, such as the lysosomal enzyme acid phosphatase and lysosomal associated membrane proteins, reach lysosomal organelles independent of MPRs in a variety of cell types (6, 7).

Another MPR-independent pathway operates in yeast where vacuolar enzymes are transported and targeted to lysosomal-like vacuoles by a mechanism that is independent of carbohydrate side chains (8). Soluble vacuolar proteins such as carboxypeptidase Y and proteinase A contain vacuolar sorting information within the primary amino acid sequence of their propeptide domains (9-11). However, no consensus sequence has been found between the two propeptide regions of these proteins (8). Additionally, studies on the plant vacuolar storage protein, sporamin, showed that the propeptide

* Research by the authors was supported by Grant DK 39232 (to J. C.) from the National Institutes of Health and by funding from the LSUMC-S Center for Excellence in Arthritis and Rheumatology. The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.

The nucleotide sequence(s) reported in this paper has been submitted to the GenBank™/EMBL Data Bank with accession number(s) M82822.

‡ These two individuals contributed equally to this work and should be considered senior co-authors.

§ To whom all correspondence should be addressed: Dept. of Microbiology and Immunology, LSU Medical Center, 1501 Kings Hwy., Shreveport, LA 71130. Tel.: 318-674-5756.

¹ The abbreviations used are: MPR, mannose 6-phosphate receptors; kb, kilobase(s); ORF, open reading frame; Pro I and II, propeptide regions I and II.


Characterization of the Lysosomal α-Mannosidase Gene


region of the precursor protein was required for correct vacuolar targeting (12). This "pro" region has no apparent similarity to either carboxypeptidase Y or proteinase A propeptide sequences, but it is unusual because it contains many charged and hydrophilic amino acids (14) which may be important for vacuolar targeting in plants. Therefore, MPR-independent lysosomal targeting pathways exist in these systems, but they are not as well characterized as the MPR-dependent targeting pathway.

Dictyostelium discoideum is a haploid eukaryotic organism that can be manipulated biochemically and genetically, has an extensive lysosomal system, undergoes a relatively simple developmental cycle, and contains no detectable MPRs. Therefore, this organism can serve as a useful system to study both the developmental regulation of lysosomal proteins as well as alternative pathways for lysosomal enzyme transport and targeting (reviewed in Refs. 13 and 14). In axenically growing cells, α-mannosidase (one of three well characterized lysosomal enzymes in this organism) is synthesized on membrane-bound polysomes, cotranslationally inserted into the endoplasmic reticulum and N-glycosylated to yield a final membrane-bound glycoprotein of 140 kDa (15, 16). The precursor polypeptide is then transported to the Golgi complex where N-linked oligosaccharides are modified by the addition of phosphate and sulfate groups (16, 17). These modifications are probably not necessary for the correct localization of D. discoideum lysosomal enzymes (23). A small percentage of the protein is constitutively secreted in precursor form, while the remainder is proteolytically processed into an 82-kDa soluble intermediate and a 58-kDa soluble mature form of the protein (16, 18, 19). This initial cleavage, which occurs in the late Golgi and/or endosomal compartments (16, 20, 21), may be required for correct localization of enzymes to lysosomes (22). The 82-kDa intermediate form is then processed first to a 80-kDa form which is then cleaved to a 20-kDa peptide and a 60-kDa mature subunit. These events occur within the dense acidic lysosomal compartments, and the last cleavage is catalyzed by a cysteine proteinase (22). Although the maintenance of acidic vacuolar compartments is required for complete processing of lysosomal enzymes, low pH is not essential for correct sorting of the hydrolases (40). The mature holoenzyme that results from this processing is a lysosomally localized heterotetramer composed of equimolar amounts of the 60- and 58-kDa subunits (18).

The expression of the gene coding for α-mannosidase is regulated during development of D. discoideum, a process induced by starvation which culminates in the formation of two differentiated cell types. The accumulation of α-mannosidase activity begins immediately upon starvation of growing cells (at titers less than 10" cells/ml) using bacteria as a food source, while in contrast, the gene is fully induced in cells growing axenically at titers greater than 10'' cells/ml (24-27). This increase in enzyme-specific activity during development is paralleled by an increase in both the rate of synthesis of the precursor and an increase in the cellular concentration of α-mannosidase mRNA (26, 27). The gene is regulated at the level of transcription (27) in response to a protein termed the prestarvation factor (28) which is produced and secreted by both growing and developing cells. However, bacteria may interfere with the detection of prestarvation factor by cells, thus accounting for the lack of expression of α-mannosidase in cells using bacteria as a food source.

To begin to determine the molecular factor(s) essential for targeting of α-mannosidase to lysosomes and the regulation of α-mannosidase expression, the gene coding for this protein has been cloned and sequenced. An analysis of the gene and the single long ORF reveals a number of findings including: 1) the location of two propeptide domains positioned in front of the mature subunits in the precursor protein, 2) several regions in the precursor that could interact with membranes, 3) multiple potential endoproteolytic cleavage sites, and 4) several DNA sequence elements upstream of the 5' transcription start site that could act as regulatory elements.

EXPERIMENTAL PROCEDURES²

² Portions of this paper (including "Experimental Procedures" and Figs. 1, 6, and 9) are presented in miniprint at the end of this paper. Miniprint is easily read with the aid of a standard magnifying glass. Full size photocopies are included in the microfilm edition of the Journal that is available from Waverly Press.

RESULTS AND DISCUSSION

Isolation of α-Mannosidase cDNA and Genomic Clones-To describe further the molecular mechanisms regulating lysosomal enzyme targeting in Dictyostelium, the gene coding for the lysosomal enzyme, α-mannosidase, was isolated by using an oligonucleotide probe generated in the following fashion. α-Mannosidase 60- and 58-kDa mature subunits were purified using a combination of size-exclusion, ion-exchange, and hydroxylapatite column chromatography, subjected to sodium dodecyl sulfate-polyacrylamide gel electrophoresis, and electroblotted to Immobilon-P filters. These filters were stained with Coomassie Blue, and the mature forms of the protein were excised and subjected to liquid phase N-terminal amino acid microsequencing. Fig. 1 shows the blotted and stained mature forms of α-mannosidase (60 and 58 kDa) and the resulting N-terminal amino acid sequences derived from each subunit.

FIG. 2. Southern and Northern blot analysis. Northern blot analysis (panel A) was performed as described under "Experimental Procedures" using 2 µg of poly(A) RNA from axenically growing cells (lane 2) or from cells using bacteria as a food source (lane 1). Blots were probed with the ³²P-end-labeled 48-mer oligonucleotide. Southern blots (panel B) were performed as described under "Experimental Procedures" using genomic DNA cut with EcoRI (lane E) and HindIII (lane H). Blots were probed with a nick-translated 2.3-kb α-mannosidase cDNA insert.

The N-terminal amino acid sequence information derived from the mature 58-kDa subunit was used to synthesize a 48-mer oligonucleotide that recognized a 3.6-kb mRNA when used in Northern blot analysis (Fig. 2). Based on the molecular mass of the primary translation product of α-mannosidase (120 kDa), we predicted that the α-mannosidase mRNA would be 3.0-3.6 kb in size. Additionally, this mRNA was present in axenically growing cells (lane 2) but was absent in cells using bacteria as a food source (lane 1) which agrees with previous studies concerning the regulation of α-mannosidase during growth and development (25). This oligonucleotide was used to screen a cDNA library (constructed in Lambda ZapII using poly(A) mRNA isolated from cells developing for 2 h) which resulted in the isolation of four cDNA clones. The largest clone (M4) contained an EcoRI insert of 2.3 kb and hybridized


to a 3.6-kb mRNA possessing the same expression pattern predicted for α-mannosidase (data not shown). Sequence analysis of the M4 cDNA clone revealed an ORF of 2240 nucleotides that contained the 58-kDa N-terminal peptide sequence, confirming that this was an α-mannosidase-specific cDNA. However, this sequence analysis also revealed that the cDNA was missing both the 5' and 3' ends of the full-length cDNA, and using the M4 cDNA as a probe in screening a variety of cDNA libraries failed to result in the isolation of a full-length cDNA.

Southern blot analysis of D. discoideum genomic DNA digested with EcoRI indicated the α-mannosidase 2.3-kb M4 cDNA hybridized to a single band of 2.3 kb (panel B of Fig. 2) consistent with the presence of two internal EcoRI sites in the α-mannosidase gene. In contrast, this cDNA probe hybridized to a single band of 8 kb when genomic DNA was digested with HindIII (Fig. 2). Therefore, in order to clone the full-length gene, genomic DNA was digested with HindIII and size-fractionated on sucrose gradients. Those fractions containing the 8-kb DNA fragment that hybridized to the α-mannosidase cDNA probe (as revealed by Southern blot analysis) were pooled and cloned into the HindIII site of pBluescript to generate a subgenomic library. This library was screened with the α-mannosidase cDNA, and a genomic clone with an insert of 8 kb was isolated. Restriction enzyme analysis of this clone and the α-mannosidase cDNA revealed that they were identical throughout the 2.3-kb EcoRI fragment corresponding to the original cDNA isolate (Fig. 3).

Sequencing and Characterization of the α-Mannosidase Genomic Clone-The 8-kb genomic clone was double-digested with HindIII and EcoRI, and the resulting three fragments (Fig. 3) were individually subcloned into pBluescript. Nested exonuclease III-mung bean nuclease deletions were generated for each of the subcloned fragments. These templates were then sequenced using the Sanger dideoxy sequencing method,


and Fig. 3 indicates the sequencing strategy used. Analysis of the sequence revealed an ORF of 3015 nucleotides beginning with an ATG (the adenosine is designated as nucleotide position +1 in Fig. 4) that conforms to the consensus sequence for D. discoideum translation initiation codons (AAAATGG versus AXAATGG for the consensus). Primer extension analysis using an end-labeled oligonucleotide complementary to positions +35 to +66 revealed that transcription begins around nucleotides -64 to -48 and that a TATA box is found centered at nucleotide -87 (Fig. 5). The presence of multiple bands could represent multiple start sites of transcription or premature termination or slippage of reverse transcriptase due to the high A-U content of the 5'-untranslated region of the mRNA. An intron begins at nucleotide position +157 in the transcribed portion of the gene and ends at nucleotide +294. This intron contains the consensus splice donor and acceptor sites for D. discoideum introns (63) and splits the N-terminal peptide sequence of the 60-kDa mature subunit. The open reading frame begins again at nucleotide position +295 and ends with a stop codon at nucleotide +3154 that is followed by a consensus polyadenylation signal (AAUAAA, underlined in Fig. 4). The translated ORF predicts a polypeptide of 1005 amino acids with a calculated molecular mass of 114 kDa, which is in close agreement with the size of the α-mannosidase primary translation product of 120 kDa. The amino acid sequences for both the 60- and 58-kDa N termini are found in the translated ORF (underlined in Fig. 4), confirming this is the gene coding for lysosomal α-mannosidase. There are 12 potential N-glycosylation sites, consistent with the observation that the cellular 140-kDa precursor contains 9-10 N-linked oligosaccharides (15).
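The coordinates quoted above can be cross-checked with a little arithmetic. The following Python sketch is illustrative only and is not part of the reported analysis. It assumes that the first exon ends at +156 (one nucleotide before the intron start at +157), that +3154 is the first nucleotide of the stop codon, and that an average residue mass of about 110 Da is an acceptable rule of thumb. The names `EXONS`, `AVG_RESIDUE_MASS_DA`, `coding_length`, and `summarize` are hypothetical.

```python
# Exon ranges are inclusive and use the paper's numbering (+1 = A of the ATG).
# Assumptions: the intron spans +157..+294, and the stop codon starts at +3154,
# so the last coding nucleotide is +3153.
EXONS = [(1, 156), (295, 3153)]

# Assumption: rule-of-thumb average mass per amino acid residue, not a value
# taken from the paper (which reports a calculated mass of 114 kDa).
AVG_RESIDUE_MASS_DA = 110.0


def coding_length(exons):
    """Return the total number of coding nucleotides over inclusive exon ranges."""
    return sum(end - start + 1 for start, end in exons)


def summarize(exons):
    """Return (coding nucleotides, predicted residues, rough mass in kDa)."""
    nt = coding_length(exons)
    if nt % 3 != 0:
        raise ValueError("coding length is not a whole number of codons")
    residues = nt // 3
    approx_kda = residues * AVG_RESIDUE_MASS_DA / 1000.0
    return nt, residues, approx_kda


if __name__ == "__main__":
    nt, residues, approx_kda = summarize(EXONS)
    print(f"coding nucleotides: {nt}")
    print(f"predicted residues: {residues}")
    print(f"rough mass estimate: {approx_kda:.1f} kDa")
```

Under these assumptions the two exons contribute 156 and 2859 nucleotides, which sum to the stated 3015-nucleotide ORF and 1005 codons. The rough mass estimate falls somewhat below the 114 kDa calculated from the actual sequence, as expected for an average-residue approximation.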
Sequence analysis of the approximately 1100-nucleotide 5'-flanking region of the α-mannosidase gene revealed a high percentage of A and T nucleotides, consistent with what is observed for regions flanking other D. discoideum genes (Fig.


FIG. 3. The dideoxy DNA sequencing strategy for the complete α-mannosidase gene. Panel A represents the restriction map of the genomic and cDNA clones of α-mannosidase. Restriction enzyme sites are designated as follows: EcoRI, R (panel A) and E (panel B); SpeI, S; HindIII, H; ClaI, C. Sanger dideoxy DNA sequencing was performed using either double- or single-stranded DNA as templates. The arrows with closed circles represent sequencing using oligonucleotides as primers, while the straight arrows represent sequencing using various exonuclease deletion clones. The α-mannosidase open reading frame is indicated by the enlarged hatched box within the 8-kb HindIII genomic fragment. An intron exists within the gene and splits the open reading frame as indicated by a white box. The positions of the initial methionine (ATG), internal EcoRI sites (R), HindIII boundary of the genomic fragment (H), and the stop codon TAA (*) are also presented in this schematic figure.


[FIG. 4 sequence panel: nucleotide sequence of the α-mannosidase gene, numbered from position -1066, with the predicted amino acid translation shown beneath the coding region.]

FIG. 4. The complete DNA sequence of the α-mannosidase gene. The numbers to the left designate the amino acid positions from 1 to 1005 of the ORF, while the numbers on the right represent the nucleotide position with the +1 position corresponding to the first nucleotide of the translation initiation codon. The amino acid sequences determined for the 60- and 58-kDa mature protein N termini are underlined in the ORF, while the stop codon is represented by an asterisk. The predicted polyadenylation signal (AATAAA) is also underlined.

4). However, there are several areas (e.g. nucleotide positions -403 to -387 and -193 to -155) that are relatively G/C-rich. Interestingly, the nucleotide sequence centered at position -174 contains several motifs (TTGNTTG) that have been shown to be important in the developmental regulation of

other D. discoideum genes (59).

The α-Mannosidase Precursor Is Composed of Five Domains-Fig. 7 is a schematic diagram of the organization of the α-mannosidase precursor polypeptide including a prediction of the regions that are hydrophobic and that contain


propeptides because they are removed post-translationally and are not part of the mature forms of the α-mannosidase enzyme.

Previous studies have shown that the α-mannosidase precursor is tightly associated with membranes, therefore suggesting it may be a transmembrane protein (16, 44, 45). However, other than the N-terminal signal peptide, the hydrophilicity profile of α-mannosidase failed to reveal a hydrophobic amino acid stretch longer than 15 amino acids which could constitute a membrane spanning domain (43) (Fig. 7). In addition, the C-terminal region of the precursor is not hydrophobic, suggesting the absence of a signal for attachment of a glycolipid moiety (64). Interestingly, Fig. 7 shows that the α-mannosidase amino acid sequence does contain several smaller stretches (less than 15 residues) of hydrophobic amino acids predicted to form β-sheet structures located in or near the Pro II region. These regions could conceivably play a role in association of the precursor with endoplasmic reticulum and Golgi membranes.

FIG. 5. Primer extension analysis to determine the start site of transcription. A ³²P-end-labeled oligonucleotide complementary to nucleotides +35 to +66 was hybridized to poly(A) RNA isolated from axenically growing cells (lane 2) and extended using reverse transcriptase. Lane 1 contains reactions using tRNA. The lanes labeled G, A, T, C represent a sequencing reaction with ss DNA using the same oligonucleotide as a sequencing primer.

The locations of predicted secondary structure in the α-mannosidase precursor were determined using the programs of Chou and Fasman (34), and Robson and Garnier (35). This analysis revealed that there are only four regions in the protein greater than 10 amino acids long that are predicted to form α-helices, while in contrast there are 16 regions greater than 10 amino acids long predicted to form β-sheets. This suggests that the protein may have few predicted globular domains and is probably not very compact.

[FIG. 7 schematic. Labels: ss, Pro II, hydrophobic, pro/gly rich region; axis: Amino Acid # (508, 1005).]

The first 22 amino acids at the N terminus of the α-mannosidase precursor have the characteristics of a signal sequence (46). Two lysines are found near the beginning of the sequence followed by a stretch of hydrophobic amino acids (Fig. 8A). The signal sequence does not have any of the features of other uncleaved signal sequences that act to anchor proteins to membranes, and the "-1, -3 rule" of von Heijne predicts cleavage to occur at the glycine residue at position +22 (43, 46).

The Pro I domain lies between the signal sequence and the start of the 60-kDa mature protein subunit at position 23-40 and is unusual since it contains predominantly charged and hydrophilic amino acids (Fig. 8A). For instance, 7 of the 18 amino acids are lysine residues; in fact, almost 20% of the