Margaret S. Robinson ... hydroxylapatite columns (Pearse and Robinson, 1984). ...... at the Laboratory of Molecular Biology for advice on techniques; and John.
Cloning of cDNAs Encoding Two Related 100-kD Coated Vesicle Proteins (c -Adaptins) M a r g a r e t S. R o b i n s o n Medical Research Council Laboratory of Molecular Biology, Cambridge CB2 2QH, England
Abstract. Coat proteins of *100-kD (adaptins) are components of the adaptor complexes which link clathrin to receptors in coated vesicles. The c~-adaptins, which are found exclusively in endocytic coated vesicles, separate into two bands on SDS gels, designated A and C (Robinson, M. S., 1987. J. Cell Biol. 104:887-895). Two distinct cDNAs (sequences 1 and 2) encoding the two oL-adaptins were cloned from a mouse brain cDNA library. Southern blotting indicates that there is one copy of each of the two a-adaptin genes, and that there are no additional closely related genes. Based on the size of the predicted protein products of the two genes (108 and 104 kD), the relative abundance of the two messages in brain and liver, and
the reactivity of a sequence 1 fusion protein with different antibodies, it was possible to conclude that sequence 1 codes for A and sequence 2 for C. The two protein sequences are strikingly homologous to each other (84% identical amino acids), the major difference being an additional stretch of 41 amino acids, rich in prolines and acidic residues, inserted into the COOH-terminal half of A. In situ hybridization carried out on mouse brain sections indicates that the same cell type may express both transcripts, but that their relative expressions vary. Antipeptide antibodies are now being raised to find out whether the proteins are localized in functionally distinct populations of endocytic coated vesicles.
OST eukaryotic cells contain at least two types of clathrin-coated pits and vesicles. Coated vesicles that bud from the plasma membrane are involved in receptor-mediated endocytosis: they concentrate certain receptors and their bound ligands, allowing these proteins to be efficiently internalized. Endocytic coated vesicles also participate in the recycling of secretory granule membrane, and thus are particularly abundant in neuronal cells (Heuser and Reese, 1973). A second type of coated vesicle is associated with the Golgi apparatus and is thought to play a part in the targeting of newly synthesized proteins to lysosomes and secretory granules (Griffiths and Simons, 1986). In addition to clathrin, receptors, and ligands, coated vesicles contain clathrin-associated proteins, which have at various times been called assembly proteins (Zaremba and Keen, 1983), 100K/50K complexes (Pearse and Robinson, 1984), and, most recently, adaptors (Pearse, 1988). The adaptors are believed to mediate the interaction between clathrin and the cytoplasmic tails of receptors (reviewed by Pearse and Crowther, 1987), and therefore to determine which membrane proteins are to be recruited into a coated vesicle for transfer to another membrane compartment of the cell and which are to remain behind. Two types of adaptors have so far been identified, called HA-I and HA-II because of their relative elution from hydroxylapatite columns (Pearse and Robinson, 1984). Both have been shown to consist of a heterodimer of proteins of •100 kD, recently named adaptins (Pearse, 1988), and two
associated smaller proteins, all apparently at a 1:1:1:1 molar ratio (Keen, 1987; Ahle et al., 1988). Each adaptor contains one copy of a fl-adaptin (fl or fl) and one copy of a specific adaptin: an o~-adaptin in HA-II and a 3'-adaptin in HA-I (Ahle et al., 1988). The ot-adaptins are the best characterized so far. Two major t~-adaptin bands can be seen on SDS gels of bovine brain HA-H adaptors, and these have been designated bands A and C (band B, which runs in between the two t~-adaptin bands, is the fl-adaptin). A and C have been shown to be closely related to each other on the basis of peptide mapping, NH2-terminal sequencing, and monoclonal antibody cross-reactivity. However, while C appears to be expressed in all tissues, A has only been detected in brain (Robinson and Pearse, 1986; Robinson, 1987). Antibodies against the adaptins in the HA-II adaptor provided the first clue that different adaptors might be found in different types of coated vesicles. Cells that were stained with polyclonal antisera which reacted with both or- and /3-adaptins showed labeling of both plasma membrane and Golgi-associated coated vesicles, although different antisera gave somewhat different patterns (Robinson and Pearse, 1986). However, cells that were stained with monoclonal antibodies against the two ct-adaptins (A and C), which did not cross-react with/~, showed labeling exclusively of plasma membrane-associated coated vesicles (Robinson, 1987). Subsequently, Ahle et al. (1988) stained cells with monoclonal antibodies they had raised against fl and 3'- They showed that fl is found in both Golgi and plasma membrane-associ-
M
© The Rockefeller University Press, 0021-9525/89/03/833/10 $2.00 The Journal of Cell Biology, Volume 108, March 1989 833-842
833
ated coated vesicles, while 3' occurs only in the Golgi subset of coated vesicles. These results indicate that adaptor HA-II, containing a heterodimer of/3 and either aA or O~c, docks onto the plasma membrane, while HA-I, containing a heterodimer of/3' and % is targeted to the Golgi apparatus. Such a mechanism could allow the cell to sort different sets of receptors into coated vesicles budding from different membrane compartments. Many questions still remain to be answered. For instance, what are the precise functions of the different proteins making up the adaptor complexes? Are there additional types of coated vesicles in the cell, and if so, do they contain different adaptors? Why is there a special type of HA-II in brain, conraining A instead of C? Are A and C distinct gene products, or are they alternatively spliced products of the same gene? Are they localized in different cells, or in different parts of the same cell? As a first step toward addressing some of these questions, I have cloned and sequenced mouse brain cDNAs coding for adaptins aA and ac, the two related proteins found in the HA-II adaptor, cDNA clones encoding a/3-adaptin have also been isolated; these are now being sequenced and characterized by P. Parham and co-workers and will be described in a subsequent paper.
Materials and Methods Isolation and Characterization of Clones Adaptins tXAand txc were purified from bovine brain coated vesicles by "Iris extraction followed by a combination of gel filtration, hydroxylapatite chromatography, preparative gel electrophoresis, and SDS/hydroxylapatite chromatography, as described (Robinson and Pearse, 1986). The pooled A- and C-containing fractions from the last column ('x,500/~g protein) were dialyzed against 20% ethanol containing 0.5 mM 2-mercaptoethanol, lyophilized, and dissolved in 500 #l 70% formic acid. Proteolysis was carried out with 'x,5 mg CNBr for 4 h at room temperature, after which the sample was lyophilized, dissolved in 100 #l 0.1% trifluoroacetic acid, and applied to a C18 Nucleosil microbore HPLC column. The peptides were eluted with a gradient of 0-50% CH3CN in 0.1% trifluoracetic acid, and the peak fractions were collected manually. Sequencing was performed with a gas phase sequencer (model 89013; Applied Biosystems, Inc., Foster City, CA) with on-line HPLC identification of the phenyithiohydantoin-derivitized amino acids. Two separate mixed oligonucleotides were made for peptide 34 (see Table I): 5' AGCTGCTTCCAGCGCTGGAAGAAGTCCTGGCTTGCCAT 3' A T T A G 5' AGCTGCTTCCAGCGCTGGAAGAAGTCCTGGGATGCCAT 3'. A T T T G Recombinant DNA techniques were generally those described by Maniatis et al. (1982). A bovine brain eDNA library constructed in )~gtl0 was kindly donated by A. E Jackson and E Parham (Department of Structural Biology, Stanford University School of Medicine, Stanford, CA) (Jackson et al., 1987). Oligonucleotides were end labeled with 32p and used to screen 5 x 10s plaques, washing the filters in 2 x SSC at 42°C. Positive clones were plaque purified, digested with Eco RI to determine the size of the inserts, and Southern blotted onto nitrocellulose. Radioactive probes were made from the different inserts either by nick translation or by subcloning into M13 mpl9 and synthesizing a radiolabeled second strand which could then be cut with a suitable enzyme and gel purified. A hgtll mouse brain eDNA library, kindly provided by Y. Citri (Weizmann Institute of Science, Rehovot, Israel) (Martinez et al., 1987), was screened with the radiolabeled bovine brain inserts. Selected mouse brain inserts obtained from this screen were subcloned into a Bluescript plasmid vector (Stratagene Cloning Systems, San Diego, CA) for restriction mapping. Sequencing was carried out on consecutive and overlapping restriction
The Journal of Cell Biology, Volume 108, 1989
fragments subcloned into M13 mpl8 or mpl9, using the dideoxy chain termination method (Sanger et al., 1977). To sequence the entire coding region of both cDNAs in both directions, synthetic oligonucleotides were used as primers to fill in the gaps.
Northern and Southern Blots For Northern blotting, poly A-containing RNA was purified from mouse brain and mouse liver by the lithium/urea method (Auffray and Rougeon, 1980), subjected to electrophoresis (15 #g/lane), and blotted onto Hybond paper. The blots were probed with M13 fragments and washed at 65°C in 0.1x SSC. POly A-minus RNA from each of the two tissues was used as a control. Labeling of the tissues was quantified by cutting out the lane of interest and measuring the radioactivity, and then subtracting the radioactivity present in a control lane as background. The mean ratios from three separate experiments were taken. Southern blots were performed on genomic DNA purified from mouse liver, using nick-translated restriction fragments as probes. Low stringency labeling was carded out by incubating with the probe in 35% formamide at 42°C and washing in 2x SSC at 65°C, while the high stringency incubation was done in 50% formamide at 42°C, followed by a wash in 0.1x SSC at 65°C.
Production of Fusion Protein The hgtll phage containing both sequence 1 and 2 inserts were streaked out onto a lawn of Y1090 host cells. Induction of fusion protein and labeling of the filters were carried out as described by Huynh et al. (1983). The primary antibodies used for screening were a mixture of the monoclonal antibody AC1-Mll (Robinson, 1987) and an affinity-purified rabbit polyclonal antiserum against total HA-II 100-kD proteins, prepared essentially as described (Robinson and Pearse, 1986). Phage that produced a signal were then used as lysogens to infect Y1089 cells (Huynh et al., 1983). Crude cell lysates were subjected to electrophoresis and blotted, and the blots were labeled with ACI-MI1 (Robinson, 1987).
In Situ Hybridization Freshly dissected adult mouse brains were frozen in dry ice, and 12-#m sections were cut using a cryostat. The sections were treated essentially as described by Goedert (1987), except that they were l~0cked with acetic anhydride after the pronase digestion step, and extracted with chloroform after serial dehydration in ethanol. Hybridization was carried out using single stranded 3:p-labeled M13 probes in a hybridization mix containing 50% formamide for 16-20 h at 42°C, and the slides were washed in 0.1× SSC at 60°C. The probes were prepared by subcloning Hind III/Eco R1 fragments (bases 2,719-3,298 for gene 1 and bases 2,449-3,096 for gene 2) into both mpl8 and mpl9 and synthesizing a labeled second strand which was then cut with Eco R1 (mpl8) or Hind III (mpl9). The mpl9 probes were thus in the antisense direction, while the mpl8 probes were in the sense direction and were used as controls. The slides were exposed on Fuji x-ray film for 3 d to make contact prints, and were then dipped in emulsion and exposed for 2-4 wk before being developed.
Results Isolation of Two Distinct cDNAs The approach that was taken for cloning a-adaptin cDNAs was to use oligonucleotidcs based on peptide sequences to screen a bovine brain eDNA library. The NHvterminal sequences of both proteins have been determined and are known to be identical (Robinson, 1987). For internal sequencing, the c~-adaptins were separated from/3-adaptin by hydroxylapatite chromatography in the presence of SDS (Robinson and Pearse, 1986), digested with CNBr, and chromatographed on an HPLC column. Although there was some overlap in the elution profile, five of the peptide peaks gave useful sequence, and in some cases, both a major and a minor sequence could be determined. Table I lists the six sequences that were obtained.
834
Table L Comparison of Bovine Peptides and Mouse Protein Sequences Deduced from the cDNAs NH2 terminus/Peptide 10: Sequence 1: Sequence 2:
P P P
A A A
V V V
S S S
K K K
G G G
S? D D
G G G
Peptide21: Sequence 1: M Sequence 2: M
G C G
L L L
A A A
L L L
H H H
N? C C
I I I
A A A
N N N
V V V
G G G
S S S
R R R
E E E
Peptide34': Sequence 1: M Sequence 2: M
D D D
S S S
V V V
K K K
Q Q Q
S S S
A A A
S A A
L L L
E C C
L L L
L L L
R R R
L L L
Y Y Y
Peptide26': Sequence 1: M Sequence 2: M
G G G
E E D
N W W
T T T
A A S
R R R
V V V
V V V
H H H
L L L
L L L
N N N
Peptide39: W? Sequence 1: M C Sequence 2: M C
T T T
L L L
A A A
S S S
S S S
E E E
F F F
S S S
H H H
E E E
A A A
V V V
K K K
T T T
H H H
I I I
E D E
T T T
V V V
Peptide34: Sequence 1: M Sequence 2: M
S A S
Q Q Q
D D D
F F F
F F F
Q Q Q
R R R
W W W
K K K
Q Q Q
L L L
S S S
L L N
P P P
Q L Q
Q Q Q
E E E
V A V
Q Q Q
A A A
I I I
N N N
A A A
L L L
K K K
Peptides 34 and 39 were both used to design synthetic oligonucleotides, but only peptide 34, which later was found to be nearer the COOH terminus of the protein, proved useful for library screening. A mixed oligonucleotide was made from this sequence, incorporating the first 12 amino acids plus a methionine codon at the 5' end. Nine positive clones were obtained, with inserts ranging in size from 0.5 to 1.2 kb. Interestingly, they appeared to fall into two categories. A probe made from the 0.5-kb insert hybridized well to itself but poorly to all the other inserts, while a probe made from one of the other inserts hybridized well to all but the 0.5-kb insert. When some of the inserts were sequenced, the reason became apparent: they coded for two different but closely related proteins, both of which contained sequences nearly identical to that of the peptide. Sequence 1 (Table I) differs from peptide 34 in that it has an alanine substituted for the serine at position 2 and an alanine for the valine at position 19; while sequence 2 has an asparagine substituted for the leucine at position 14. (The substitution seen at position 16 in sequence 1 does not occur in the bovine cDNA; Table I gives the mouse sequences 1 and 2.) For all but the ser/ala substitution, both amino acids could in fact be detected on the sequencer. The most likely explanation for this finding is that the cDNAs are the products of two genes, one of which codes for A and the other for C. To obtain the complete sequences of the two proteins, a mouse brain cDNA library was used because it had longer inserts. The bovine brain cDNA inserts were used as probes, and 20-30 positive clones were isolated for each of the two transcripts. The sequences of the two mouse cDNAs are shown in Fig. 1, as well as their predicted protein products. A single screen yielded a clone containing the entire coding sequence of cDNA 2, while for cDNA 1, the 5' end of the coding region was obtained by rescreening the library with a Pst 1 fragment (bases 443-1,012) made from an insert obtained in the first screen. Restriction mapping was carried out on the longest inserts to look for possible polymorphisms
(e.g., alternative splicing). Although all of the inserts obtained for cDNA 2 produced similar restriction maps, 4 of the 15 cDNA 1 inserts that were mapped appeared to differ at their 3' or 5' ends. Sequencing these inserts indicated that they were the result of incomplete (rather than alternative) splicing: typical splice donor or acceptor sites were seen, and the putative introns contained in-frame stop codons. Although the two cDNAs were cloned from mouse, all of the bovine peptide sequences can be easily identified (see Table I and Fig. 1). These include the NH2-terminal sequence, beginning with the proline at amino acid position 2. The proline is immediately preceded by methionine in both cases, which probably serves as the initiator and is then cleaved off in the mature protein. The few slight discrepancies between the peptide and the cDNA sequences could be due either to species differences or to contaminating peptides coeluting from the HPLC. Moreover, some of the CNBr fragments of A and C may have eluted at slightly different positions, causing some of the peptide sequences (e.g., sequence 261 to correspond more closely to sequence 1 and others (e.g., sequence 39) to correspond more closely to sequence 2.
Robinson c~-Adaptin cDNAs
835
Sequence Analysis Three protein databases were searched for proteins homologous to those encoded by the two cDNAs, but no significant matches appeared. However, as shown in Fig. 2, the two proteins are remarkably homologous to each other: 783 out of 934 amino acids are identical, and most of the substitutions are conservative ones. The homology is particularly striking towards the NH2-terminus: only a single substitution (lys for arg) is seen for over 100 amino acids. Least homologous is the proline-rich region between amino acids 650 and 750 of sequence 1, where an additional stretch of 41 amino acids, including 11 prolines and 10 acidic residues, occurs in sequence 1 with no correlate in sequence 2. Partial sequences have also been obtained for the 3' ends of the two bovine cDNAs. Out of 93 amino acids in bovine
-205
~-*,..~n.~.~G
190
R
P
A
V
S
g
91 A / ~ 31 $ R
I
N
$
E L A N
O
O
O
g
R
O
L
A
V
F
I
S
D
I
R
$
K
F
g
G
D
K
A
E
I
C I T G G A ~ T ~ I G T G T G T L O G Y S $ K g Y V
I
R
N
C
181 ~ N ; C I G C ' I ~ " I " I C ~ ' K ~ T G ~ C A 61 K L L r I F L L G
H
D
I
D
' I G C ~ G F G H M E A V N L L S
271 91
K
$
C
G S
g
E
A
M P A V ~ K G D G N R G L A V F I S D I R N C K S K E A E I 3 1 K R I N K E L A N I R S K F K G D K A L D G Y S R K K Y V C
C ~ NK YTE
6 1 K L L F I F L L G H D I D F G f l M E A V N L L S S N R Y T E
K
O
I
G
¥
L
F
I
8
V
L
V
N
S
N
S
E
L
~"CGGCITATCAACAACC,C~IV.lU~.IAA~CCK; I R L ! N N A l K N O L
271AAGCAGA~rrr~_TC'I~._ _ _ _ 9 1 K Q I G Y L F I S V L V N S N S E L I R L I N N A I K N D L
361 ~ 121 A
$
R
N
P
T
F
N
C
L
~ A
L
H
C
3
A
N
V
G
8
R
E N G E A r
361 1 2 1 A S R N P T F M G L A L N C I A N V G S N E M A E A F A G ~
451 A T T C C ~ 153 I P R
I
S
A
A
L
C
L
~
G V V
S
p
451ATCCCCA~ ~ A ~ r ~ r ~ 1 8 1 3 P N I L V A G D T M D S V K O S A A L C L L R L Y R T S p
S
L
541 G A C ~ 1 8 I D L V P N G D W T S R V V H L L N D Q H L G V V T A A T S L
S
S
A
2 1 1 I T T L A Q K N P E E F g T S V S L A V S R L S R I V T S &
C
Y
p
p
2 4 1 8 T D L Q D Y T Y Y F V P A P W L S V K L L R L L O C Y p p
K
S
N
K V
811 2 7 1 P D P & V R G R L T E C L E T I L N K A O E P P K S R R V Q
T
L
k
S
S
991 ~'-~'~i 3 3 1 N O L G O F L Q N R E T N L R Y L A L E S ~ C T L A S S E F
S
V
R
O O
L
V
A
G
D
S
N
D
S
V
K
~
541 C~CI'IGGIGCCC~ 181 O L V P N
G
£
W
T
A
R
V
V
8
L
LN
211
I
T
C
L
C
K
K
S
P
D
O
F
g
T
C
I
S
L
A
V
8
R
L
S
m
l
v
241
S
T
D
L
0
O
I
T
Y
Y
F
V
P
A
P
W L
S
v
K
L
L
R
L
L
O
811 C(~7~GGA 271 P E D A
A
V
X
G
R
L
V
E
C
L
E
T
V
L
N
K A
0
E
P
P
991 331
G
O
r
L
0
N
R
E
T
N
L
N
¥
L
A
L
S
N
C
1081 T T C T C C ~ C C ~ C~.A~C CACATI~AC~T~ 361 F $ H £ A V K T N I D T V
I
N
A
CY_AACJ~CC~C~ L g T £ R D
V
D O N
L
A A D
R
L
y
K
A
T A A V
3 0 1 N S N A K N A V L r E A I S L Z I H H D S E P N L L V R A C C
N
O
L
E
-GG~uGCT $ A A
1171 C ~ T G C C A 391 D L L Y A
i~i~,-~,~CCGC~CK3AT~CCAAGCAC~-*-I..;-n~~-d,~.~GA 15 C O R S N A K O 1 V S E 14 L
R
Y
L
£
T
A
1261 C ~ 421 R E
V
L
W
Y
v
O
T
T C C ~ I L N L
D
T Y V
1351 ~ 451 R
I
E
I
A
G
g
1441 A A C ~ C A G I G T T T ~ ~ 481 g T V F E A L 1531 ~ 511 2
A
G
1621 841
L
L
STY
1711 571
NA
~ D
DV
V
A
S
~
g
L
E
~ V W
A
V
L
~ O I
D
Y
N
V
K
V
G
G
S
L
L
n
S
$
F
N
~.-ia.-z-~.a.~. L F P
E
T
g
A
T
l
0 G V L R A G 8
S
S V A $
P
A
P
_ P
V
O
I
N
r
I
N
.
L O Q R A V
E
E
E
tx.~-ix,~ E S S 3
8
N
D
"
L
A
R
O
V
E
N
A
LTL L
T
R
L
~ K R
P
T
g
. ~
S
T
V
G
TC.CCA~. A I
O
Y
I
rZ ~ C ~ C ~ L O E F G N
C L
C
S
~ V
0
¥
D
, .
Y
v
$
F
A
S
V
A
N
O
S
0~
Y
x E N
R
~
g R
, C
1801 Cs~A,;~~ ~ ~ , ~ ~ 601 E N P P F P
E Y
x
r P
I
A
T
Y
G V
S
A
R A
G
S
A
L
T
P
P
8
A
I 7 1 C T ~ - ~ G ~ C C A ~ T T ~ 3 ~ I L L Y A N C D R S N A Q Q I V A E N L S Y L E T A D Y S I R
1
2 6 1 r ~ G C . . ~ A A ~ ~ 4 2 1 £ E I V L g V A I L A E K Y A V D Y T W Y V D T I L N L I R
A
1351 A ~ TCAA TC,C/GC~I~ 4 5 1 1 A ~ D Y V S E E V W Y R V I Q I V I N R D D V Q G Y A A K
L
4 8 1 T V F E A L Q A P A C N E N L V K V G G Y I L G E F G N L I
L
1531 G C I ~ ~ $ 1 1 A G D P R S S P L I Q F N L L f l S K F N L C S V P T R A L L 1621 r a x , ~ z ' ~ % ~ v - ~ z ~ u u ~ ~ 8 4 1 L S T Y I K F V N L F P E V K A T I O D V L R S D N O L K N
£
1711 5 7 1 A D V E L O O R A V E Y L R L S T V A S T D I L A T V L N E
D
D
1801 0 0 1 R P P F P E R E S S I L A K L K K K K G P S T V T O L E E T
D
L
8 9 1 ~ C ¥ ~ a x , ~ n . a ~ O l ~ E 6 3 1 8 R E R S I D V N G G P E P V P A S T S A A S T P N P S A D
A
2071 ~ ' i ~ * ~ * v ~ 6 9 1 A S & V A P L A P G S E D N F A R r V C K N N G V L r E N Q
L
~ & A
~ I S H N A V K T H I E T V I N A L R T E E D V S V R Q E A V D I
I
OLR
r*x,u T D V L A T V J P
E
C
1 631
S
R
R
O
T
8
I
N
G
P
661
L
O
L
K A A P
P
PAA
2071 891
A
0
P
8
L
O
P
T
P
2161 721
D
P
A
P
A
A
D
P
-*x,~-* G P E
D
I
G
P
~
Z
P E A
2251 AM~.,*'.,~-~-us"..ra~.w* 781 N S G V L F
E
N
0
0
I
G
V
N
S
E
~.~'~,~t T V V
N
P
2341 781
G
811 841 871
N
K
T
S
V
R
Y
A
A
0
V
O
G
O
S
V
N
F
E
Y
O
O
T
A
A
O
D
F
F
~
$
A
K
K
_ A
V
S
2701 901A
E V T
2791 ATCA 931 I I
O
_ T
2881 961
E
P
S
K
0
F
E
L
0
N
PPA E A r
L
P V G G N L
8
E
L
E
L
L V D V
P
A I)
E
r
E
Q
N
G
O
L O T 0
P
P
£ F
S
D G P T
S
P
N
A
L
L
L
L
N
K
r
v
c
N
x,h
G
R
N
Y
L
F
T Y
E
~
I
L
L
G
L
A
V
P
P
A
P
T
G
P
P
P
S
S
G
G
G
L
L
V
D
V
F
S
1
1
1
L
P
A
O
V
0
O
V
L
N
3
E
C
L
R
D
r
L
T
P
P
L
L
A
O
S
h
T
h
K
L
P
V
T
I
N
$
•
r
0
P
T
E
S
R W K ~
L
$
L
P
L
0
E
A
0
E
l
F
K h
N
O
8 7 1 G F G S A L L E Z V D P N P A N F V G A G I I N T £ T T 0 1
rn.-z L L
G
O
8
A
L
L
D
N
V
D
P
N
P
E
N
F
V
G
A
G
9 0 1 G C L L R L E P N L O A O N Y R L T L E T S K D T V S O R L
L
O
V
~,~x,* G C L
L
R
L
E
~ P N
A
0
A
0
N
Y
R
L
T
L
E
T
9 3 1 C E L L S E O F *
N
N
L
C
L
A
q
0
lax.-* r *
E
P
S
~
~
N
D
2251 ~u~z-~l-~7 7 5 1 N ~ T P T L I C A D D L Q T N L N L O T K P V D P T V D G G
2
L A V O T K
G
8 1 1 F O N V S V K L P I T L N I F F O P T E N & $ ~ D F F ~ N W 2521 ~ 8 4 1
~ N
~
L
S
N
P
~
£
V
~
N
_
I
F
S
A
R
N
P
N
D
T
~
I
T
R
A
g
I
T~ATT I
-*'n~*~ F
E
L
2971 2971 3061 ~ ' ~ a ~ ' ~ ' P ~ ' ~
3241 A
~ ~ * ~..,* a',.A.TCIT,t
"
I
T
~
-
z
~
.
,
~
u
-
~
3
~
1
~
T I G T ~
Sequence 1
Sequence 2
Figure 1. Sequences of the two mouse brain cDNAs and their predicted protein products. The underlined regions of the proteins correspond to the peptides that were sequenced (see Table I). Sequence 1 has a polyadenylation signal and a poly-A tail, and thus has a complete 3' untranslated region.
sequence 1, 90 are identical to those in mouse, while 322 out of 333 amino acids are identical in sequence 2. These findings are consistent with results obtained from monoclonal antibody studies: most but not all such antibodies made against the bovine adaptins cross-react with rat and human (Robinson, 1987). The two proteins contain some unusual features, including the proline-rich acidic part of sequence 1 referred to earlier,
The Journal of Cell Biology, Volume 108, 1989
some clusters of basic amino acids at positions 55-57 and 615-620, and a long predicted a-helical domain between amino acids 350 and 450. There are no hydrophobic stretches long enough to span a lipid bilayer, consistent with biochemical data which indicate that these are not integral membrane proteins. Several potential phosphorylation sites for either calmodulin-dependent protein kinase (e.g., amino acids 381-384) or protein kinase C (e.g., amino acids 24-26) can
836
l i:PP:A:AV:VS:KS:GKD:GG:M:DR:GG:L:MA :VRF:IG:S:LD:AI:RV:NFC:KI:SS:KD:E:~A:ERI:NK:RC:I:KN:KS:EKL:EA:N:AI:ER:SI:KKF:RK:IG:N~:KK:AE:L :DLG:AY:NS I R S K F K G D K A L D G Y S ~ I
:::::::
1
i 56
111 166
[~
SAALCLLRLYK
~
D
166
,,i
T
550 [L F p EIVlK A T I QIDIV L RIS 0IS QLIKI~I A [) V E L Q Q I{ A V E Y LIRILSlTIV A S T OIIIL A T {r L E ~" ~I {~ b 606
P ERES
S I LAKLKR
GAASA
DDS
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
728 783
--
~ ~1~ v ¢ ~ .I.1~ v i{ ~ ~ ~ ~ L L ~ "~
E F I~ Q N L G R M F liF Y G N K T SJTI~,._~L~I_N__~TIPTILZ C A DID L 0 TINILIN LL~PLY.ID P TIV D G ; i V
I S
T E AJP[V~N
I Q[~
Y 6 G TJF[Q.JNV S VIK L PII]TILIN K F F 6 P T £ MAISIO U F FI
Figure 2. Comparison of the predicted protein sequences of the products of cDNA 1 (above) and cDNA 2 (below). The two proteins are 84% identical, with regions of greater and lesser homology. The NH~ termini are especially similar; while towards, the COOH termini, an additional stretch of 41 amino acids is inserted into sequence 1. be identified in the two proteins. Although it is not known which if any of these sites are used, studies have shown that at least some of the adaptins are phosphorylated in vivo (Keen and Black, 1986; Bar-Zvi et al., 1988).
Identity of the Two Encoded Polypeptides Assuming that one of the two cDNAs codes for A and the other for C, which is which? Two lines of evidence indicate that cDNA sequence 1 codes for A and cDNA sequence 2 for C. First, the predicted molecular weight of the protein product of cDNA 1 is 107,605, while that of cDNA 2 is 104,017. Apparent molecular weights of A and C based on their mobility on SDS gels vary according to the gel system
Robinson et-Adaptin cDNAs
used, but A consistently migrates more slowly than C, with molecular weights estimated at -,112,000 for A and 105,000 for C (Pearse and Robinson, 1984; Ahle et al., 1988). Second, we have shown that C is expressed in many different tissues, while A has so far only been detected in brain (Robinson, 1987). To find out whether either of the two cloned cDNAs is brain specific, mRNA was purified from mouse brain and mouse liver, and equal amounts of each were subjected to electrophoresis and blotted. The blots were then probed at high stringency with labeled fragments from the 3' end of inserts containing sequences 1 and 2 (Fig. 3). The probe for sequence 1 labeled a single band in each tissue of,~4.5 kb, while the probe for sequence 2 labeled two bands in each tissue, of '~4 and 5.5 kb. Both probes labeled brain
837
Figure4. Westernblot of bacteria expressing sequence 1. Cells were lysogenically infected with a X-gtll phage carrying part of cDNA sequence 1 fused to the/~-galactosidase gene, and the blot was probed with a monoclonal antibody against the c~-adaptins. Positions of molecular mass standards are indicated. The highest molecular mass band labeled with the antibody is ,,o180 kD, which is the predicted size for this particular fusion protein.
Figure 3. Northern blots carried out on mouse liver and mouse brain. Poly A-selected RNA was probed at high stringency with antisense DNA from the 3' ends of the two cDNA: bases 2,719-3,298 for sequence 1 and 2,449-3,096 for sequence 2. Positions of rRNA bands are indicated.
Although a positive result would be more meaningful than a negative result for this experiment, unfortunately none of the cells infected with the sequence 2-containing phage produced any detectable antigen. Nevertheless, the finding that protein encoded by sequence 1 reacts with antibodies against the o~-adaptins provides additional and independent confirmation that the correct cDNA has been cloned.
Search for Other Genes
more strongly than liver, but when the labeling was quantified it was found that expression of sequence 2 in liver was 56% of that in brain, while expression of sequence 1 was only 9 %. Western blotting of whole tissue homogenates indicates that C is more abundant in brain than in liver (Robinson, 1987), consistent with sequence 2 coding for C; but the finding that a small amount of sequence 1, presumably coding for A, is expressed in liver as well as in brain was unexpected. It may be that the amount of A produced in liver has so far escaped detection by Western blotting. Another possible way of identifying the two cDNAs is by using monospecific antibodies to study their protein products. Since the cDNAs were cloned in the phage Xgtll, which can be used as an expression vector, there was a chance that some of the DNA might be inserted in such a way as to allow the expression of fusion protein. Accordingly, the phage were grown in a suitable host and the plaques tested for the presence of antigen. Two of the phage containing sequence 1 produced plaques that could be labeled both by an affinitypurified polyclonal antiserum and by monoclonal antibody AC1-Mll, which recognizes both A and C (Robinson, 1987). Fig. 4 shows a gel and Western blot of Escherichia coli cells after a lysogenic infection with one of the two phage, labeled with AC1-Mll. Although several labeled bands can be seen on the blot, indicative of proteolysis, the highest band runs with the mobility predicted for this particular construct. As expected, neither of the two C-specific monoclonal antibodies, which recognize different epitopes from each other (Robinson, 1987), labeled protein encoded by sequence 1.
Are there any other genes that are homologous to those coding for A and C? When the bovine brain library was screened with oligonucleotides, only eDNA sequences 1 and 2 were isolated, but it is possible that the codon bias of the oligonucleotide prevented the detection of other, related cDNAs, or that such genes might not be expressed in brain. Thus, Southern blots of mouse genomic DNA, digested with five different restriction enzymes, were probed with 800-900-bp fragments derived from the 5' ends of sequences 1 and 2 under both high and low stringency conditions (Fig. 5). At high stringency (Fig. 5, a and c), both of the probes produced fairly simple patterns, generally labeling one band strongly with weaker labeling of one or more additional bands, although only a single band is labeled in the two Eco R1 tracks. These results indicate that there is a single copy of each of the two genes in the mouse genome, but that both genes contain restriction sites for most of the enzymes. In some cases, restriction sites can be found in the coding region (for instance, both cDNAs contain Pst 1 sites near the 5' end), while in other cases, the restriction sites presumably occur in introns. At lower stringency (Fig. 5, b and d), each of the two probes labeled additional bands, but essentially all of these bands can be accounted for by cross-hybridization between the two genes. For instance, in the Hind llI tracks, both of the probes labeled a heavy band and a light band at high stringency. At low stringency, an additional band appears in each of the Hind HI tracks (Fig. 5, b and d, asterisks), and
The Journal of Cell Biology,Volume 108, 1989
838
Figure 6. Contact prints of mouse brain sections hybridized in situ with probes for sequence 1 (a and c) and sequence 2 (b and d). Two pairs of consecutive serial sections from different parts of the brain are shown. The probes were the same as those used in Fig. 3. Arrows point to part of the dentate gyms of the hippocampus (long arrows) and the habenular nucleus of the thalamus (small arrows) where differences in the intensity of the labeling can be seen. Figure 5. Southern blot of mouse genomic DNA probed at high and low stringencies. Mouse liver DNA was cut with the indicated restriction enzyme, and the blots were probed with fragments from the 5' ends of the two cDNAs: bases -205 to 708 for sequence 1 (a and b) and -150 to 683 for sequence 2 (c and d). High stringency labeling is shown in a and c; low stringency labeling is shown in b and d. Asterisks in b and d indicate bands appearing at low stringency in the Hind III track which can also be seen at high stringency with the other probe. it exactly corresponds in size to the heavier band seen at high stringency with the other probe. Because no strong new bands appear, it seems unlikely that there are any other o~-adaptin genes in mouse.
Localization of Transcripts What function is served by having two different o~-adaptins, one of which is mainly expressed in brain? One way of approaching this question would be to find out which cells in brain express sequence 1 and which express sequence 2, and whether the same cell can express both genes. To localize mRNA for each of the two sequences, in situ hybridization was carried out on mouse brain sections, using antisense DNA as probes. Two pairs of consecutive serial sections are shown in Fig. 6, labeled with probes for se-
Robinson c~-AdaptincDNAs
quence 1 (Fig. 6, a and c) and sequence 2 (Fig. 6, b and d). Both probes labeled some parts of the brain more strongly than others, with the hippocampus, cerebral eorl~, olfactory bulbs, and cerebellar grey matter particularly in evidence. These are all regions where neuronal cell bodies are concentrated, indicating that both genes are expressed in neurons. However, some differences can be seen in the two labeling patterns, one of which is indicated by the arrows. The long arrows point to part of the dentate gyrus of the hippocampus, while the short arrows point to the habenular nucleus of the thalamus. The probe for sequence 2 labeled both of these structures with about equal intensity, while the probe for sequence 1 gave relatively little labeling of the habenular nucleus. Control sections (not shown), which were incubated with sense DNA, showed only very faint uniform labeling. The same slides were dipped in emulsion for resolution at the cellular level. Fig. 7 shows another region of the hippocampus from the same sections that appear in Fig. 6, a and b. Both probes strongly labeled the cells of the hippocampus, but the relative labeling of the two different cell types, the pyramidal and granule cells, differs in Fig. 7, a and b. The probe for sequence 2 (Fig. 7 b) labeled the pyramidal cells much more strongly than the granule cells, while the probe for sequence 1 (Fig. 7 a) labeled both types of cells more equally. Thus, the two genes appear to be expressed in the
839
Figure 7. Microscopic view of the sections shown in Fig. 6, a and b. The granule cell layer (G) and pyramidal cell layer (P) of the hippocampus are indicated. Bars, 100/zm.
Previous studies on the adaptins have made use of protein chemistry and monospecific antibodies to try to define the different types of proteins and their functions. In spite of the progress that has been made, many basic questions, such as how many different adaptins there are and how they are related to each other, have remained unanswered. By cloning and comparing the cDNAs coding for proteins, one can ad-
dress such questions directly; and in addition, tools then become available for learning more about protein function. Adaptins ctA and Ctc are the first to be characterized in this way. Before the cDNAs were cloned, there seemed to be a good chance that A and C would turn out to be alternatively spliced products of the same gene. The two proteins had been shown to be closely related, with identical NH2-termini; and there were precedents for brain-specific splicing giving rise to a larger protein (Martinez et al., 1987; Jackson et al., 1987), consistent with the finding that A was only detectable in brain. However, the present study shows that the two pro-
The Journal of Cell Biology, Volume 108, 1989
840
same populations of neuronal ceils, but their relative expression differs in different cell types.
Discussion
teins are in fact encoded by two different genes. The two predicted protein sequences are highly homologous, sharing the same NH2-termini as well as the five other peptides that were sequenced; and both the similarities and the differences that were observed in the two sequences are entirely consistent with biochemical and immunological results. Three lines of evidence indicate that, of the two cDNAs that were cloned, sequence 1 codes for A and sequence 2 for C. First, sequence 1 encodes a larger polypeptide than sequence 2. Second, expression of sequence 1 in brain is about tenfold greater than in liver, while expression of sequence 2 is about twice that in liver. Finally, the protein product of sequence 1, expressed as a fusion protein, is recognized by antibodies that cross-react with A and C, but not by C-specific antibodies. A computer search showed no significant homologies between the ot-adaptins and any published protein sequences, nor has any homology been revealed so far between the sequences of the ot and/~-adaptins. One might expect 3,-adaptin to be somewhat related to oL, since it is thought to play an analogous role in the HA-I Golgi adaptor. However, it appears that A and C have no very close relatives, and thus are probably the only two a-adaptins. Southern blotting carried out under low stringency conditions, so that A and C are cross-hybridized to each other, did not reveal the presence of any new genes. Moreover, restriction mapping provided no evidence for alternative splicing within the coding region of the two genes. Thus, the observation that both A and C split into two bands on some gel systems (Ahle et al., 1988) is most likely due to posttranslational modifications such as phosphorylation. Interestingly, Northern blotting indicates that there are two species of mRNA for C. None of the clones that were isolated for C had a complete 3' end, as judged by the lack ofa poly-A tail, but one such clone had a 3' untranslated region of ,ol.5 kb; thus, this particular mRNA species must be over 4.5 kb. Since the two bands seen on Northern blots are ,o4 and 5.5 kb, it is possible that there may be alternative splicing at the 3' untranslated end. What, if any, are the functional differences between A and C? The only clue available at present is that A is primarily expressed in the central nervous system. However, although Western blotting has so far only detected A in brain and spinal cord (Robinson, 1987; and unpublished observations), Northern blots of liver mRNA show a small but reproducible amount of transcription of A as well as C. It is possible that only a subset of cells in liver express A: for instance, nerve cells, although the level of expression of A in liver seems to be too high to be attributable to nerve cells alone. In the case of brain, it has been possible to compare the overall distribution of the two messages by in situ hybridization. Both were shown to be most abundant in areas where neuronal cell bodies are concentrated. Moreover, one can conclude that the same neuronal cell types (e.g., the pyramidal cells of the hippocampus) express both genes, although the relative expression of the two genes varies according to cell type. Thus, although higher resolution studies are needed, it seems likely that the same cell can express both A and C. It is not clear how a cell would benefit from the coexpression of two closely related oe-adaptins, but one possibility is that the proteins may be used to form two functionally distinct populations of endocytic coated vesicles. For instance, one ot-adaptin may be used at the synapse and another in the rest of the cell, or
Robinson ct-Adaptin cDNAs
they may even be found in the same part of the cell but in different coated vesicles. Antibodies are now being made against synthetic peptides in order to compare the distribution of the two proteins in brain. It will also be important to find out whether A is translated in other tissues such as liver, and if so, whether it is present in the same coated vesicles as C. It is known that different adaptors associate with different membrane compartments, but precisely how they do so is not clear. Pearse has shown that adaptors bind to the cytoplasmic tails of selected transmembrane proteins, such as the receptors for mannose-6-phosphate and low density lipoprotein (Pearse, 1985, 1988), but it seems unlikely that they are anchored by this interaction alone. The same proteins that are concentrated in coated pits on the plasma membrane are apparently excluded from coated pits during their biosynthesis when they pass through the Golgi apparatus (Griffiths et al., 1985). If adaptors were targeted to a particular membrane compartment solely through their binding to receptor tails, one would expect HA-II adaptors to be recruited to the Golgi apparatus as well as to the plasma membrane. Thus, it seems that there must be another docking mechanism to ensure that adaptors always go to the appropriate organelle. Clearly, more protein chemistry is needed to determine how the different proteins in the adaptor complex associate with each other, as well as with other components of the coated vesicle such as clathrin, receptors, and possible docking sites. The ability to express the cloned cDNAs in E. coli, where they can be experimentally manipulated, should be useful for such studies. In addition, it should now be possible to learn more about adaptor function by expressing the ot-adaptin cDNAs in tissue culture cells. With such an approach, one could try to create dominant mutants, or look at the role of phosphorylafion, or find out whether the two a-adaptins are interchangeable by expressing A in cells that ordinarily only make C. Moreover, if-t-adaptin turns out to be related to the ot-adaptins, as predicted, one could construct chimeric proteins to try to find out more about how they are targeted to the appropriam membrane compartment. For although we are beginning to understand something about how adaptors function in the sorting of other proteins, we still know nothing about how they themselves are sorted. I thank Barbara Pearse for encouragement and advice; Fred Northrop for sequencing all the peptides; Tony Jackson, Peter Parham, and Yoav Citri for letting me use their eDNA libraries; Mike Runswick for help with the HPLC; Michel Goedert, Steve Hunt, Bill Wisden, and Maria Grazia Spillantini for teaching me how to do in situs; Tony Crowther for his computer expertise; Thierry Bogaert, Jon Glickman, Olof Sundin, and many others at the Laboratory of Molecular Biology for advice on techniques; and John Kilmartin for invaluable discussions. Received for publication 19 August 1988 and in revised form 28 November 1988. References
Able, S., A. Mann, U. Eichelsberger, and E. Ungewickell. 1988. Structural relationships between clathrin assembly proteins from the Golgi and plasma membrane. EMBO (Fur. Mol. Biol. Organ.) J. 7:919-929. Auffray, C., and F. Rougeon. 1980. Purification of mouse immunoglobulin heavy-chain messenger RNAs from total myeloma tumor RNA. Fur. J. Biochem. 107:303-314. Bar-Zvi, D., S. T. Mosley, and D. Branton. 1988. In vivo phosphorylation of clathrin-coated vesicle proteins from rat reticalocytes. J. Biol. Chem.
841
263:4408--4415. Goedert, M. 1987. Neuronal localization of amyloid beta protein precursor mRNA in normal human brain and in Alzheimer's disease. EMBO (Eur. Mol. Biol. Organ.)J. 6:3627-3632. Griffiths, G., S. Pfeiffer, K. Simons, and K. Matlin. 1985. Exit of newly synthetized membrane proteins from the trans cisternae of the Golgi complex to the plasma membrane. J. Cell Biol. 101:949-964. Griffiths, G., and K. Simons. 1986. The trans Golgi network: sorting at the exit site of the Golgi complex. Science (Wash. DC). 234:438--443. Heuser, J. E., and T. S. Reese. 1973. Evidence for recycling of synaptic vesicle membrane during transmitter release at the frog neuromuscular junction. J. Cell Biol. 57:315-344. Huynh, T. V., R. A. Young, and R. W. Davis. 1983. Constructing and screening cDNA libraries in ),gtl0 and hgtl I. In DNA Cloning. Vol. 1. D. Glover, editor. IRL Press Ltd, Oxford, UK. 49-78. Jackson, A. P., H.-F. Seow, N. Holmes, K. Drickamer, and P. Parham. 1987. Clathrin light chains contain brain-specific insertion sequences and a region of homology with intermediate filaments. Nature (Lond.). 326:154-159. Keen, J. H. 1987. Clathrin assembly proteins: affinity purification and a model for coat assembly. J. Cell Biol. 105:1989-1998. Keen, J: H., and M. M. Black. 1986. The phosphorylation of coated membrane proteins in intact neurons. J. Cell Biol. 102:1325-1333. Maniatis, T., E. F. Fritsch, and J. Sambrook. 1982. Molecular Cloning: A Laboratory Manual. Cold Spring Harbor Laboratory, Cold Spring Harbor, NY.
545 pp. Martinez, R., B. Mathey-Prevot, A. Bernards, and D. Baltimore. 1987. Neuronal pp6(F-~rc contains a six-amino acid insertion relative to its nonneuronal counterpart. Science (Wash. DC). 237:411--415. Pearse, B. M. F. 1985. Assembly of the mannose-6-phosphate receptor into reconstituted clathrin coats. EMBO (Eur. Mol. BioL Organ.) J. 4:12572460. Pearse, B. M. F. 1988. Receptors compete for adaptors found in plasma membrane coated pits. EMBO (Eur. MoL Biol. Organ.) J. 7:3331-3336. Pearse, B. M. F., and R. A. Crowther. 1987. Structure and assembly of coated vesicles. Annu. Rev. Biophys. Biophys. Chem. 16:49-68. Pearse, B. M. F., and M. S. Robinson. 1984. Purification and properties of 100kd proteins from coated vesicles and their reconstitution with clathrin. EMBO (Eur. Mol. Biol. Organ.) J. 3:1951-1957. Robinson, M. S. 1987. 100-kD coated vesicle proteins: molecular heterogeneity and intracellular distribution studied with monoclonal antibodies. J. Cell Biol. 104:887-895. Robinson, M. S., and B. M. F. Pearse. 1986. Immunofluorescent localization of 100K coated vesicle proteins. J. Cell Biol. 102:48-54. Sanger, F., S. Nicklen, and A. R. Coulson. 1977. DNA sequencing with chain terminating inhibitors. Proc. Natl. Acad. Sci. USA. 74:5463-5467. Zaremba, S., and J. H. Keen. 1983. Assembly polypeptides from coated vesicles mediate reassembly of unique clathrin coats. J. Cell Biol. 97:1339-1347.
The Journal of Celt Biology, Volume 108, 1989
842