
Development 104, 165-173 (1988) Printed in Great Britain © The Company of Biologists Limited 1988

Complete sequence and in vitro expression of a tissue-specific phosphatidylinositol-linked N-CAM isoform from skeletal muscle

C. HOWARD BARTON, GEORGE DICKSON, HILARY J. GOWER, LEWIS H. ROWETT, WENDY PUTT, VICKI ELSOM, STEPHEN E. MOORE, CHRISTO GORIDIS* and FRANK S. WALSH

Institute of Neurology, Queen Square, London WC1N 3BG, UK

*Author's address: Centre d'Immunologie INSERM-CNRS de Marseille-Luminy, Case 906, F-13288 Marseille, Cedex 9, France

Summary

Neural cell adhesion molecules (N-CAMs) are a family of cell surface sialoglycoproteins encoded by a single copy gene. A full-length cDNA clone that encodes a nontransmembrane phosphatidylinositol (PI) linked N-CAM of Mr 125×10³ has been isolated from a human skeletal muscle cDNA library. The deduced protein sequence encodes a polypeptide of 761 amino acids and is highly homologous to the N-CAM isoform in brain of Mr 120×10³. The size difference between the 125×10³ Mr skeletal muscle form and the 120×10³ Mr N-CAM form from brain is accounted for by the insertion of a block of 37 amino acids called MSD1, in the extracellular domain of the muscle form. Transient expression of the human cDNA in COS cells results in cell surface N-CAM expression via a putative covalent attachment to PI-containing phospholipid. Linked in vitro transcription and translation experiments followed by immunoprecipitation with anti-N-CAM antibodies demonstrate that the full-length clone of 761 amino acid coding potential produces a core polypeptide of Mr 110×10³ which is processed by microsomal membranes to yield a 122×10³ Mr species. Taken together, these results demonstrate that the cloned cDNA sequence encodes a lipid-linked, PI-specific phospholipase C releasable surface isoform of N-CAM with core glycopeptide molecular weight corresponding to the authentic muscle 125×10³ Mr N-CAM isoform. This is the first direct correlation of cDNA and deduced protein sequence with a known PI-linked N-CAM isoform from skeletal muscle.

Key words: N-CAM, cell adhesion, skeletal muscle, immunoglobulin gene superfamily, amino acid sequence.

Introduction

Specific homotypic and heterotypic cell-cell interactions occurring during embryogenesis and tissue formation are modulated by the temporal and spatial patterns of expression of cell adhesion molecules (CAMs) (Edelman, 1985). Two main adhesive systems operating either dependently or independently of calcium have been described (Edelman, 1986; Takeichi, 1987). Of the calcium-independent CAMs the best characterized is the neural cell adhesion molecule (N-CAM) (Edelman, 1986; Rutishauser & Goridis, 1986; Nybroe et al. 1988; Goridis & Wille, 1988; Walsh, 1988). N-CAM is a cell surface sialoglycoprotein that is involved in both homotypic and heterotypic adhesive interactions via a homophilic binding mechanism (Edelman et al. 1987). Perturbation experiments with anti-N-CAM antibodies have implicated this molecule in a variety of events including migration and guidance of axons, neural tissue formation and nerve-muscle interactions (Edelman, 1985). It is known that the N-CAM gene is present as a single copy and the diversity of N-CAM RNAs and isoforms found during development and in specific cell types can be accounted for by specific patterns of alternative splicing and polyadenylation site selection (Cunningham et al. 1987; Owens et al. 1987; Barthels et al. 1987; Santoni et al. 1987; Dickson et al. 1987). In addition to changes in amino acid sequence, N-CAM is also subject to a variety of cell- and isoform-specific post-translational modifications including glycosylation, phosphorylation and sulphation (Nybroe et al. 1988).

Two categories of N-CAM forms have been described in neural and skeletal muscle tissues to date. These are, first, those with transmembrane topology of Mr 180 and 140×10³ in brain, and 145×10³ in muscle; and second, nontransmembrane species of Mr 120×10³ in brain and 125 and 155×10³ in muscle, all of which are anchored to the external surface of the plasma membrane through covalent linkage to phosphatidylinositol (PI) via a specific glycan (Cunningham et al. 1987; He et al. 1987; Nybroe et al. 1988). The mechanism generating these two different membrane-associated groups of N-CAM isotypes has been clearly shown to be based on alternative RNA splicing and differential polyadenylation site selection (Cunningham et al. 1987; Goridis & Wille, 1988). In addition, a lipid-linked isoform(s) in skeletal muscle containing an additional tissue-specific domain (MSD1) in its extracellular region has been predicted by cDNA sequencing studies (Dickson et al. 1987). This insertion occurs at a recognized splice junction according to the chick (Owens et al. 1987) and human (J. Thompson, unpublished observations) N-CAM gene structure and represents the first alternative splicing event, so far detected, within the extracellular domain that is not associated with membrane attachment. Indeed the difference in molecular size between the brain N-CAM-120 and muscle N-CAM-125 forms is likely in part to reflect expression of this muscle-specific domain.

While complete cDNA sequence corresponding to the protein coding region for brain N-CAM-120 is available (Barthels et al. 1987), the full sequence corresponding to its 125×10³ Mr muscle counterpart has been lacking. In the present study, a cDNA containing the complete coding segment of a muscle-specific N-CAM isoform was isolated and sequenced. Transcription, translation and processing of the cloned coding sequence in vitro and cell transfection result in the synthesis of a cell surface N-CAM glycopeptide of Mr 122×10³. This corresponds in size to an isoform observed in skeletal muscle cells both in vitro and in vivo that can be released by PI-specific phospholipase C (Moore et al. 1987).

Materials and methods

Cloning of human skeletal muscle N-CAM cDNAs

32P-labelled cDNA probe 911 was used to isolate additional human fetal muscle N-CAM cDNA clones from a library in λgt11. The library was prepared from poly(A)+ mRNA isolated from midfusion human muscle cultures (Dickson et al. 1986) by standard procedures (Huynh et al. 1985). Hybridizations were performed as before (Dickson et al. 1987) and positive clones were plaque purified by recloning at least three times.

DNA sequence analysis

Recombinant DNA from λ phage was prepared by standard methods and EcoRI cDNA fragments were subcloned into the EcoRI site of M13mp18 and recombinants detected by using IPTG and X-gal. M13 recombinants were selected in both orientations by restriction mapping. Replicative form plasmid DNA was prepared by alkaline lysis and purified by two rounds of CsCl equilibrium density gradient centrifugation (Maniatis et al. 1982). A series of overlapping subclones were generated by XbaI/SphI double digestion of the replicative form plasmid DNA and deletions generated with exonuclease III and S1 nuclease (Henikoff, 1984). Overlapping subclones were sequenced by the dideoxy chain termination method (Sanger et al. 1977). Full cDNAs were sequenced at least twice on opposing strands. Subclone sequences were read manually, aligned and merged to generate full-length clones using the Microgenie DNA sequence analysis software package (Queen & Korn, 1984).

Northern and Southern analysis

Poly(A)+ mRNA (2 µg), purified from tissue or cultured cells as described before (Dickson et al. 1986), was fractionated by glyoxal agarose electrophoresis (Maniatis et al. 1982) and blotted onto Genescreen as described by the manufacturer. Hybridizations using 32P-labelled whole plasmid or gel-purified insert were performed as described previously (Dickson et al. 1987).

Cell-free transcription-linked translation

The two EcoRI fragments of CHB1 were isolated by partial digestion and subcloned into the EcoRI site of the bacterial expression vector pGEM (Promega Systems, Liverpool, UK). The continuity and relative orientation of the recombinant DNA was ascertained by restriction mapping. Following linearization of the recombinant plasmid, transcription was initiated with either T7 or SP6 RNA polymerase depending on the orientation of the CHB1 insert. Conditions of the transcription reaction were as described by the manufacturer with template DNA at 50 µg ml⁻¹, 1 unit of RNasin and a reaction time of 60 min at 40°C. Synthetic RNA was removed from DNA template and translated in a rabbit reticulocyte lysate in vitro translation system (NEN) with [35S]methionine as the radiolabel. Optimal incorporation into TCA-precipitable material was achieved by titration of both Mg2+ and K+ concentrations with optimal concentrations of 0.7 mM and 52 mM, respectively. Synthesized products were separated on polyacrylamide gels with or without immunoprecipitation with anti-N-CAM (Moore et al. 1987). Protein bands were identified by fluorography at -70°C.

DNA transfection studies

A HindIII fragment corresponding to the entire large HindIII portion of CHB1 plus pGEM polylinker regions was ligated into the unique HindIII site of p4.4.4 (Gunning, 1987) which follows the β-actin promoter. Sense-orientated constructs were identified by restriction mapping and purified plasmid DNA was used to transfect monkey kidney cells via the calcium phosphate precipitation method. Transient expression was examined after 48 h using indirect immunofluorescence with rabbit antibodies to N-CAM (Moore et al. 1987).

Results

N-CAM cDNA isolation and sequencing

N-CAM clones were isolated from a λgt11 cDNA library constructed from human fetal muscle poly(A)+ mRNA (Dickson et al. 1987). The cDNA library was screened by plaque hybridization with a mouse brain cDNA probe, 911, which was isolated by screening of a young postnatal mouse brain cDNA library with an oligonucleotide probe to the N-CAM N-terminal protein sequence (J.-C. Chaix & C. Goridis, unpublished results). The 911 probe contained both N-terminal-coding and 5' untranslated sequence. The largest cDNA clone isolated, CHB1, was composed of two EcoRI fragments of 1.6 and


Fig. 1. Structure of human skeletal muscle N-CAM cDNA clone CHB1 and its predicted protein. (A) Restriction map. Restriction endonuclease cleavage sites are shown for ApaLI (A), EcoRI (E), HindIII (H), KpnI (K), NheI (N), PstI (P), SacI (S). The cDNA is orientated as indicated; also shown are the initiation ATG and termination TAG codons. The scale bar shown is 300 bp or 100 amino acids. (B) N-linked carbohydrate attachment sites. Six consensus sites for N-linked glycosylation within the central region of the derived protein sequence are shown by vertical lines. (C) Position of cysteine residues. The eleven cysteine residues are shown by vertical lines. The ten N-proximal cysteines are implicated in five intramolecular disulphide bridges with the remaining cysteine being the COOH-terminal residue of the native protein. (D) Domain structure of human skeletal muscle N-CAM-125. Two hydrophobic domains (solid) were identified from hydropathy analysis (see below), with the N-terminal being the signal peptide and the COOH-terminal region conforming to the requirements for membrane attachment via PI linkage. The five immunoglobulin homology units delineated by the typical disulphide bridges are numbered I-V and the muscle-specific domain (MSD1) is shown (vertical lines). (E) Hydropathy analysis. Hydropathy analysis was performed using the algorithm of Kyte & Doolittle (1982) with hydrophobic and hydrophilic residues above and below the line, respectively. The distinctive hydrophobic signal peptide and COOH-terminal regions are clearly within the hydrophobic region.
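The hydropathy analysis of panel E can be illustrated with a minimal sliding-window sketch. The scale values are the published Kyte & Doolittle (1982) hydropathy indices; the window length, the function name `hydropathy_profile` and the toy peptide are assumptions made for illustration only and are not taken from the paper, which does not state the window used.

```python
# Kyte & Doolittle (1982) hydropathy index for the 20 standard amino acids.
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(seq, window=9):
    """Return the mean hydropathy of every full window along seq.

    window=9 is an illustrative default, not the authors' setting.
    """
    seq = seq.upper()
    if window < 1 or window > len(seq):
        raise ValueError("window must be between 1 and len(seq)")
    scores = [KD[aa] for aa in seq]
    return [sum(scores[i:i + window]) / window
            for i in range(len(seq) - window + 1)]


# Hypothetical toy peptide (not N-CAM sequence): a hydrophobic stretch
# followed by a charged, hydrophilic stretch.
toy = "LLLVVVIIIAKDEKDEKRD"
profile = hydropathy_profile(toy, window=5)
```

Positive window means correspond to residues plotted above the line in such a profile and negative means to residues below it; a signal peptide or a COOH-terminal anchor sequence would be expected to show up as a run of positive values at the corresponding end of the profile.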


Fig. 2. N-CAM transcripts identified by CHB1. Poly(A)+ mRNA from human fetal myotube cultures was subjected to Northern blot analysis and hybridized with the whole CHB1 cDNA probe (lane 1) or with a Henikoff deletion subprobe encoding the phosphatidylinositol linkage and some 3' untranslated sequence (lane 2). The size classes of the RNAs are indicated on the left.

1.2 kb and these were subcloned in both orientations into M13mp18 for sequencing and detailed restriction site analyses. Restriction endonuclease mapping clearly indicated that the small (1.2 kb) EcoRI fragment of CHB1 was similar to a previously described human muscle N-CAM cDNA, λ9.5 (Dickson et al. 1987), whose sequence spans coding regions corresponding to membrane-proximal and COOH-terminal domains of a nontransmembrane lipid-linked muscle N-CAM form (Fig. 1A). The large (1.6 kb) EcoRI fragment of CHB1 was shown by Southern blot analysis to be the source of the hybridization signal with the mouse brain probe 911, and DNA sequence analysis (see below) of the 1.6 kb fragment confirmed its identity with 5' sequence of chick and mouse N-CAM cDNAs (Cunningham et al. 1987; Barthels et al. 1987). A Northern blot analysis using the entire CHB1 cDNA clone as hybridization probe identifies characteristic N-CAM mRNA transcripts of 6.7, 5.2, 4.3 and 2.9 kb in human myotube RNA (Fig. 2). A subfragment probe encoding the COOH-terminal domain of the proposed protein hybridized to N-CAM RNAs of 5.2, 4.3 and 2.9 kb. This is consistent with previous studies (Dickson et al. 1987) indicating that these transcripts contain the necessary coding sequence to allow N-CAM attachment to membrane by a phosphatidylinositol linkage and with only RNA transcripts from muscle containing the MSD1 region.

CHB1 encodes N-CAM-125

The entire coding sequence of CHB1 (Fig. 3) was found to be highly homologous with mouse and chick cDNAs corresponding to brain N-CAM-120 (88 and 82 %, respectively) with the exception of the previously described muscle-specific domain MSD1 (Dickson et al. 1987). The major open reading frame of CHB1 predicts a 761 amino acid polypeptide of core Mr 83×10³. Significant discrepancies in the predicted Mr of N-CAM polypeptides and their observed migration by SDS-PAGE have been reported (Barthels et al. 1987; Cunningham, 1987). In this respect the difference between mouse brain N-CAM-120 (725 amino acids, predicted Mr 79×10³) and the putative muscle isoform N-CAM-125 is accounted for entirely by the 37 amino acids of the MSD1 domain. With the exception of this domain the percentage homology at the amino acid level increases to 92 % and 87 % for mouse and chick sequences, respectively.

In the human N-CAM sequence, the selected translational initiation codon exhibits two nucleotides of the five-base initiation consensus of Kozak (1984) and is followed by a putative signal peptide of 19 predominantly hydrophobic amino acids (Fig. 1E) similar to that described for mouse brain N-CAM (Barthels et al. 1987). The subsequent 17 amino acids are identical to those found by direct protein sequencing of the NH2-terminus of rat brain N-CAM (Rougon & Marshak, 1986), with a predicted N-terminal leucine residue in the mature polypeptide. The central region of the molecule contains six consensus sites for N-linked glycosylation (Fig. 1B). At the COOH-terminus of the predicted human muscle N-CAM polypeptide a stretch of hydrophobic amino acids is found (Fig. 1E). This conforms to the requirements for covalent anchorage to the plasma membrane via a PI-containing glycan moiety (Cross, 1987) as described for mouse brain N-CAM-120 (Barthels et al. 1987). In this respect, the N-CAM-125 isoform of mouse muscle has been clearly shown to be released from intact cells by PI-specific phospholipase C treatment (Moore et al. 1987). Given these structural features, its predicted Mr and similarity to N-CAM-120, it is thus likely that CHB1 carries the entire coding region for the human muscle N-CAM-125 isoform, directly comparable to brain N-CAM-120 but incorporating the MSD1 domain.
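The size argument above can be checked with a short back-of-envelope calculation. The sketch uses only figures quoted in the text (761 residues and core Mr 83×10³ for the CHB1 product, 725 residues and Mr 79×10³ for mouse brain N-CAM-120, 37 residues for MSD1). It assumes, purely for illustration, that MSD1 has the same mean residue mass as the whole polypeptide; the variable names are hypothetical.

```python
# Values quoted in the text (Mr treated as daltons, lengths in residues).
muscle_mr, muscle_len = 83_000, 761   # predicted core of the CHB1 product
brain_mr, brain_len = 79_000, 725     # mouse brain N-CAM-120
msd1_len = 37                         # muscle-specific domain MSD1

# Assumption: MSD1 has roughly the average residue mass of the whole protein.
mean_residue_mass = muscle_mr / muscle_len        # about 109 Da per residue
msd1_mass = msd1_len * mean_residue_mass          # about 4.0 x 10^3
predicted_gap = muscle_mr - brain_mr              # 4.0 x 10^3

# The comparison is between human and mouse sequences, so the residue
# counts need not differ by exactly the length of MSD1.
length_gap = muscle_len - brain_len
```

Under this assumption a 37-residue insert contributes roughly 4×10³ to the predicted Mr, which is of the same size as the 83×10³ versus 79×10³ difference in the predicted core polypeptides.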

In vitro expression of CHB1 coding sequence

Further evidence to suggest that the determined sequence of CHB1 encodes a complete human skeletal muscle N-CAM polypeptide was obtained by translating mRNA from in vitro transcribed cDNA. The intact CHB1 cDNA was subcloned into the Gemini vector (Promega) via a partial EcoRI digest of the original phage clone. Capped sense and antisense RNA corresponding to CHB1 cDNA was synthesized by initiation from both SP6 and T7 promoters in the presence of cap analogue (Melton et al. 1984) and then translated in vitro using a rabbit reticulocyte lysate. A major anti-N-CAM reactive product was observed at 110×10³ Mr (Fig. 4). Several smaller specific translation products at 65 and 30×10³ Mr were also observed and may correspond to proteolytic fragments or aberrant initiation and/or termination reactions. Antisense RNA failed to generate immunoreactive N-CAM (not shown). In addition, translation in the presence of dog pancreas microsomes led to processing of the 110×10³ Mr primary product to yield a 122×10³ Mr form migrating by SDS-PAGE just below desialo N-CAM-125 (Fig. 4B). Neuraminidase treatment (not shown) of the 122×10³ Mr form resulted in no further mobility change, indicating that the level of sialylation is identical to the immunoprecipitated, metabolically labelled N-CAM forms in Fig. 4. Thus the CHB1 cDNA encodes a complete in-frame N-CAM polypeptide whose molecular weight corresponds to the core polypeptide of N-CAM-125 in skeletal muscle myotubes. The minor mobility difference probably arises from the failure to attach a PI-tail or from the inability of the in vitro system to perform tissue-specific glycosylation events.

In order to establish that the cloned cDNA contains the appropriate sequence to express an N-CAM isoform destined for attachment to the cell surface via a putative PI-linkage, the HindIII fragment from the pGEM vector was subcloned into a eukaryotic expression vector (Gunning et al. 1987). The resulting construct of the appropriate orientation was used to transfect monkey kidney (COS) cells by the calcium phosphate precipitation method. Two days after ex-

Fig. 3. Nucleotide and derived amino acid sequence of the full-length N-CAM coding region for a nontransmembrane isoform from human skeletal muscle. Nucleotides are numbered on the right from the initiation ATG codon and amino acids are numbered on the left from the first methionine residue. The hydrophobic signal peptide is underlined (amino acids 1-19), the ten cysteine residues are circled, probable N-linked carbohydrate attachment sites are shown as black dots, the muscle-specific sequence (MSD1) (amino acids 598-635) is shown by double underlines and the COOH-terminal hydrophobic tail (amino acids 742-761) is boxed.
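How a derived amino acid sequence and its candidate N-linked glycosylation sites are obtained from a cDNA can be sketched as follows. This is a minimal illustration, not the software used in the study: it applies the standard genetic code, starts at the first ATG it finds (which need not be the initiation codon selected by the authors), and marks Asn residues in the usual N-X-S/T consensus with X not proline. The function names and the toy cDNA are hypothetical and the toy sequence is not N-CAM sequence.

```python
import re

# Standard genetic code, codons enumerated in TCAG order at each position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {a + b + c: AMINO[16 * i + 4 * j + k]
               for i, a in enumerate(BASES)
               for j, b in enumerate(BASES)
               for k, c in enumerate(BASES)}


def translate(cdna):
    """Translate from the first ATG to the first in-frame stop codon.

    Expects only A, C, G and T; any other letter raises KeyError.
    """
    cdna = cdna.upper()
    start = cdna.find("ATG")
    if start < 0:
        return ""
    protein = []
    for i in range(start, len(cdna) - 2, 3):
        aa = CODON_TABLE[cdna[i:i + 3]]
        if aa == "*":          # TAA, TAG or TGA ends the reading frame
            break
        protein.append(aa)
    return "".join(protein)


# Lookahead so that overlapping consensus sites are all reported.
SEQUON = re.compile(r"(?=N[^P][ST])")


def n_glyc_sites(protein):
    """Return 1-based positions of Asn in N-X-S/T sites (X is not Pro)."""
    return [m.start() + 1 for m in SEQUON.finditer(protein)]


# Hypothetical toy cDNA: 5' flank, ATG, eight codons in total, TAG stop.
toy_cdna = "GCCATGCTGAACGGCACCAAACCCTCCTAGGG"
toy_protein = translate(toy_cdna)
toy_sites = n_glyc_sites(toy_protein)
```

Positions are numbered from the first methionine, matching the numbering convention of the figure. A consensus match only identifies a probable attachment site; whether a given site is actually glycosylated cannot be decided from sequence alone.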
