The Human Serglycin Gene - The Journal of Biological Chemistry

THE JOURNAL OF BIOLOGICAL CHEMISTRY

Vol. 267, No. 19, Issue of July 5, pp. 13558-13563, 1992 Printed in U.S.A.

The Human Serglycin Gene
NUCLEOTIDE SEQUENCE AND METHYLATION PATTERN IN HUMAN PROMYELOCYTIC LEUKEMIA HL-60 CELLS AND T-LYMPHOBLAST Molt-4 CELLS*

(Received for publication, February 28, 1992)

Donald E. Humphries‡§, Christopher F. Nicodemus§¶, Vera Schiller¶, and Richard L. Stevens§¶‖
From the ‡Department of Veterans Affairs Outpatient Clinic, the §Department of Medicine, Harvard Medical School, and the ¶Department of Rheumatology and Immunology, Brigham and Women's Hospital, Boston, Massachusetts 02115

The complete nucleotide sequence of the 16.7-kb human gene that encodes the peptide core (serglycin) of a secretory granule proteoglycan was determined, thus representing the first proteoglycan peptide core gene to be sequenced in its entirety. The exons, intron 1, and intron 2 comprised 7, 53, and 40% of the gene, respectively. Nineteen Alu-repetitive DNA sequences were interspersed in the gene, accounting for 28% of the total nucleotides in intron 1 and 40% of the nucleotides in intron 2. The nucleotide sequence was then used in an examination of the methylation pattern of the human serglycin gene in human promyelocytic leukemia HL-60 cells that contain serglycin mRNA and in T-lymphoblast Molt-4 cells that do not. With polymerase chain reaction methodology, 13 DNA probes of 250-880 base pairs in length were generated that corresponded to unique, non-Alu sequences spaced throughout the entire human serglycin gene. When blots containing genomic DNA digested with HpaII or MspI were examined with these genomic probes, it was discovered that the 5'-flanking region and intron 1 of the serglycin gene in HL-60 cells were both substantially less methylated than intron 2. In contrast, the entire serglycin gene in Molt-4 cells was highly methylated. Because hypomethylated genes generally are transcribed more efficiently than hypermethylated genes, the high level of serglycin mRNA in HL-60 cells probably is a consequence of the low level of methylation of intron 1 and the 5'-flanking region of the serglycin gene in these cells.

The proteoglycans that are stored in the secretory granules of many hematopoietic cells contain a peptide core (termed serglycin)¹ with a protease-resistant, glycosaminoglycan-attachment region consisting predominantly of alternating serine and glycine (1-29). Serglycin proteoglycans have been isolated that contain O-linked heparin (1-3), heparan sulfate (29), chondroitin sulfate A (8, 25), chondroitin sulfate di-B (9, 11), chondroitin sulfate D (27), chondroitin sulfate E (4, 13, 15, 20), and even a chondroitin sulfate with trisulfated disaccharides (21). The unique regulation of the differential post-translational modification of serglycin is not understood, but it appears to be associated with the type of protein with which the serglycin proteoglycan interacts in the secretory granule.

Molecular biology studies have been carried out on the cDNAs and/or genes that encode rat (5-7, 16), mouse (17-19, 24, 26), and human (12, 22, 23, 28) serglycin. Based on the deduced amino acid sequences of the cDNAs, the initially translated serglycin peptide core is Mr 17,600 in the human, Mr 16,700 in the mouse, and Mr 18,600 in the rat. The peptide core of all known serglycin proteoglycans is encoded by a single gene, located on chromosome 10q22.1 in the human (12, 30) and on chromosome 10 in the mouse (16). The serglycin gene contains three exons. The first exon encodes the 5'-untranslated region of the mRNA and the hydrophobic signal peptide of the translated protein. The second exon is predicted to encode the amino terminus of the protein once it leaves the endoplasmic reticulum, and the third exon encodes the characteristic serine/glycine-rich, glycosaminoglycan attachment region for which this proteoglycan peptide core is named. Although no evidence exists for differential exon use, transcription of the serglycin gene in rat L2 yolk sac tumor cells is initiated at a site distinct from that in other cells, resulting in an mRNA that contains a substantially larger 5'-untranslated region (7). Although the exons of the mouse and human serglycin gene are only ~50% conserved, a 119-base pair (bp) region that immediately precedes the transcription-initiation site is nearly the same in both species (18, 28). This finding implies that the 5'-flanking region contains cis-acting regulatory elements critical for the expression of serglycin in hematopoietic cells. With deletion analysis and site-directed mutagenesis, three motifs in the 504-bp 5'-flanking region of the mouse serglycin gene were identified that likely regulate its constitutive transcription in fibroblasts and hematopoietic cells. Because the amount of construct² mRNA was substantially less than the endogenous level of serglycin mRNA in transiently transfected rat basophilic leukemia cells, at least one more positive cis-acting element is present somewhere in the serglycin gene. The element that resides at residues -250 to -190 suppresses transcription, whereas the other two elements at residues -118 to -81 and residues -40 to -20 stimulate transcription. As indicated by gel mobility shift assays, hematopoietic cells that transcribe the serglycin gene possess in their nuclei trans-acting factors that recognize these three elements (19). A different profile of trans-acting factors is present in fibroblasts, which do not express the serglycin gene.

Although the exon/intron organizations of the mouse and human serglycin genes have been deduced, the complete nucleotide sequence of this gene has not been determined in any species. In the present study, we have determined the nucleotide sequence of the entire 16.7-kb human serglycin gene. We also demonstrate that both the 5'-flanking region and intron 1 of this gene are substantially less methylated than intron 2 in HL-60 cells, a human promyelocytic leukemia cell line that expresses abundant amounts of this transcript. In contrast, the entire serglycin gene appears to be heavily methylated in Molt-4 cells, a human T-lymphoblast cell line that fails to express the transcript.

* This work was supported by National Institutes of Health Grants AI-23483, HL-36110, and RR-05950 and grants from the Medical Research Service of the Department of Veterans Affairs and the Hyde and Watson Foundation. The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.

The nucleotide sequence(s) reported in this paper has been submitted to the GenBank™/EMBL Data Bank with accession number(s) M90058.

‖ To whom correspondence should be addressed: Harvard Medical School, Seeley Mudd Bldg., Rm. 617, 250 Longwood Ave., Boston, MA 02115. Tel.: 617-432-1512; Fax: 617-432-0979.

¹ The abbreviations used are: serglycin, the serine/glycine-rich peptide core of secretory granule proteoglycans; bp, base pair(s); kb, kilobase(s); PCR, polymerase chain reaction; SDS, sodium dodecyl sulfate.

² The DNA construct used in these transient transfection experiments consisted of the 504-bp 5'-flanking region of the human serglycin gene linked to the promoterless human growth hormone gene (19).


EXPERIMENTAL PROCEDURES

Nucleotide Sequencing of the Human Serglycin Gene. Genomic fragments of the human serglycin gene, 1-11 kb in size (28), were subcloned into Bluescript plasmid (Stratagene), and a "walk" approach was used to determine the nucleotide sequence of the entire gene. Double-stranded DNA sequencing (31, 32) was performed in the sense and anti-sense directions of the gene with the Sequenase™ Version 2.0 kit (U. S. Biochemical Corp.) and [α-35S]dATP (~1000 Ci/mmol; Amersham Corp.). With each subcloned genomic fragment, the sequence of the first ~300 nucleotides in one strand of the DNA insert was determined with universal T3 and KS primers (Stratagene). The nucleotide sequence of the first ~300 nucleotides in the opposite strand was determined with SK and T7 primers. Two 18-24-mer oligonucleotides that correspond to regions ~40 bp from both ends of the obtained nucleotide sequences were synthesized on a Cyclone DNA Synthesizer (Milligen, Novato, CA) and were used as primers to determine the nucleotide sequence of the next ~250 nucleotides in each direction of the insert. Approximately 200 oligonucleotides complementary to different regions of the human serglycin gene were used as primers to determine the entire sequence of the 16.7-kb gene in both directions, as well as 1.8 kb of 5'-flanking DNA. A polymerase chain reaction (PCR) (33) was performed with the GeneAmp kit (Perkin-Elmer Cetus Instruments) in a thermal cycler to confirm the overlap nucleotide sequence of the plasmid subclones pB5SR2 and pB5RK (28). A 0.69-kb DNA fragment was amplified with the phage genomic clone λHG-PG6 used as a template. This DNA was subcloned into pCR1000 (Invitrogen, San Diego, CA) and sequenced with M13 -40 and T3 primers. The data were analyzed with the Caltech DNA analyses software and with programs at the Biomolecular Engineering Research Center at Boston University.

RNA and DNA Blot Analyses. Total RNA was isolated according to the method of Chomczynski and Sacchi (34) from HL-60 cells (CCL-240; American Type Culture Collection, Rockville, MD) and from Molt-4 cells (CRL-1582; American Type Culture Collection). RNA (~5 µg) was denatured in formaldehyde/formamide before electrophoresis in 1.3% agarose/formaldehyde gels. The separated RNA was transferred (35) to Zetaprobe nylon membranes (Bio-Rad), and the resulting blots were incubated at 43 °C in hybridization buffer (50% formamide, 0.75 M NaCl, 0.05 M sodium phosphate, 5 mM EDTA, 5× Denhardt's buffer, 0.2% SDS, and 100 µg/ml herring sperm DNA) containing random-primed, [α-32P]dCTP (~3000 Ci/mmol, Du Pont-New England Nuclear)-labeled cDNAs that encode human serglycin (12) or mouse β-actin (36). After 24 h of hybridization, the RNA blots were washed at either 42 or 55 °C with 30 mM NaCl, 3 mM sodium citrate, 0.1% SDS, 1 mM EDTA, and 10 mM sodium phosphate, pH 7.0. Autoradiography was performed with Kodak XAR-5 film (Eastman Kodak, Rochester, NY).

Methylation of the Human Serglycin Gene in HL-60 Cells and Molt-4 Cells. Human genomic DNA was prepared as described by Sambrook et al. (37). HL-60 cells and Molt-4 cells were each centrifuged at 1500 × g for 10 min at 4 °C. The supernatants were removed, and the cells were washed twice with ice-cold 0.14 M NaCl, 2.7 mM KCl, 25 mM Tris, pH 7.4. The cells were suspended in 10 mM Tris, pH 8.0, containing 0.1 M EDTA, 0.5% SDS, and 20 µg/ml pancreatic RNase and were incubated for 1 h at 37 °C. Proteinase K (100 µg/ml) was added, and the lysates were incubated at 50 °C for 3 h more. The digests were extracted 4-6 times with Tris-saturated phenol and then with chloroform. The genomic DNAs in the resulting solutions were precipitated with ethanol. Samples containing ~10 µg of DNA either were dissolved in 10 mM MgCl2 and 20 mM Tris, pH 7.4, and digested with HpaII (GIBCO-BRL), or were dissolved in 10 mM MgCl2 and 50 mM Tris, pH 8.0, and digested with MspI (GIBCO-BRL). These two restriction enzymes both cleave the unmethylated nucleotide sequence 5'-CCGG-3', but only MspI cleaves this sequence if the internal C is 5-methylcytosine (38). The digests were electrophoresed in 1% agarose gels and transferred (39) to Duralon membranes (Stratagene). The resulting DNA blots were hybridized with random-primed, PCR-derived, 250-880-bp probes that correspond to 13 different regions of the human serglycin gene. These DNA probes were generated with either the HL-60 cell-derived serglycin cDNA (12) or the serglycin genomic subclones (28) as templates. The probes used in this methylation study were specific to the human serglycin gene because each hybridized to a single genomic fragment.

RESULTS AND DISCUSSION

The entire nucleotide sequence of the human serglycin gene was determined in both directions with the use of previously described plasmid subcloned fragments of this gene (pB5SR2, pB5SH3, pB5RK, pBKH, and pBKS) (28). Because there was no overlapping clone comprising the 3' end of pB5SR2 and the 5' end of pB5RK, a 0.69-kb DNA fragment was generated by PCR methodology from the original phage clone λHG-PG6 that included this region of the gene, and this fragment was sequenced to ascertain continuity of the nucleotide sequence. The sequence of the human serglycin gene contains approximately 18,500 nucleotides and includes approximately 1.8 kb of 5'-flanking DNA, 1.2 kb of exons, an 8.8-kb intron 1, and a 6.7-kb intron 2 (Fig. 1).

Because the human serglycin gene is the first proteoglycan peptide core gene to be sequenced in its entirety, it is not possible to compare the nucleotide sequence of its introns with those in other proteoglycan peptide core genes. The two introns of the human serglycin gene contain 19 Alu short repetitive DNA sequences (40), and two other Alu repetitive sequences are present in the 5'-flanking region. A 70-bp Donehower element that has been detected in >30 human genes (41) was found approximately two-thirds of the way into intron 1. However, no Kpn-repetitive DNA elements were detected in the human serglycin gene despite the fact that comparable amounts of Kpn-repetitive DNA and Alu-repetitive DNA are present in the human genome.

Although there are >500,000 Alu-like DNA sequences in the human genome, their biological functions are not known. The typical Alu repeat is ~300 bp in length and is a dimer of similar, but distinct, left and right arms. Both arms of Alu DNA are related to 7SL RNA and are linked by an adenosine-rich spacer (42). Alu DNA repeats can be grouped into two to four distinct subfamilies (43-46). A comparison of computer alignments of Alu sequences has yielded an Alu consensus sequence and a set of diagnostic positions that are correlated in the various subfamilies. The serglycin gene-derived Alu sequences were aligned with the Alu consensus nucleotide sequence of Jurka and Smith (44); their locations and characteristics are depicted in Fig. 2A and Table I. Fifteen diagnostic positions were examined to determine if the Alu elements were of the "S" or the "J" subfamily. Because the J subfamily of Alu elements is more similar to 7SL RNA than the S subfamily, the J type probably is a more primitive Alu element (44). Eleven of the intronic Alu elements were of the S type, whereas eight were of the J type. The two Alu elements in the 5'-flanking region were of the S type. The Alu elements were present in both orientations. Thirteen of the intronic elements were oriented in the sense direction of the gene, whereas the other six were oriented in the anti-sense direction.
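The sizes and Alu counts reported above can be cross-checked with a few lines of arithmetic. The sketch below is only an illustration in Python (not part of the original study); the variable and function names are hypothetical, and the inputs are the rounded kb values quoted in the text, so the results are approximate.

```python
# Rounded sizes (kb) and Alu count as quoted in the text; names are illustrative only.
EXONS_KB = 1.2
INTRON1_KB = 8.8
INTRON2_KB = 6.7
INTRONIC_ALU_COUNT = 19  # Alu elements reported in the two introns


def composition(parts):
    """Return each part's share of the summed length as a rounded percentage."""
    total = sum(parts.values())
    return {name: round(100 * size / total) for name, size in parts.items()}


gene = {"exons": EXONS_KB, "intron 1": INTRON1_KB, "intron 2": INTRON2_KB}

shares = composition(gene)                  # share of the gene per region, in percent
gene_length_kb = sum(gene.values())         # exons plus both introns
alu_per_kb = INTRONIC_ALU_COUNT / (INTRON1_KB + INTRON2_KB)  # intronic Alu density

print(shares)
print(round(gene_length_kb, 1), "kb")
print(round(alu_per_kb, 2), "Alu elements per kb of intron")
```

With these inputs the three regions sum to the 16.7-kb gene length, and the intronic Alu density comes out above one element per kb, well above the genome-wide spacing of one Alu every 3-5 kb cited in the text.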


[Figure 1: annotated nucleotide sequence listing, not reproduced here; the sequence was submitted to the GenBank/EMBL Data Bank under accession number M90058.]

FIG. 1. Complete nucleotide sequence of the human serglycin gene. Sequence numbering starts at exon 1. The nucleotides comprising the three exons of the gene are bracketed.
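Once a sequence is available, the HpaII/MspI logic used in the methylation study can be sketched computationally. The following Python example is a minimal illustration, not the authors' procedure: it locates 5'-CCGG-3' sites and predicts fragment lengths for a complete digest of a linear molecule, assuming both enzymes cut after the first C (C^CGG) and that HpaII is blocked when the internal C is methylated. The toy sequence and the function names are hypothetical and are not taken from the serglycin gene.

```python
import re

SITE = "CCGG"  # recognition sequence shared by HpaII and MspI


def find_sites(seq):
    """Return the 0-based start position of every CCGG in seq."""
    return [m.start() for m in re.finditer(SITE, seq.upper())]


def fragment_lengths(seq, methylated, enzyme):
    """Fragment lengths from a complete digest of a linear sequence.

    methylated: set of CCGG start positions whose internal C is 5-methylcytosine.
    enzyme: "HpaII" (blocked by that methylation) or "MspI" (not blocked).
    """
    if enzyme not in ("HpaII", "MspI"):
        raise ValueError("enzyme must be 'HpaII' or 'MspI'")
    cuts = []
    for pos in find_sites(seq):
        if enzyme == "HpaII" and pos in methylated:
            continue  # methylated site is left uncut by HpaII
        cuts.append(pos + 1)  # cut between the first and second C
    bounds = [0] + cuts + [len(seq)]
    return [end - start for start, end in zip(bounds, bounds[1:])]


# Made-up 20-nt sequence with CCGG sites at positions 4 and 14.
toy = "AAAA" + "CCGG" + "TTTTTT" + "CCGG" + "AA"
methylated_sites = {14}  # assume only the second site is methylated

msp_fragments = fragment_lengths(toy, methylated_sites, "MspI")
hpa_fragments = fragment_lengths(toy, methylated_sites, "HpaII")
print(msp_fragments, hpa_fragments)
```

In this toy case the HpaII digest yields a larger fragment than the MspI digest, which is the signature of a methylated site on a blot; identical fragment sizes in both digests indicate unmethylated sites.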

Both introns contained substantial numbers of Alu sequences, but the distribution of the Alu types and their orientation were biased. Intron 1 contained only one Alu element oriented in the anti-sense direction, but almost half of these repetitive DNA sequences in intron 2 were oriented in the anti-sense direction. Likewise, 75% of the Alu sequences in intron 1 were of the S type, whereas >50% of the elements in intron 2 were of the J type. All of the repetitive elements were complete in the first intron, whereas two Alu elements in the second intron contained only the left half. In total, Alu-like sequences accounted for ~28% of the total nucleotides in intron 1 and ~40% of the nucleotides in intron 2 of the human serglycin gene. Normally, Alu elements are found only every 3-5 kb in the human genome (43). The human serglycin gene is unusual because it contains an average of >1 Alu element/kb. Increased densities of Alu elements in the human C1 inhibitor gene have been suggested to be responsible for deletions of part of this gene in two

families with type I hereditary angioedema (47). Recombination events between repetitive DNA elements have also resulted in the deletion of a portion of the human factor VIII gene in a patient with hemophilia A (48) and the duplication of part of the low density lipoprotein receptor gene in a patient with familial hypercholesterolemia (49). It remains to be determined what function, if any, these Alu elements have in the serglycin gene, and if an altered serglycin gene exists in certain patients who are immunodeficient because of a recombination event at one of these elements.

FIG. 1, continued. [Sequence listing not reproduced here; the sequence was submitted to the GenBank/EMBL Data Bank under accession number M90058.]

FIG. 2. Location of the Alu elements and the HpaII/MspI sites in the human serglycin gene. A, the locations of the 21 Alu elements in the 5'-flanking region, intron 1, and intron 2 of the serglycin gene are depicted. The locations of the two Alu elements containing only the left arm are identified with a 1/2. The three exons are boxed (□). B, the locations of the HpaII/MspI sites (5'-CCGG-3') in the serglycin gene are indicated by the vertical lines. The letters depict the location of the probes used to determine the extent of methylation of these sites. Sites in the serglycin gene in HL-60 cells that are at least partially methylated are indicated by closed circles, and nonmethylated sites are indicated by open circles.

TABLE I
Distribution and type of Alu elements in the human serglycin gene
Twenty-one Alu elements were detected in the nucleotide sequence of the human serglycin gene (Figs. 1 and 2A). Of these, 19 were identified in the introns. Thirteen were of the S type, and 8 were of the J type. In two instances, only approximately one-half of an Alu element was inserted in the gene. These elements were oriented in the sense (F) or anti-sense (R) direction relative to the rest of the human serglycin gene.

Number  Location             Type     Direction
1       5'-Flanking region   S        R
2       5'-Flanking region   S        F
3       Intron 1             S        F
4       Intron 1             S        F
5       Intron 1             S        F
6       Intron 1             S        F
7       Intron 1             J        R
8       Intron 1             S        F
9       Intron 1             S        F
10      Intron 1             J        F
11      Intron 2             J        R
12      Intron 2             S        F
13      Intron 2             J (1/2)  F
14      Intron 2             J        R
15      Intron 2             J        F
16      Intron 2             J (1/2)  F
17      Intron 2             S        R
18      Intron 2             S        R
19      Intron 2             J        F
20      Intron 2             S        R
21      Intron 2             S        F

An important modification event leading to the transcriptional regulation of many genes is the methylation of DNA (50, 51). Mammalian genomic DNA is methylated primarily at cytosines in the dinucleotide sequence CpG. Although hypermethylation of a gene can result in its increased transcription (52), in most instances hypermethylation results in the suppression of transcription. In some instances, positive regulatory trans-acting factors cannot bind to critical cis-acting elements that are methylated (53). In other instances, methylated DNA binds negative trans-acting factors, which directly induce suppression of transcription (54, 55). The levels of the mRNA that encodes human decorin proteoglycan peptide core in cells of patients with colon cancer are correlated with the extent of methylation of a site within one of this gene's exons (56, 57). The most common method of analyzing gene methylation utilizes the isoschizomeric restriction endonucleases HpaII and MspI. HpaII does not cleave the sequence 5'-CCGG-3' if

Human Serglycin Gene

13562

the internal cytosine is methylated (38), but MspI does. The deduced nucleotide sequence of the human serglycin gene (Fig. 1) was used to determine the methylation pattern of this gene in cells that do and do not transcribe it. Because of their hybridization to corresponding regions within other genes, it was not possible to probe genomic DNA blots with short DNA fragments of the human serglycin gene that contained Alu DNA sequences (data not shown). Thus, knowledge of the exact location of the Alu repetitive elements within the serglycin gene (Fig. 2A) permitted the avoidance of these sequences in the methylation study. Because HL-60 cells, but not Molt-4 cells, contain serglycin mRNA (Fig. 3), DNA was isolated from these two cell types, and the methylation patterns of their serglycin genes were determined. The locations of all of the sites susceptible to HpaII and MspI within the human serglycin gene were determined (Fig. 2B), and PCR methodology was used to construct 13 probes (designated A-M) to determine how many of these 5’-CCGG-3’ sequences contained an internal 5-methylcytosine.

When blots containing digested genomic DNA from HL-60 cells were analyzed with the intron 1 probes A-F, each probe hybridized to a DNA fragment in the MspI digest that was identical in size with the corresponding fragment in the HpaII digest. Therefore, the 5’-flanking region and intron 1 of the serglycin gene were both hypomethylated in HL-60 cells. The results with probes A, C, and E are shown in Fig. 4 (lanes 1-6). Like the intron 1 probes, the intron 2 probe J hybridized to identically sized fragments in the HpaII and MspI digests of genomic DNA from HL-60 cells (lanes 13 and 14). In contrast, several of the other sites in intron 2 of the serglycin gene were at least partially methylated in HL-60 cells. Which of the five HpaII/MspI sites at the 3’ end of this gene are methylated could not be conclusively determined because of their proximity to one another, but probes K and M both hybridized to larger DNA fragments in the HpaII digest, as compared with the MspI digest (data not shown). Thus, some, if not all, of these sites at the 3’ end of the serglycin gene are methylated in HL-60 cells. Whereas probes G, H, and I all hybridized to the expected size of DNA fragments after digestion of genomic DNA with MspI, they hybridized to two to three fragments after digestion with HpaII. The presence of a 3.2-kb DNA fragment that hybridizes to both probe H (lane 9) and probe I (lane 11) argues that the HpaII sites that reside at 11.5 and 11.8 kb in the serglycin gene are methylated in most HL-60 cells. In a second experiment, these two sites were methylated in almost all of the HL-60 cells in the culture (data not shown). Exon 2 probe G hybridized to two approximately equal fragments in the HpaII digest (lane 7), indicating that the HpaII/MspI site at 9.5 kb in the serglycin gene was methylated in approximately 50% of the HL-60 cells in the culture. These findings are not the result of incomplete digestion of the DNA samples or of nonspecific hybridization of the probes with a fragment from another gene because the same blot yielded single bands after hybridization with other probes and because single bands were detected when MspI-digested genomic DNAs were analyzed with these same serglycin-derived probes.

In contrast to the gene in HL-60 cells, the serglycin gene in Molt-4 cells was highly methylated. Probing with any of the PCR-derived DNAs yielded genomic fragments of >10 kb, indicating that most, if not all, of the HpaII sites in the serglycin gene of Molt-4 cells were methylated. The results obtained with probe C are depicted in Fig. 4 (lanes 15 and 16).
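The logic of the HpaII/MspI comparison can be illustrated with a minimal Python sketch. It is not part of the original study: the function name, the 10,000-bp region, the site coordinates, and the probe position are hypothetical values chosen only for illustration, and the model assumes complete digestion of a linear DNA with a probe that lies entirely between two sites.

```python
def probe_fragment_size(length_bp, ccgg_sites, methylated, enzyme, probe_pos):
    """Return the size (bp) of the fragment that contains probe_pos.

    MspI is modeled as cutting every CCGG site; HpaII is modeled as cutting
    only those sites whose internal cytosine is unmethylated.
    All coordinates are hypothetical positions on a linear DNA.
    """
    if enzyme == "MspI":
        cuts = sorted(ccgg_sites)
    elif enzyme == "HpaII":
        cuts = sorted(s for s in ccgg_sites if s not in methylated)
    else:
        raise ValueError(f"unsupported enzyme: {enzyme}")

    # Fragment boundaries are the two ends of the DNA plus every cut site.
    bounds = [0] + cuts + [length_bp]
    for start, end in zip(bounds[:-1], bounds[1:]):
        if start <= probe_pos < end:
            return end - start
    raise ValueError("probe position lies outside the region")


# Hypothetical region: three CCGG sites, probe located between the first two.
sites = [2000, 3000, 6000]
probe = 2500

# Unmethylated DNA: both enzymes give the same fragment.
print(probe_fragment_size(10000, sites, set(), "MspI", probe))    # 1000
print(probe_fragment_size(10000, sites, set(), "HpaII", probe))   # 1000

# Site at 3000 methylated: HpaII skips it, so the probe detects a larger fragment.
print(probe_fragment_size(10000, sites, {3000}, "HpaII", probe))  # 4000
```

Under these assumptions, identical fragment sizes in the two digests indicate unmethylated flanking sites, whereas a larger fragment in the HpaII digest indicates that at least one flanking site is methylated. A culture in which only some cells carry the methylated site would be expected to show both fragments in the HpaII digest.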

Several genes (58-63) have been reported to contain cis-acting regulatory elements preferentially in their first introns. Although it has not been determined whether the transcription-regulatory activities of any of these intron 1 elements are affected by methylation, it has been shown that CpG methylation of the cAMP response element found in the promoters of many genes abolishes its transcriptional regulatory activity (53). The diminished methylation of the first intron of the serglycin gene in a cell that contains abundant levels of this transcript, but not in a cell that does not transcribe the gene, suggests that specific methylation-dependent nucleotide sequences in intron 1 act in concert with the identified sequences in the 5’-flanking region (19) to regulate transcription of the serglycin gene in different cell types.

FIG. 3. RNA blot analysis of HL-60 cells and Molt-4 cells. Total RNA from HL-60 cells (lane 1) and Molt-4 cells (lane 2) was electrophoresed in a formaldehyde/agarose gel and transferred to the membrane. The resulting RNA blot was probed with cDNAs that encode serglycin and β-actin.

FIG. 4. Blots of HpaII-digested or MspI-digested genomic DNA from HL-60 cells and Molt-4 cells. Digests of genomic DNA were electrophoresed in 1% agarose gels and transferred to membranes. The resulting blots were probed under conditions of high stringency with PCR-derived probes from various regions of the human serglycin gene. Probes A, C, and E are located in intron 1; probe G is located in exon 2; and probes H, I, and J are located in intron 2. Lanes 1-14 contain genomic DNA prepared from HL-60 cells; lanes 15-16 contain genomic DNA prepared from Molt-4 cells. The arrows indicate the 0.8- to >10-kb generated DNA fragments that hybridize with the seven different probes.

Acknowledgments-We thank Dr. S. Avraham (Harvard Medical School) for his helpful advice in these studies and S. Purdy for her technical assistance.

REFERENCES

1. Yurt, R. W., Leid, R. W., Jr., Austen, K. F., and Silbert, J. E. (1977) J. Biol. Chem. 252, 518-521
2. Robinson, H. C., Horner, A. A., Hook, M., Ogren, S., and Lindahl, U. (1978) J. Biol. Chem. 253, 6687-6693
3. Metcalfe, D. D., Smith, J. A., Austen, K. F., and Silbert, J. E. (1980) J. Biol. Chem. 255, 11753-11758
4. Razin, E., Stevens, R. L., Akiyama, F., Schmid, K., and Austen, K. F. (1982) J. Biol. Chem. 257, 7229-7236
5. Bourdon, M. A., Oldberg, A., Pierschbacher, M., and Ruoslahti, E. (1985) Proc. Natl. Acad. Sci. U. S. A. 82, 1321-1325
6. Bourdon, M. A., Shiga, M., and Ruoslahti, E. (1986) J. Biol. Chem. 261, 12534-12537
7. Bourdon, M. A., Shiga, M., and Ruoslahti, E. (1987) Mol. Cell. Biol. 7, 33-40
8. MacDermott, R. P., Schmidt, R. E., Caulfield, J. P., Hein, A., Bartley, G. T., Ritz, J., Schlossman, S. F., Austen, K. F., and Stevens, R. L. (1985) J. Exp. Med. 162, 1771-1787
9. Seldin, D. C., Austen, K. F., and Stevens, R. L. (1985) J. Biol. Chem. 260, 11131-11139
10. Stevens, R. L., Otsu, K., and Austen, K. F. (1985) J. Biol. Chem. 260, 14194-14200
11. Stevens, R. L., Lee, T. D. G., Seldin, D. C., Austen, K. F., Befus, A. D., and Bienenstock, J. (1986) J. Immunol. 137, 291-295
12. Stevens, R. L., Avraham, S., Gartner, M. C., Bruns, G. A. P., Austen, K. F., and Weis, J. H. (1988) J. Biol. Chem. 263, 7287-7291
13. Stevens, R. L., Fox, C. C., Lichtenstein, L. M., and Austen, K. F. (1988) Proc. Natl. Acad. Sci. U. S. A. 85, 2284-2287
14. Tantravahi, R. V., Stevens, R. L., Austen, K. F., and Weis, J. H. (1986) Proc. Natl. Acad. Sci. U. S. A. 83, 9207-9210
15. Rothenberg, M. E., Pomerantz, J. L., Owen, W. F., Jr., Avraham, S., Soberman, R. J., Austen, K. F., and Stevens, R. L. (1988) J. Biol. Chem. 263, 13901-13908
16. Avraham, S., Stevens, R. L., Gartner, M. C., Austen, K. F., Lalley, P. A., and Weis, J. H. (1988) J. Biol. Chem. 263, 7292-7296
17. Avraham, S., Stevens, R. L., Nicodemus, C. F., Gartner, M. C., Austen, K. F., and Weis, J. H. (1989) Proc. Natl. Acad. Sci. U. S. A. 86, 3763-3767
18. Avraham, S., Austen, K. F., Nicodemus, C. F., Gartner, M. C., and Stevens, R. L. (1989) J. Biol. Chem. 264, 16719-16726
19. Avraham, S., Avraham, H., Austen, K. F., and Stevens, R. L. (1992) J. Biol. Chem. 267, 610-617
20. Thompson, H. L., Schulman, E. S., and Metcalfe, D. D. (1988) J. Immunol. 140, 2708-2713
21. Forsberg, L. S., Lazarus, S. C., Seno, N., DeVinney, R., Caughey, G. H., and Gold, W. M. (1988) Biochim. Biophys. Acta 967, 416-428
22. Alliel, P. M., Perin, J.-P., Maillet, P., Bonnet, F., Rosa, J.-P., and Jollès, P. (1988) FEBS Lett. 236, 123-126
23. Stellrecht, C. M., and Saunders, G. F. (1989) Nucleic Acids Res. 17, 7523
24. Kjellen, L., Pettersson, I., Lillhager, P., Steen, M.-L., Pettersson, U., Lehtonen, P., Karlsson, T., Ruoslahti, E., and Hellman, L. (1989) Biochem. J. 263, 105-113
25. Lohmander, L. S., Arnljots, K., and Yanagishita, M. (1990) J. Biol. Chem. 265, 5802-5808
26. Angerth, T., Huang, R., Aveskogh, M., Pettersson, I., Kjellen, L., and Hellman, L. (1990) Gene (Amst.) 93, 235-240
27. Davidson, S., Gilead, L., Amira, M., Ginsburg, H., and Razin, E. (1990) J. Biol. Chem. 265, 12324-12330


28. Nicodemus, C. F., Avraham, S., Austen, K. F., Purdy, S., Jablonski, J., and Stevens, R. L. (1990) J. Biol. Chem. 265, 5889-5896
29. Krilis, S. A., Austen, K. F., Macpherson, J. L., Nicodemus, C. F., Gurish, M. F., and Stevens, R. L. (1992) Blood 79, 144-151
30. Mattei, M. G., Perin, J.-P., Alliel, P. M., Bonnet, F., Maillet, P., Passage, E., Mattei, J.-F., and Jollès, P. (1989) Hum. Genetics 82, 87-88
31. Sanger, F., Nicklen, S., and Coulson, A. R. (1977) Proc. Natl. Acad. Sci. U. S. A. 74, 5463-5467
32. Zhang, H., Scholl, R., Browse, J., and Somerville, C. (1988) Nucleic Acids Res. 16, 1220
33. Saiki, R. K., Gelfand, D. H., Stoffel, S., Scharf, S. J., Higuchi, R., Horn, G. T., Mullis, K. B., and Erlich, H. A. (1988) Science 239, 487-490
34. Chomczynski, P., and Sacchi, N. (1987) Anal. Biochem. 162, 156-159
35. Thomas, P. S. (1980) Proc. Natl. Acad. Sci. U. S. A. 77, 5201-5205
36. Spiegelman, B. M., Frank, M., and Green, H. (1983) J. Biol. Chem. 258, 10083-10089
37. Sambrook, J., Fritsch, E. F., and Maniatis, T. (1989) Molecular Cloning: A Laboratory Manual, 2nd Ed., pp. 9.16-9.19, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY
38. Waalwijk, C., and Flavell, R. A. (1978) Nucleic Acids Res. 5, 3231-3236
39. Southern, E. M. (1975) J. Mol. Biol. 98, 503-517
40. Schmid, C. W., and Shen, C.-K. J. (1985) Molecular Evolutionary Genetics (MacIntyre, R. J., ed.) pp. 323-358, Plenum Press, New York
41. Donehower, L. A., Slagle, B. L., Wilde, M., Darlington, G., and Butel, J. S. (1989) Nucleic Acids Res. 17, 699-710
42. Ullu, E., Murphy, S., and Melli, M. (1982) Cell 29, 195-202
43. Slagel, V., Flemington, E., Traina-Dorge, V., Bradshaw, H., and Deininger, P. (1987) Mol. Biol. Evol. 4, 19-29
44. Jurka, J., and Smith, T. (1988) Proc. Natl. Acad. Sci. U. S. A. 85, 4775-4778
45. Willard, C., Nguyen, H. T., and Schmid, C. W. (1987) J. Mol. Evol. 26, 180-186
46. Britten, R. J., Baron, W. F., Stout, D. B., and Davidson, E. H. (1988) Proc. Natl. Acad. Sci. U. S. A. 85, 4770-4774
47. Ariga, T., Carter, P. E., and Davis, A. E., III (1990) Genomics 8, 607-613
48. Woods-Samuels, P., Kazazian, H. H., Jr., and Antonarakis, S. E. (1991) Genomics 10, 94-101
49. Lehrman, M. A., Goldstein, J. L., Russell, D. W., and Brown, M. S. (1987) Cell 48, 827-835
50. Doerfler, W. (1983) Annu. Rev. Biochem. 52, 93-124
51. Adams, R. L. P. (1990) Biochem. J. 265, 309-320
52. Tanaka, K., Appella, E., and Jay, G. (1983) Cell 36, 457-465
53. Iguchi-Ariga, S. M. M., and Schaffner, W. (1989) Genes & Dev. 3, 612-619
54. Keshet, I. J., Lieman-Hurwitz, J., and Cedar, H. (1986) Cell 44, 535-543
55. Meehan, R. R., Lewis, J. D., McKay, S., Kleiner, E. L., and Bird, A. P. (1989) Cell 58, 499-507
56. Adany, R., Heimer, R., Caterson, B., Sorrell, J. M., and Iozzo, R. V. (1990) J. Biol. Chem. 265, 11389-11396
57. Adany, R., and Iozzo, R. V. (1991) Biochem. J. 276, 301-306
58. Liska, D. J., Slack, J. L., and Bornstein, P. (1990) Cell Regul. 1, 487-498
59. Reid, L. H., Gregg, R. G., Smithies, O., and Koller, B. H. (1990) Proc. Natl. Acad. Sci. U. S. A. 87, 4299-4303
60. Franklin, G. C., Donovan, M., Adam, G. I. R., Holmgren, L., Pfeifer-Ohlsson, S., and Ohlsson, R. (1991) EMBO J. 10, 1365-1373
61. Rhodes, C., Savagner, P., Line, S., Sasaki, M., Chirigos, M., Doege, K., and Yamada, Y. (1991) Nucleic Acids Res. 19, 1933-1939
62. Burbelo, P. D., Bruggeman, L. A., Gabriel, G. C., Klotman, P. E., and Yamada, Y. (1991) J. Biol. Chem. 266, 22297-22302
63. Wang, L., Balakir, R., and Horton, W. E., Jr. (1991) J. Biol. Chem. 266, 19878-19881